
PHYSICAL REVIEW B VOLUME 56, NUMBER 20 15 NOVEMBER 1997-II

Maximally localized generalized Wannier functions for composite energy bands


Nicola Marzari and David Vanderbilt
Department of Physics and Astronomy, Rutgers University, Piscataway, New Jersey 08855-0849
(Received 10 July 1997)
We discuss a method for determining the optimally localized set of generalized Wannier functions associated with a set of Bloch bands in a crystalline solid. By "generalized Wannier functions" we mean a set of localized orthonormal orbitals spanning the same space as the specified set of Bloch bands. Although we minimize a functional that represents the total spread $\sum_n [\langle r^2\rangle_n - \langle\mathbf{r}\rangle_n^2]$ of the Wannier functions in real space, our method proceeds directly from the Bloch functions as represented on a mesh of k points, and carries out the minimization in a space of unitary matrices $U^{(\mathbf{k})}_{mn}$ describing the rotation among the Bloch bands at each k point. The method is thus suitable for use in connection with conventional electronic-structure codes. The procedure also returns the total electric polarization as well as the location of each Wannier center. Sample results for Si, GaAs, molecular C$_2$H$_4$, and LiCl will be presented. [S0163-1829(97)02944-5]

I. INTRODUCTION

The study of periodic crystalline solids leads naturally to a representation for the electronic ground state in terms of extended Bloch orbitals $\psi_{n\mathbf{k}}(\mathbf{r})$, labeled via their band $n$ and crystal-momentum $\mathbf{k}$ quantum numbers. An alternative representation can be derived in terms of localized orbitals or Wannier functions $w_n(\mathbf{r}-\mathbf{R})$, that are formally defined via a unitary transformation of the Bloch orbitals, and are labeled in real space according to the band $n$ and the lattice vector of the unit cell $\mathbf{R}$ to which they belong.$^{1-4}$

The Wannier representation of the electronic problem is widely known for its usefulness as a starting point for various formal developments, such as the semiclassical theory of electron dynamics or the theory of magnetic interactions in solids. But until recently, the practical importance of Wannier functions in computational electronic structure theory has been fairly minimal. However, this situation is now beginning to change, in view of two recent developments. First, there is a vigorous effort underway on the part of many groups to develop so-called "order-N" or "linear-scaling" methods, i.e., methods for which the computational time for solving for the electronic ground state scales only as the first power of system size,$^{5}$ instead of the third power typical of conventional methods based on solving for Bloch states. Many of these methods are based on solving directly for localized Wannier or Wannier-like orbitals that span the occupied subspace,$^{6-14}$ and thus rely on the localization properties of the Wannier functions. Second, a modern theory of electric polarization of crystalline insulators has just recently emerged;$^{15-20}$ it can be formulated in terms of a geometric phase in the Bloch representation, or equivalently, in terms of the locations of the Wannier centers.

The linear-scaling and polarization developments are at the heart of the motivation for the present work. However, there is another motivation that goes back to a theme that has recurred frequently in the chemistry literature over the last 40 years, namely, the study of "localized molecular orbitals."$^{21-26}$ The idea is to carry out, for a given molecule or cluster, a unitary transformation from the occupied one-particle Hamiltonian eigenstates to a set of localized orbitals that correspond more closely to the chemical (Lewis) view of molecular bond orbitals. It seems not to be widely appreciated that these are the exact analogues, for finite systems, of the Wannier functions defined for infinite periodic systems. Various criteria have been introduced for defining the localized molecular orbitals,$^{21-24}$ two of the most popular being the maximization of the Coulomb$^{23}$ or quadratic$^{24}$ self-interactions of the molecular orbitals. One of the motivations for such approaches is the notion that the localized molecular orbitals may form the basis for an efficient representation of electronic correlations in many-body approaches, and indeed this ought to be equally true in the extended, solid-state case.

One major reason why the Wannier functions have seen little practical use to date in solid-state applications is undoubtedly their nonuniqueness. Even in the case of a single isolated band, it is well known that the Wannier functions $w_n(\mathbf{r})$ are not unique, due to a phase indeterminacy $e^{i\phi_n(\mathbf{k})}$ in the Bloch orbitals $\psi_{n\mathbf{k}}(\mathbf{r})$. For this case, the conditions required to obtain a set of maximally localized, exponentially decaying Wannier functions are known.$^{2,27}$

In the present work we discuss the determination of the maximally localized Wannier functions for the case of composite bands. Now a stronger indeterminacy is present, representable by a free unitary matrix $U^{(\mathbf{k})}_{mn}$ among the occupied Bloch orbitals at every wave vector. We require the choice of a particular set of $U^{(\mathbf{k})}_{mn}$ according to the criterion that the sum $\Omega$ of the second moments of the corresponding Wannier functions be minimized. (This is the exact analogue of the criteria of Boys$^{24}$ for the molecular-orbital case.) We show that $\Omega$ can be decomposed into a sum of two contributions. The first is invariant with respect to the $U^{(\mathbf{k})}_{mn}$ and reflects the k-space dispersion of the band projection operator, while the second reflects the extent to which the Wannier functions fail to be eigenfunctions of the band-projected position operators. We show how this formulation reduces to previous ones in the case of a single isolated band, or in one dimension, or for centrosymmetric crystals.

We also describe a numerical algorithm for computing the optimally localized Wannier functions on a k-space mesh. The algorithm is designed to operate in a post-processing

0163-1829/97/56(20)/12847(19)/$10.00 56 12 847 © 1997 The American Physical Society



mode after a conventional band-structure calculation, taking as its input the Bloch functions computed on a mesh of k points. (Thus, it is not a linear-scaling method.) We present sample results for the optimally localized Wannier functions in Si, GaAs, molecular C$_2$H$_4$, and LiCl. It should be emphasized that this procedure generates incidentally a set of Wannier-center positions; these by themselves can sometimes be very useful for analyzing the bonding properties and the electronic polarization of disordered or distorted insulating materials.

In this work, we have not considered any further generalizations of the problem, although several interesting possibilities come to mind. For example, one could relax the constraint that the Wannier functions should be orthonormal to each other (in this case they should probably not be called "Wannier functions"). Such functions would correspond to the "localized orbitals" or "support functions" appearing in certain linear-scaling methods$^{6,10,13}$ and in the chemical-pseudopotential approach.$^{28-30}$ Alternatively, one could retain the orthonormality requirement, but ask to find a larger set of functions spanning a space containing the desired bands as a subspace. For example, in Si one could ask for a maximally localized set of four Wannier-like functions per atom spanning a space twice as large as, but containing, the space of the four occupied valence bands.$^{4,31}$ Again, this is very similar to what is done in certain linear-scaling methods.$^{10,12,13}$ These interesting generalizations deserve investigation, but have not been pursued here.

The manuscript is organized as follows. The problem is introduced in Sec. II. Expressions for the spread functional, and for its decomposition into gauge-invariant and gauge-dependent parts, are developed first in real space in Sec. III. Section IV then formulates the corresponding expressions in discrete k space (that is, on a mesh of wave vectors). Special features that arise in one dimension, or for a single isolated band, or for a crystal with inversion symmetry, are also discussed there, as is the steepest-descent minimization algorithm that we use. Some discussion and speculation about the asymptotic localization properties, and the real versus complex nature of the Wannier functions, appear in Sec. V. In Sec. VI we present test results for Si, GaAs, C$_2$H$_4$, and LiCl systems. Finally, in Sec. VII, we discuss the significance of the work, emphasizing possible applications of our approach. Some details of the real-space, discrete k-space, and continuous k-space formulations are deferred to Appendixes A, B, and C, respectively. In particular, the relationship of the present work to the theory of adiabatic quantum phases and quantum distances is discussed in Appendix C.

II. PRELIMINARIES

A. Isolated and composite bands

We confine ourselves here to the case of an independent-particle Hamiltonian $H = p^2/2m + V(\mathbf{r})$ with a real periodic potential $V(\mathbf{r})$. We thus assume the absence of electric and magnetic fields, and we suppress spin. The eigenfunctions of $H$ are the Bloch functions $\psi_{n\mathbf{k}}(\mathbf{r})$ labeled by band $n$ and wave vector $\mathbf{k}$.

A Bloch band is said to be isolated if it does not become degenerate with any other band anywhere in the Brillouin zone (BZ). Conversely, a group of bands are said to form a composite group if they are connected among themselves by degeneracies, but are isolated from all lower or higher bands. For example, in Si the four valence bands form a composite group, while in GaAs the lowest valence band is isolated and the higher three form a composite group.

In the case of isolated bands, it is natural to define Wannier functions individually for each band. That is, the Wannier function for band $n$ (together with its periodic images) spans the same space as does the isolated Bloch band. In the case of composite bands, however, it is more natural to consider a set of $J$ "generalized Wannier functions" that (together with their periodic images) span the same space as the composite set of $J$ Bloch bands. That is, the "generalized Bloch functions" $\psi_{n\mathbf{k}}$ that are connected with the $n$th generalized Wannier function will not necessarily be eigenstates of the Hamiltonian at this $\mathbf{k}$, but will be related to them by a $J\times J$ unitary transformation.

The formulation that follows is designed to apply equally to the isolated and composite cases. For the isolated case, $J=1$, and sums over $n$ can be ignored. For the composite case, the terms "Bloch function" and "Wannier function" should be understood to be meant in the generalized sense discussed above.

It may sometimes be convenient to consider a group of bands as composite even when some of the members are actually isolated. For example, one may wish to consider all of the occupied valence bands of an insulator as a composite group. This is rather natural in connection with linear-scaling algorithms and the theory of electronic polarization. Thus, for GaAs, one may choose to regard all four valence bands as a composite group. In this case the Wannier functions will resemble $\sigma$-bonded pairs of $sp^3$ hybrids, arguably the most natural choice. Moreover, the GaAs Wannier functions defined in this way turn out to be considerably more localized than those of the top three or bottom valence bands separately. Again, the formulation below should be taken to apply equally to this case, with $n$ running over the $J$ adjacent bands that are being considered as a composite group.

Finally, the formalism applies equally to any isolated band or composite group that may exist in a metal or insulator, regardless of occupation. However, because the expectation values of physical operators only depend upon occupied states, one is usually interested in the case of occupied bands in insulators.

B. Definitions

We denote by $w_n(\mathbf{r}-\mathbf{R})$ or $|\mathbf{R}n\rangle$ the Wannier function in cell $\mathbf{R}$ associated with band $n$, given in terms of the Bloch functions as

$$|\mathbf{R}n\rangle = \frac{V}{(2\pi)^3}\int d\mathbf{k}\, e^{-i\mathbf{k}\cdot\mathbf{R}}\, |\psi_{n\mathbf{k}}\rangle, \tag{1}$$

so that

$$|\psi_{n\mathbf{k}}\rangle = \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot\mathbf{R}}\, |\mathbf{R}n\rangle. \tag{2}$$

Here $V$ is the real-space primitive cell volume. It is easily shown that the Wannier functions form an orthonormal set. As usual, the periodic part of the Bloch function is defined as

$$u_{n\mathbf{k}}(\mathbf{r}) = e^{-i\mathbf{k}\cdot\mathbf{r}}\, \psi_{n\mathbf{k}}(\mathbf{r}). \tag{3}$$

As shown by Blount,$^{3}$ matrix elements of the position operator between Wannier functions take the form

$$\langle \mathbf{R}n|\mathbf{r}|\mathbf{0}m\rangle = i\,\frac{V}{(2\pi)^3}\int d\mathbf{k}\, e^{i\mathbf{k}\cdot\mathbf{R}}\, \langle u_{n\mathbf{k}}|\nabla_{\mathbf{k}}|u_{m\mathbf{k}}\rangle, \tag{4}$$

the converse relation being

$$\langle u_{n\mathbf{k}}|\nabla_{\mathbf{k}}|u_{m\mathbf{k}}\rangle = -i\sum_{\mathbf{R}} e^{-i\mathbf{k}\cdot\mathbf{R}}\, \langle \mathbf{R}n|\mathbf{r}|\mathbf{0}m\rangle. \tag{5}$$

In equations like these the $\nabla_{\mathbf{k}}$ is understood to act to the right, i.e., only on the ket. The consistency of these two equations is easily checked; the latter can be derived by noting that

$$\langle u_{n\mathbf{k}}|u_{m,\mathbf{k}+\mathbf{b}}\rangle = \langle \psi_{n\mathbf{k}}|e^{-i\mathbf{b}\cdot\mathbf{r}}|\psi_{m,\mathbf{k}+\mathbf{b}}\rangle = \sum_{\mathbf{R}} e^{-i\mathbf{k}\cdot\mathbf{R}}\, \langle \mathbf{R}n|e^{-i\mathbf{b}\cdot\mathbf{r}}|\mathbf{0}m\rangle,$$

and then equating first orders in $\mathbf{b}$. Similarly, equating second orders in $\mathbf{b}$ leads to

$$\langle \mathbf{R}n|r^2|\mathbf{0}m\rangle = -\frac{V}{(2\pi)^3}\int d\mathbf{k}\, e^{i\mathbf{k}\cdot\mathbf{R}}\, \langle u_{n\mathbf{k}}|\nabla_{\mathbf{k}}^2|u_{m\mathbf{k}}\rangle. \tag{6}$$

Introducing the notation $\bar{\mathbf{r}}_n = \langle \mathbf{0}n|\mathbf{r}|\mathbf{0}n\rangle$ and $\langle r^2\rangle_n = \langle \mathbf{0}n|r^2|\mathbf{0}n\rangle$ for the diagonal elements in the cell at the origin, we have

$$\bar{\mathbf{r}}_n = i\,\frac{V}{(2\pi)^3}\int d\mathbf{k}\, \langle u_{n\mathbf{k}}|\nabla_{\mathbf{k}}|u_{n\mathbf{k}}\rangle \tag{7}$$

and

$$\langle r^2\rangle_n = \frac{V}{(2\pi)^3}\int d\mathbf{k}\, \big|\, |\nabla_{\mathbf{k}} u_{n\mathbf{k}}\rangle \big|^2. \tag{8}$$

This last follows from Eq. (6) after an integration by parts.

C. Arbitrariness in definition of Wannier functions

As is well known, Wannier functions are not unique. For a single isolated band, the freedom in choice of the Wannier functions corresponds to the freedom in the choice of the phases of the Bloch orbitals as a function of wave vector $\mathbf{k}$. Thus, given one set of Bloch orbitals and associated Wannier functions, another equally good set is obtained from

$$|u_{n\mathbf{k}}\rangle \to e^{i\phi_n(\mathbf{k})}\, |u_{n\mathbf{k}}\rangle, \tag{9}$$

where $\phi_n$ is a real function of $\mathbf{k}$. Such a transformation preserves the Wannier center $\bar{\mathbf{r}}_n$ modulo a lattice vector,$^{3,15,16}$ but of course it does not preserve the spread $\langle r^2\rangle_n - \bar{\mathbf{r}}_n^2$. For a composite set of bands, the corresponding freedom is

$$|u_{n\mathbf{k}}\rangle \to \sum_m U^{(\mathbf{k})}_{mn}\, |u_{m\mathbf{k}}\rangle, \tag{10}$$

where $U^{(\mathbf{k})}_{mn}$ is a unitary matrix that mixes the bands at wave vector $\mathbf{k}$. Equation (9) can be regarded as a special case of Eq. (10) that results when the $U$ are chosen diagonal. The transformation (10) does not preserve the individual Wannier centers, but does preserve the sum of the Wannier centers, modulo a lattice vector.$^{15}$ We shall frequently refer to this freedom as a "gauge freedom" and the transformation (10) as a "gauge transformation."

Our goal is to pick out, from among the many arbitrary choices of Wannier functions, the particular set that is maximally localized according to some criterion. Our choice of criterion is introduced and justified in the following section. Of course, some arbitrariness will remain: (i) there will always be an arbitrary overall phase of each of the $J$ Wannier functions;$^{32}$ (ii) there is a freedom to permute the $J$ Wannier functions among themselves; and (iii) there is a freedom to translate any one of the $J$ Wannier functions by a lattice vector (that is, to decide which Wannier functions belong to the "home" unit cell labeled by $\mathbf{R}=\mathbf{0}$). Aside from these trivial remaining degrees of freedom, we expect to find a unique set of maximally localized Wannier functions.

We should mention that related approaches have been proposed in the literature (see Refs. 31, 33–36, and prior attempts in Refs. 37, 38) in order to construct localized Wannier functions starting from first-principles Bloch orbitals. In general, they have relied on separate, heuristic choices for the $U_{mn}$ in Eq. (10) and the $\phi_n$ in Eq. (9). The former transformation is used to remove the nonanalyticities at points of degeneracy in the Brillouin zone, and the latter one (in the spirit of Ref. 33) is applied separately to each resulting Wannier function to make it more localized. Although such approaches can provide reasonably localized Wannier functions in many cases, they do not provide the maximally localized set according to a pre-defined criterion, nor can they easily be generalized to systems having low symmetry.

III. SPREAD FUNCTIONAL IN REAL SPACE

As a measure of the total delocalization or spread of the Wannier functions, we introduce the functional

$$\Omega = \sum_n \left[ \langle r^2\rangle_n - \bar{\mathbf{r}}_n^2 \right] \tag{11}$$

(recall $\bar{\mathbf{r}}_n = \langle \mathbf{r}\rangle_n$). Eq. (11) is to be minimized with respect to the unitary transformations $U^{(\mathbf{k})}_{mn}$. A functional of this form has previously appeared as one possible definition$^{24}$ of the "localized molecular orbitals"$^{21-26}$ discussed in the chemistry literature. Other localization criteria, such as maximizing the sum of Coulomb self-energies of the orbitals$^{23}$ or the product of the separations of the centroids,$^{22}$ have also been suggested. We focus on the Wannier functions obtained by minimizing Eq. (11) for the following reasons. (i) The Wannier functions so determined correspond precisely to those considered by previous authors for the isolated-band case in one dimension (1D) (Refs. 2, 3, and 39) and in 3D.$^{3}$ (ii) In the 1D multiband case, the optimally localized Wannier functions defined by minimizing Eq. (11) turn out to be identical to the eigenfunctions of the projected position operator $PxP$,$^{39,40}$ as will be demonstrated shortly. (Here $P$ is the projection operator onto the group of bands under consideration,

$$P = \sum_{\mathbf{R}n} |\mathbf{R}n\rangle\langle \mathbf{R}n| = \sum_{n\mathbf{k}} |\psi_{n\mathbf{k}}\rangle\langle \psi_{n\mathbf{k}}|, \tag{12}$$

and $Q = 1 - P$ is the projection operator onto all other bands.) (iii) It is one of the functionals proposed in the chemistry$^{24}$ and physics literature$^{33,34,36,31}$ (but where the second term on the RHS of Eq. (11) is usually neglected). (iv) It leads to a particularly elegant formalism, allowing, for example, the decomposition into invariant, diagonal, and off-diagonal contributions as described below.

We find it convenient to decompose the functional (11) into two terms,

$$\Omega = \Omega_I + \tilde{\Omega}, \tag{13}$$

where

$$\Omega_I = \sum_n \Big[ \langle r^2\rangle_n - \sum_{\mathbf{R}m} \big|\langle \mathbf{R}m|\mathbf{r}|\mathbf{0}n\rangle\big|^2 \Big] \tag{14}$$

and

$$\tilde{\Omega} = \sum_n \sum_{\mathbf{R}m \neq \mathbf{0}n} \big|\langle \mathbf{R}m|\mathbf{r}|\mathbf{0}n\rangle\big|^2. \tag{15}$$

Clearly the second term is positive definite. While it is not immediately obvious, the first term is also positive definite, and, moreover, it is gauge invariant (i.e., independent of the choice of unitary transformations among the bands). To see this, we use the definitions of $P$ and $Q$ in terms of the Wannier functions to write

$$\Omega_I = \sum_{n\alpha} \langle \mathbf{0}n|r_\alpha Q r_\alpha|\mathbf{0}n\rangle = \sum_\alpha \mathrm{tr}_c[P r_\alpha Q r_\alpha] = \|PxQ\|_c^2 + \|PyQ\|_c^2 + \|PzQ\|_c^2. \tag{16}$$

Here $\mathrm{tr}_c$ indicates the trace per unit cell, and $\|A\|_c^2 = \mathrm{tr}_c[A^\dagger A]$. The last form makes it obvious that $\Omega_I$ is positive definite. Operators of the form $P\mathbf{r}Q$ have been discussed extensively by Nenciu;$^{41}$ unlike $\mathbf{r}$ itself, $P\mathbf{r}Q$ commutes with lattice translations, and its expectation value is well defined in any (normalizable) extended state. Thus, it follows that $\Omega_I$ is gauge invariant (i.e., invariant with respect to the choice of Wannier functions, or equivalently to the choice of the unitary mixing matrices $U^{(\mathbf{k})}_{mn}$). This will become even clearer in Sec. IV, where $\Omega_I$ is expressed in a finite-difference k-space representation.

It was stated earlier that in 1D the set of Wannier functions that minimizes the spread functional, Eq. (11), turns out to be identical to the set of eigenfunctions of the projected position operator $PxP$. This can now be seen as follows. Choose the Wannier functions $|0m\rangle$ to be eigenfunctions of $PxP$ with associated eigenvalues $\bar{x}_{0m}$. Then

$$\langle Rn|x|0m\rangle = \langle Rn|PxP|0m\rangle = \bar{x}_{0m}\, \delta_{R,0}\, \delta_{m,n}. \tag{17}$$

Clearly $\tilde{\Omega}$ vanishes, and since $\Omega_I$ is gauge invariant, this minimizes Eq. (13). Thus in 1D the solution is essentially trivial, even in the multiband case, and $\Omega_{\min} = \Omega_I$ at the solution.

From this point of view, it can now be understood that the essential difficulty in the three-dimensional case is that the operators $PxP$, $PyP$, and $PzP$ do not commute (or, in the language of Appendix A, that matrices $X$, $Y$, and $Z$ do not commute). For if they did, one could choose the Wannier functions to be simultaneous eigenfunctions of all three, and one could again make $\tilde{\Omega}$ vanish. But this is not generally the case, and the problem is to find a set of Wannier functions that makes the best possible compromise in the attempt to diagonalize all three simultaneously. Indeed, it appears very natural that the criterion should be simply to reduce, as far as possible, the mean-square average of all off-diagonal matrix elements of $x$, $y$, and $z$ between Wannier functions; this is precisely the criterion encoded into $\tilde{\Omega}$. A procedure for carrying out this minimization directly in real space is sketched in Appendix A. However, for crystalline solids with periodic boundary conditions, it is more straightforward to work in k space as discussed in the following section.

Finally, for later reference, it is useful to decompose $\tilde{\Omega}$ into band-off-diagonal and band-diagonal pieces,

$$\tilde{\Omega} = \Omega_{\rm OD} + \Omega_{\rm D}, \tag{18}$$

where

$$\Omega_{\rm OD} = \sum_{m\neq n} \sum_{\mathbf{R}} \big|\langle \mathbf{R}m|\mathbf{r}|\mathbf{0}n\rangle\big|^2 \tag{19}$$

and

$$\Omega_{\rm D} = \sum_n \sum_{\mathbf{R}\neq\mathbf{0}} \big|\langle \mathbf{R}n|\mathbf{r}|\mathbf{0}n\rangle\big|^2. \tag{20}$$

IV. SPREAD FUNCTIONAL IN k SPACE

A. Transition to k space

We now derive expressions for $\Omega$, $\Omega_I$, $\tilde{\Omega}$, etc. in terms of a discretized k-space mesh. We begin by substituting expressions (7) and (8) into Eq. (11), and making use of

$$\frac{V}{(2\pi)^3}\int d\mathbf{k} \to \frac{1}{N}\sum_{\mathbf{k}}, \tag{21}$$

where $N$ is the number of real-space cells in the system, or equivalently, the number of k-points in the Brillouin zone. Using the finite-difference expressions for $\nabla_{\mathbf{k}}$ and $\nabla_{\mathbf{k}}^2$ introduced in Appendix B, we have

$$\bar{\mathbf{r}}_n = \frac{i}{N}\sum_{\mathbf{k},\mathbf{b}} w_b\, \mathbf{b}\, \big[\langle u_{n\mathbf{k}}|u_{n,\mathbf{k}+\mathbf{b}}\rangle - 1\big] \tag{22}$$

and

$$\langle r^2\rangle_n = \frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b\, \big[2 - 2\,\mathrm{Re}\,\langle u_{n\mathbf{k}}|u_{n,\mathbf{k}+\mathbf{b}}\rangle\big]. \tag{23}$$
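As a numerical aside on the real-space argument of Sec. III [Eqs. (13)-(17)], the following is a minimal sketch of the finite-system analogue of the 1D result: diagonalizing the position operator projected onto an occupied subspace removes all off-diagonal position matrix elements, so the total spread equals the invariant part $\mathrm{tr}[P x Q x]$. The model (an open, dimerized tight-binding chain), the function name `projected_position_demo`, and all parameter values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def projected_position_demo(n_sites=40, n_occ=20, t=1.0, delta=0.3):
    """Finite open chain (hypothetical model): compare tr[P x Q x] with the
    spread of the eigenfunctions of the projected position operator."""
    # Alternating nearest-neighbour hoppings t*(1 +/- delta) open a gap at half filling
    hop = t * (1.0 + delta * (-1.0) ** np.arange(n_sites - 1))
    H = -(np.diag(hop, 1) + np.diag(hop, -1))
    # Site positions, centred on the origin to limit round-off
    x = np.arange(n_sites, dtype=float) - 0.5 * (n_sites - 1)

    _, vecs = np.linalg.eigh(H)
    occ = vecs[:, :n_occ]                 # columns span the occupied subspace
    P = occ @ occ.T                       # projector onto that subspace
    Q = np.eye(n_sites) - P
    X = np.diag(x)

    omega_inv = float(np.trace(P @ X @ Q @ X))   # finite-system analogue of Eq. (16)

    # Matrix of x within the subspace; its eigenvectors are eigenfunctions of P x P
    x_proj = occ.T @ X @ occ
    centers, U = np.linalg.eigh(x_proj)
    localized = occ @ U

    def spread(orbitals):
        # sum_n [ <x^2>_n - <x>_n^2 ], the 1D form of Eq. (11)
        mean_x = np.einsum("in,i,in->n", orbitals, x, orbitals)
        mean_x2 = np.einsum("in,i,in->n", orbitals, x**2, orbitals)
        return float(np.sum(mean_x2 - mean_x**2))

    return omega_inv, spread(occ), spread(localized), centers

omega_inv, spread_eig, spread_loc, centers = projected_position_demo()
print(f"Omega_I = {omega_inv:.6f}, spread of localized set = {spread_loc:.6f}")
print(f"spread of energy eigenstates = {spread_eig:.3f}")
```

In this sketch the energy eigenstates play the role of an arbitrary initial gauge; their spread exceeds the invariant part, while the eigenfunctions of the projected position operator reach it to within round-off.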
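The finite-difference sums of Eqs. (22) and (23) can be tried out on a uniform 1D mesh with a short script. This is only a sketch under stated assumptions: a single band, lattice constant 1, neighbor vectors $b = \pm\Delta k$ with weight $w_b = 1/(2\Delta k^2)$ (chosen so that $\sum_b w_b b^2 = 1$), and a hypothetical smooth-gauge overlap $\langle u_k|u_{k+\Delta k}\rangle = \rho\, e^{-i\Delta k\, x_0}$ that mimics a Wannier function centered at `x0` with spread `sigma2`. The function name `naive_moments_1d` is illustrative.

```python
import numpy as np

def naive_moments_1d(M_plus, dk):
    """Evaluate Eqs. (22) and (23) for one band on a uniform, periodic 1D mesh.

    M_plus[j] is assumed to hold <u_{k_j} | u_{k_j + dk}>.
    """
    M_plus = np.asarray(M_plus, dtype=complex)
    n_k = M_plus.size
    # Overlap with the left neighbour: <u_k|u_{k-dk}> = conj(<u_{k-dk}|u_k>)
    M_minus = np.conj(np.roll(M_plus, 1))
    w_b = 1.0 / (2.0 * dk**2)            # assumed 1D weight: sum_b w_b * b * b = 1
    center = 0.0 + 0.0j
    second = 0.0
    for b, M in ((+dk, M_plus), (-dk, M_minus)):
        center += 1j / n_k * np.sum(w_b * b * (M - 1.0))           # Eq. (22)
        second += 1.0 / n_k * np.sum(w_b * (2.0 - 2.0 * M.real))   # Eq. (23)
    return center.real, second

# Hypothetical model overlaps (not from the paper)
n_k, x0, sigma2 = 200, 0.3, 0.04
dk = 2.0 * np.pi / n_k
rho = 1.0 - 0.5 * sigma2 * dk**2
M_plus = rho * np.exp(-1j * dk * x0) * np.ones(n_k)

center, second = naive_moments_1d(M_plus, dk)
print(f"center = {center:.6f}, spread = {second - center**2:.6f}")
```

For this model the sums return the assumed center and spread up to corrections of relative order $\Delta k^2$, which is the sense in which Eqs. (22) and (23) approach Eqs. (7) and (8) on a dense mesh.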

Here $\mathbf{b}$ are vectors connecting each k point to its near neighbors and $w_b$ are associated weights (see Appendix B). Clearly, these expressions reduce to Eqs. (7) and (8) in the limit of dense mesh spacing ($N\to\infty$, $b\to 0$). However, we should like to insist on a second desirable property as well: namely, that for a given k mesh, $\bar{\mathbf{r}}_n$ and $\langle r^2\rangle_n$ should transform as expected when the definition of $|\mathbf{0}n\rangle$ is shifted by a lattice vector. (This corresponds to changing the choice of which Wannier functions belong to the "home" unit cell.) That is, when $|u_{n\mathbf{k}}\rangle \to |u_{n\mathbf{k}}\rangle e^{-i\mathbf{k}\cdot\mathbf{R}}$, so that $\langle u_{n\mathbf{k}}|u_{n,\mathbf{k}+\mathbf{b}}\rangle \to \langle u_{n\mathbf{k}}|u_{n,\mathbf{k}+\mathbf{b}}\rangle e^{-i\mathbf{b}\cdot\mathbf{R}}$, we should find

$$\bar{\mathbf{r}}_n \to \bar{\mathbf{r}}_n + \mathbf{R},$$

$$\langle r^2\rangle_n \to \langle r^2\rangle_n + 2\,\bar{\mathbf{r}}_n\cdot\mathbf{R} + R^2, \tag{24}$$

so that $\Omega$ will be unchanged. Expressions (22) and (23) do not obey these requirements, but can be modified to do so. As long as the modifications leave the summands unchanged to order $b$ and $b^2$ in Eqs. (22) and (23), respectively, they will still reduce to Eqs. (7) and (8) in the continuum limit. Let

$$M^{(\mathbf{k},\mathbf{b})}_{mn} = \langle u_{m\mathbf{k}}|u_{n,\mathbf{k}+\mathbf{b}}\rangle \tag{25}$$

and, for a given $n$, $\mathbf{k}$, and $\hat{\mathbf{b}}$, let

$$M^{(\mathbf{k},\mathbf{b})}_{nn} = 1 + ixb + \tfrac{1}{2}\, y b^2 + O(b^3). \tag{26}$$

By expanding $\langle u_{n,\mathbf{k}+\mathbf{b}}|u_{n,\mathbf{k}+\mathbf{b}}\rangle = 1$ order by order in $b$, it is easy to check that $x$ and $y$ are real. Then, referring to Eqs. (22) and (23), we have

$$M^{(\mathbf{k},\mathbf{b})}_{nn} - 1 = ixb + O(b^2), \tag{27}$$

$$2 - 2\,\mathrm{Re}\, M^{(\mathbf{k},\mathbf{b})}_{nn} = -y b^2 + O(b^3). \tag{28}$$

It is also easy to check that

$$ixb = i\,\mathrm{Im}\ln M^{(\mathbf{k},\mathbf{b})}_{nn} + O(b^2), \tag{29}$$

$$-y b^2 = 1 - |M^{(\mathbf{k},\mathbf{b})}_{nn}|^2 + x^2 b^2 + O(b^3). \tag{30}$$

Thus, in place of Eq. (22) we write

$$\bar{\mathbf{r}}_n = -\frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b\, \mathbf{b}\, \mathrm{Im}\ln M^{(\mathbf{k},\mathbf{b})}_{nn}, \tag{31}$$

and, in place of Eq. (23),

$$\langle r^2\rangle_n = \frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b\, \Big\{ \big[1 - |M^{(\mathbf{k},\mathbf{b})}_{nn}|^2\big] + \big[\mathrm{Im}\ln M^{(\mathbf{k},\mathbf{b})}_{nn}\big]^2 \Big\}. \tag{32}$$

When inserted in Eq. (11), this gives our operational definition of the spread functional $\Omega$.

It is easy to check that Eqs. (31) and (32) obey conditions (24) exactly, while still reducing to Eqs. (7) and (8) in the continuum limit. The expression for the Wannier center, Eq. (31), is strongly reminiscent of the Berry-phase expression of Refs. 15 and 16, and reduces to it for an isolated band in 1D. [It is also exactly invariant, modulo a lattice vector, under any change of phases of the form of Eq. (9), provided that the phases still vary smoothly enough with $\mathbf{k}$ to prevent ambiguity in the choice of branch when evaluating $\ln M^{(\mathbf{k},\mathbf{b})}_{nn}$. Of course, it is not invariant under an arbitrary gauge transformation, Eq. (10).]

Note that expression (32) for $\langle r^2\rangle_n$ is not unique, even when insisting on the invariance condition (24). For example, replacing

$$1 - |M^{(\mathbf{k},\mathbf{b})}_{nn}|^2 \to -2\,\mathrm{Re}\ln M^{(\mathbf{k},\mathbf{b})}_{nn} \tag{33}$$

results in an equally valid finite-difference formula for $\Omega$. However, use of the form (32) facilitates a connection with the decomposition of $\Omega = \Omega_I + \Omega_{\rm OD} + \Omega_{\rm D}$ into invariant, off-diagonal, and diagonal components as in Eqs. (13) and (18). Following the lines of the formalism above, one finds that Eq. (14) becomes

$$\Omega_I = \frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b \Big( J - \sum_{mn} |M^{(\mathbf{k},\mathbf{b})}_{mn}|^2 \Big) = \frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b\, \mathrm{tr}\big[P^{(\mathbf{k})} Q^{(\mathbf{k}+\mathbf{b})}\big], \tag{34}$$

where $P^{(\mathbf{k})} = \sum_n |u_{n\mathbf{k}}\rangle\langle u_{n\mathbf{k}}|$, $Q^{(\mathbf{k})} = 1 - P^{(\mathbf{k})}$, and the band indices $m,n$ run over $1, \ldots, J$. Similarly, Eqs. (19) and (20) become

$$\Omega_{\rm OD} = \frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b \sum_{m\neq n} |M^{(\mathbf{k},\mathbf{b})}_{mn}|^2 \tag{35}$$

and

$$\Omega_{\rm D} = \frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b \sum_n \big( -\mathrm{Im}\ln M^{(\mathbf{k},\mathbf{b})}_{nn} - \mathbf{b}\cdot\bar{\mathbf{r}}_n \big)^2. \tag{36}$$

From these expressions, it is again evident that $\Omega_I$, $\Omega_{\rm OD}$, and $\Omega_{\rm D}$ are all positive definite.

Equation (34) also now shows clearly that $\Omega_I$ is gauge-invariant [i.e., independent of the choice of the Wannier functions, Eq. (10)]. Heuristically, $\Omega_I$ represents the degree of dispersion of the band projection operator $P^{(\mathbf{k})}$ through the Brillouin zone. That is, $\Omega_I$ is small insofar as $P^{(\mathbf{k})}$ is nearly independent of $\mathbf{k}$. (Note that $\mathrm{tr}[P_1 Q_2] = \|P_1 - P_2\|^2/2$ represents the "spillage," or degree of mismatch, between the spaces 1 and 2.) Since $\Omega_I$ is invariant with respect to gauge transformations (10), it can be evaluated once and for all in the initial gauge (i.e., using the initial $u_{n\mathbf{k}}$) before performing the minimization procedure outlined below.

It is amusing to note, following the ideas of Refs. 42–44, that one can define a "quantum distance" between two wave vectors $\mathbf{k}$ and $\mathbf{k}'$ as $dl^2 = \mathrm{tr}[P^{(\mathbf{k})} Q^{(\mathbf{k}')}]$, thus inducing a metric upon the k space. The invariant part of the spread functional $\Omega_I$ turns out to be nothing other than the Brillouin-zone average of the trace of this metric. We discuss the properties of this metric, and speculate about its utility, in Appendix C.

B. Gradient of spread functional

We now consider the first-order change of the spread functional $\Omega$ arising from an infinitesimal gauge transformation, Eq. (10), given by

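$$U^{(\mathbf{k})}_{mn} = \delta_{mn} + dW^{(\mathbf{k})}_{mn}, \tag{37}$$

where $dW$ is an infinitesimal anti-Hermitian matrix, $dW^\dagger = -dW$, so that

$$|u_{n\mathbf{k}}\rangle \to |u_{n\mathbf{k}}\rangle + \sum_m dW^{(\mathbf{k})}_{mn}\, |u_{m\mathbf{k}}\rangle. \tag{38}$$

We seek an expression for $d\Omega/dW^{(\mathbf{k})}_{mn}$. We use the convention

$$\left(\frac{dF}{dW}\right)_{nm} = \frac{dF}{dW_{mn}} \tag{39}$$

(note the reversal of indices), so that

$$\frac{d\,\mathrm{tr}[dW\, B]}{dW} = B, \tag{40}$$

$$\frac{d\,\mathrm{Re}\,\mathrm{tr}[dW\, B]}{dW} = \mathcal{A}[B], \tag{41}$$

$$\frac{d\,\mathrm{Im}\,\mathrm{tr}[dW\, B]}{dW} = \mathcal{S}[B], \tag{42}$$

where $\mathcal{A}$ and $\mathcal{S}$ are the superoperators $\mathcal{A}[B] = (B - B^\dagger)/2$ and $\mathcal{S}[B] = (B + B^\dagger)/2i$. As we shall see shortly, it is possible to cast $d\Omega$ into the form of the numerators of Eqs. (41) and (42).

For the present purpose it is convenient to write $\Omega = \Omega_{I,\rm OD} + \Omega_{\rm D}$, where $\Omega_{\rm D}$ is the diagonal part given by Eq. (36), and the invariant and off-diagonal parts are combined into

$$\Omega_{I,\rm OD} = \Omega_I + \Omega_{\rm OD} = \frac{1}{N}\sum_{\mathbf{k},\mathbf{b}} w_b \sum_n \big[ 1 - |M^{(\mathbf{k},\mathbf{b})}_{nn}|^2 \big]. \tag{43}$$

From Eq. (38) it follows that

$$dM^{(\mathbf{k},\mathbf{b})}_{nn} = -\big[dW^{(\mathbf{k})} M^{(\mathbf{k},\mathbf{b})}\big]_{nn} + \big[M^{(\mathbf{k},\mathbf{b})} dW^{(\mathbf{k}+\mathbf{b})}\big]_{nn}. \tag{44}$$

Using $M^{(\mathbf{k}+\mathbf{b},-\mathbf{b})} = [M^{(\mathbf{k},\mathbf{b})}]^\dagger$ and $dW = -dW^\dagger$, the second term in Eq. (44) can be transformed to become $-\big[dW^{(\mathbf{k}+\mathbf{b})} M^{(\mathbf{k}+\mathbf{b},-\mathbf{b})}\big]^*_{nn}$. Defining

$$R^{(\mathbf{k},\mathbf{b})}_{mn} = M^{(\mathbf{k},\mathbf{b})}_{mn}\, M^{(\mathbf{k},\mathbf{b})*}_{nn}, \tag{45}$$

we thus find

$$d\Omega_{I,\rm OD} = \frac{4}{N}\sum_{\mathbf{k},\mathbf{b}} w_b\, \mathrm{Re}\,\mathrm{tr}\big[dW^{(\mathbf{k})} R^{(\mathbf{k},\mathbf{b})}\big]. \tag{46}$$

Similarly, defining

$$q^{(\mathbf{k},\mathbf{b})}_n = \mathrm{Im}\ln M^{(\mathbf{k},\mathbf{b})}_{nn} + \mathbf{b}\cdot\bar{\mathbf{r}}_n \tag{47}$$

and

$$\tilde{R}^{(\mathbf{k},\mathbf{b})}_{mn} = \frac{M^{(\mathbf{k},\mathbf{b})}_{mn}}{M^{(\mathbf{k},\mathbf{b})}_{nn}}, \tag{48}$$

Eq. (36) gives for the diagonal part

$$d\Omega_{\rm D} = \frac{2}{N}\sum_{\mathbf{k},\mathbf{b}} w_b \sum_n q^{(\mathbf{k},\mathbf{b})}_n\, \mathrm{Im}\big[ -dW^{(\mathbf{k})} \tilde{R}^{(\mathbf{k},\mathbf{b})} + dW^{(\mathbf{k}+\mathbf{b})} \tilde{R}^{(\mathbf{k}+\mathbf{b},-\mathbf{b})} \big]_{nn}. \tag{49}$$

Substituting $q^{(\mathbf{k}+\mathbf{b},-\mathbf{b})}_n = -q^{(\mathbf{k},\mathbf{b})}_n$, the two terms can be combined, resulting in

$$d\Omega_{\rm D} = -\frac{4}{N}\sum_{\mathbf{k},\mathbf{b}} w_b\, \mathrm{Im}\,\mathrm{tr}\big[dW^{(\mathbf{k})} T^{(\mathbf{k},\mathbf{b})}\big], \tag{50}$$

where

$$T^{(\mathbf{k},\mathbf{b})}_{mn} = \tilde{R}^{(\mathbf{k},\mathbf{b})}_{mn}\, q^{(\mathbf{k},\mathbf{b})}_n. \tag{51}$$

We thus arrive at the desired expression for the gradient of the spread functional,

$$G^{(\mathbf{k})} = \frac{d\Omega}{dW^{(\mathbf{k})}} = 4\sum_{\mathbf{b}} w_b \big( \mathcal{A}[R^{(\mathbf{k},\mathbf{b})}] - \mathcal{S}[T^{(\mathbf{k},\mathbf{b})}] \big). \tag{52}$$

We note, for completeness, that making the replacement (33) has just the effect of replacing $R$ by $\tilde{R}$ in the first term above.

The condition for having found a minimum is that the above expression should vanish. We discuss the numerical minimization of the spread functional by steepest descents, using this gradient expression, in Sec. IV D.

C. Special cases

1. One dimension

As mentioned in Sec. III, in 1D it should be possible to choose the Wannier functions to be eigenfunctions of the band-projected position operator $PxP$, and thus to make $\tilde{\Omega} = \Omega_{\rm OD} + \Omega_{\rm D}$ vanish. Unfortunately, on a finite k mesh $\tilde{\Omega}$ cannot generally be made to vanish completely. At the minimum, $\Omega_{\rm D}$ does vanish, but $\Omega_{\rm OD}$ does not, leaving a remainder that is expected to approach zero as $O(b^2)$ with mesh spacing $b$.

First, note that starting from any given gauge, it is straightforward to adjust the phases of the $|u_{nk_j}\rangle$ in order to make $\Omega_{\rm D} = 0$ without affecting $\Omega_{\rm OD}$ whatsoever. For each $n$, let $\lambda_n = s_n/|s_n|$ where $s_n = \prod_{j=0}^{N-1} M^{(k_j,+b)}_{nn}$ (thus $\lambda_n$ is the "Berry phase" of band $n$); then, starting from the first point $j=0$, recursively set the phase of $|u_{n,k_j+b}\rangle$ such that $M^{(k_j,+b)}_{nn} = \lambda_n^{1/N}$, for successive k points $j$. Then all the $M^{(k_j,+b)}_{nn}$ will have the same phase and $\Omega_{\rm D}$ will vanish. This operation has no effect whatsoever on the magnitudes of the elements of $M^{(k,+b)}_{mn}$, and so, by Eq. (35), it leaves $\Omega_{\rm OD}$ unchanged. This argument demonstrates that $\Omega_{\rm D} = 0$ and thus $\tilde{\Omega} = \Omega_{\rm OD}$ at the minimum.

A good starting guess that will make $\Omega_{\rm OD}$ rather small (and keep $\Omega_{\rm D} = 0$) can be constructed as follows. We first establish a notion of "parallel transport" of the Bloch functions. Starting with some arbitrary choice (from among all possible $J\times J$ unitary rotations) of the $|u_{nk_0}\rangle$ at an initial k point $k_0$, we choose the $|u_{n,k_0+b}\rangle$ at the next point $k_0+b$ by insisting that $M^{(k_0,+b)}_{mn}$ should be Hermitian. [This choice is uniquely given by the singular value decomposition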

$M = V\Sigma W^\dagger$, where $V$ and $W$ are unitary and $\Sigma$ is a diagonal matrix with nonnegative diagonal elements. Then $M = (V\Sigma V^\dagger)(VW^\dagger)$; and by appropriate unitary rotation, the $VW^\dagger$ term can be eliminated, leaving $M$ Hermitian.] This procedure is repeated, progressing from k point to k point [and using $u_{n,-\pi/a}(x) = u_{n,\pi/a}(x)\,e^{2\pi i x/a}$ when crossing the Brillouin-zone boundary] until the loop is completed, establishing a new set of states at $k_0$ that are related to the initial ones by a unitary transformation $\Lambda$. (This matrix $\Lambda$ is the generalization of the Berry phase$^{45}$ to a non-Abelian multidimensional manifold.$^{20,46-48}$) Next, one diagonalizes $\Lambda = V\lambda V^\dagger$, and rotates the bands at every k point by the same unitary matrix $V$. Having done this, one finds that each state $|u_{nk_0}\rangle$ is carried onto itself by parallel transport around the loop, except that it returns with an excess phase $\lambda_n$. Finally, defining $\gamma_n = \lambda_n^{-1/N}$ and modifying the phases as $|u_{nk_j}\rangle \rightarrow \gamma_n^{\,j}\,|u_{nk_j}\rangle$, we arrive at the desired solution (parallel-transport gauge).

At this solution, each $M^{(k,+b)}_{mn} = K^{(k)}_{mn}\gamma_n$ with $K$ Hermitian. It follows that the $\mathrm{Im}\ln M^{(k,+b)}_{nn}$ are independent of $k$, the Wannier centers $\bar{x}_n$ are determined by the $\lambda_n$, and thus $\Omega_{\rm D}$ of Eq. (36) vanishes. From Eq. (35) it can be seen that $\Omega_{\rm OD}$ does not generally vanish. However, $\Omega_{\rm OD}$ depends only on the matrices $K^{(k)}_{mn}$, and these can be shown to scale as $\delta_{mn} + O(b^2)$, so that $|M^{(k,+b)}_{mn}|^2$ is expected to scale as $O(b^4)$, and $\Omega_{\rm OD}$ as $O(b^2)$.

If a minimization of $\Omega$ is then carried out starting from this parallel-transport solution, one expects $\Omega_{\rm D}$ to remain zero and $\Omega_{\rm OD}$ to be reduced slightly, the reduction again being expected to be $O(b^2)$. The Wannier centers will also presumably shift slightly.

If one is mainly interested in the Wannier centers in the one-dimensional case, it may be preferable to take these from the parallel-transport solution (i.e., from the $\lambda_n$), rather than from the $\bar{x}_n$ at the minimum. The former approach corresponds more closely with the Berry-phase viewpoint,$^{15-17,20}$ and indeed the sum of the Wannier centers so defined corresponds to the usual formula for the electronic polarization.$^{15,16}$ (Actually, for this purpose, the full parallel-transport construction need not be carried out. $\Lambda$ may be calculated as the product of the unitary parts of the $M$ matrices in any given representation, and the $\lambda_n$ obtained as its eigenvalues. By "unitary part" we mean the $VW^\dagger$ taken from the singular value decomposition $M = V\Sigma W^\dagger$.) On the other hand, the parallel-transport formulation does not easily generalize to higher dimensions. Thus, the approach of minimizing the $\Omega$ functional appears to be the most natural one in

$$G^{(\mathbf{k})} = 4i \sum_{\mathbf{b}} w_b \,\mathrm{Im}\ln M^{(\mathbf{k},\mathbf{b})}. \tag{53}$$

At the solution, this expression must vanish. Starting from some initial guess on the phases of the $|u_{\mathbf{k}}\rangle$ and making the substitution of Eq. (9), it can be seen that Eq. (53) corresponds to a solution of the Laplace equation for the phase field $\phi(\mathbf{k})$. This corresponds closely to the discussion in the vicinity of Eq. (5.15) of Ref. 3.

The quantity $-\sum_{\mathbf{b}} w_b\,\mathbf{b}\,\mathrm{Im}\ln M^{(\mathbf{k},\mathbf{b})}$ is a finite-difference representation of the vector field $\mathbf{A}(\mathbf{k}) = i\langle u_{\mathbf{k}}|\nabla_{\mathbf{k}}|u_{\mathbf{k}}\rangle$; in the language of the theory of geometrical phases, $\mathbf{A}(\mathbf{k})$ is known as the "gauge potential" or "Berry connection."$^{20,45,47}$ The average value of $\mathbf{A}(\mathbf{k})$ is gauge-invariant (modulo a quantum) and is set by the Berry phase,$^{15-17}$ but $\mathbf{A}(\mathbf{k})$ is locally gauge-dependent. The minimization of $\Omega$ via the solution of the Laplace equation selects the gauge that makes $\nabla\cdot\mathbf{A}$ vanish, but its curl, $\mathbf{B} = \nabla\times\mathbf{A}$, is generally nonzero. In fact, $\mathbf{B}$, which is known as the "Berry curvature," is a gauge-invariant quantity; it can be regarded as an intrinsic property of the band.$^{20,48}$

Since $\mathbf{A}(\mathbf{k})$ is periodic in k space, one can alternatively think in terms of the Fourier coefficients $\mathbf{A}(\mathbf{R})$. These can be divided into three contributions: the uniform part, $\mathbf{A}(\mathbf{R}=0)$; and, for $\mathbf{R}\neq 0$, the longitudinal and transverse parts $\mathbf{A}_L(\mathbf{R})$ and $\mathbf{A}_T(\mathbf{R})$, i.e., the components of $\mathbf{A}(\mathbf{R})$ parallel and perpendicular to $\hat{\mathbf{R}}$, respectively. The uniform part gives the Wannier center; the longitudinal part is the part that can be made to vanish by appropriate choice of gauge; and the transverse part is gauge invariant (it is related to the Berry curvature) and determines the minimum value of $\Omega_{\rm D}$. In fact, the individual Fourier components $\mathbf{A}(\mathbf{R})$ can be related to the matrix elements $\langle\mathbf{R}|\mathbf{r}|\mathbf{0}\rangle$ of Eq. (15); it thus follows that at the solution, the latter are purely transverse, $\mathbf{A}(\mathbf{R})\cdot\mathbf{R} = 0$. Unfortunately, the picture does not appear to remain so simple in the multiband case, as discussed in Appendix C.

The Berry curvature, or equivalently, the transverse part of the Berry connection, can easily be shown to vanish for an isolated band in a crystal with inversion symmetry (see Sec. IV C 3); in this case the solution for $\mathbf{A}(\mathbf{k})$ is a perfectly uniform one, and $\Omega_{\rm D}$ vanishes at the solution. In a noncentrosymmetric crystal, however, this is not the case, since a nonzero Berry curvature is generally present. This provides a complementary viewpoint, for the single-band case, on the fact that the noninvariant part $\widetilde{\Omega}$ of the spread functional cannot generally be made to vanish.
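The shortcut noted for the one-dimensional case, in which $\Lambda$ is assembled from the unitary parts of the $M$ matrices and the $\lambda_n$ are read off as its eigenvalues, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the toy overlap matrices, and the sign convention $\lambda_n = e^{-2\pi i \bar{x}_n/a}$ (with $\Lambda$ taken as the ordered product of the $M^{(k,+b)}$ around the loop) are assumptions made here for the example.

```python
import numpy as np

def unitary_part(M):
    """Unitary factor V W^dagger of the SVD M = V Sigma W^dagger."""
    V, _, Wh = np.linalg.svd(M)  # numpy returns W^dagger as the third output
    return V @ Wh

def wannier_centers_1d(M_loop, a=1.0):
    """Centers from the eigenvalues of Lambda, the ordered product of unitary parts."""
    Lam = np.eye(M_loop[0].shape[0], dtype=complex)
    for M in M_loop:
        Lam = Lam @ unitary_part(M)
    lam = np.linalg.eigvals(Lam)
    # Assumed convention: lambda_n = exp(-2 pi i x_n / a); centers are defined modulo a.
    return np.sort((-a / (2.0 * np.pi) * np.angle(lam)) % a)

# Hypothetical toy input: two bands whose Wannier centers sit at 0.2a and 0.7a,
# scrambled by a random (periodic) gauge rotation at every k point.
rng = np.random.default_rng(0)
N, a = 12, 1.0
b = 2.0 * np.pi / (N * a)
x_true = np.array([0.2, 0.7])
D = np.diag(np.exp(-1j * b * x_true))

def random_unitary(n):
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return Q

gauges = [random_unitary(2) for _ in range(N)]
gauges.append(gauges[0])  # the gauge must be periodic around the loop
# The factor 0.9 mimics the non-unitarity of finite-difference overlaps.
M_loop = [0.9 * gauges[j].conj().T @ D @ gauges[j + 1] for j in range(N)]

centers = wannier_centers_1d(M_loop, a)
print(centers)
```

Because the eigenvalues of the loop product are unchanged by the gauge rotations, the centers come out the same in any representation, which is the point of the parenthetical remark in the text.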
higher dimensions, and it gives results that differ only very slightly from the parallel-transport solution for reasonable meshes in 1D.

2. Isolated band in multiple dimensions

For the case of an isolated band in multiple dimensions, the problem of finding the optimally localized Wannier function maps onto the problem of solving the Laplace equation for a phase field,$^{3,49}$ as described next. $\Omega_{\rm OD}$ is not present, and the problem reduces to minimizing $\Omega_{\rm D}$, so that only the second term in Eq. (52) appears. Clearly $\tilde{R}$ is identically one and $T^{(\mathbf{k},\mathbf{b})} = q^{(\mathbf{k},\mathbf{b})}$ is real, so that Eq. (52) becomes

3. Inversion symmetry

When inversion symmetry $V(\mathbf{r}) = V(-\mathbf{r})$ is present, the cell-periodic Bloch functions can be chosen to be real in the reciprocal representation; that is, $u_{n\mathbf{k}}(\mathbf{r}) = \sum_{\mathbf{G}} u_{n\mathbf{k}}(\mathbf{G})\exp(i\mathbf{G}\cdot\mathbf{r})$ with $u_{n\mathbf{k}}(\mathbf{G})$ real. It might naively appear that all the $M^{(\mathbf{k},\mathbf{b})}_{mn}$ matrices could then be chosen real, and that the solution of the minimization problem might be trivial in some sense. This is not quite true. Even for an isolated band, there is the complication that the Berry phase of the band may be $-1$ instead of $+1$; in this case the $u_{n\mathbf{k}}(\mathbf{G})$ can be chosen real locally (i.e., in a small neighborhood around any given $\mathbf{k}$), but not globally. But this really

only means that the corresponding Wannier function has definite symmetry under inversion through a symmetry center ("Wyckoff position") other than the one at the origin, and the Berry phase can be reset to $+1$ by a shift of origin. For the case of composite bands, however, the problem is to choose a particular gauge transformation [Eq. (10)], not just a phase transformation [Eq. (9)], and for this the presence of inversion symmetry does not provide any obvious solution.

For example, consider the case of the four valence bands of Si. (Numerical results for this case appear in Sec. VI A.) Taking the origin at the center of the bond oriented along [111], it turns out to be possible to choose one of the Wannier functions to have inversion symmetry about the origin, while the other three have inversion symmetry about other Wyckoff positions (those corresponding to the other three bond centers), and the remaining Wyckoff positions (tetrahedral and octahedral interstitial positions) are unoccupied.$^{4}$ This would have been hard to guess based on symmetry alone (although it is natural from a chemical point of view). Because each Wannier function does have its own inversion symmetry, it turns out that $\Omega_{\rm D}$ does vanish for Si. However, $\Omega_{\rm OD} \neq 0$. The contribution to $\Omega_{\rm OD}$ from a given pair $\{mn\}$ of Wannier functions is related to the matrix elements $\langle\mathbf{R}m|\mathbf{r}|\mathbf{0}n\rangle$. These matrix elements can be shown to vanish if, in addition to obeying inversion symmetry individually, the two Wannier functions are translational images of one another; but this is certainly not generally the case. (In the language of Appendix C, the fact that $\Omega_{\rm OD} \neq 0$ for Si is related to the fact that the Berry curvature tensor does not vanish for this system.)

Finally, in some cases it might be possible to choose all the Wannier functions to have definite symmetry under inversion, but the solution that minimizes $\Omega$ may spontaneously break the inversion symmetry. Some cases of this sort are discussed in Secs. VI C and VI D below.

4. Molecular supercells and single k-point sampling

In the context of plane-wave pseudopotential and related approaches, it is common to study molecules or clusters in an artificial periodic superlattice arrangement.$^{50}$ In such a case, a single k-point (usually $\mathbf{k}_0 = \Gamma$) sampling of the Brillouin zone suffices for conventional quantities such as energies, forces, and charge densities, since the errors in these quantities will be exponentially small as long as the overlap between wave functions in neighboring supercells is negligible. However, under the same conditions, the calculation of $\Omega$ using our approach introduces small errors that nevertheless scale only as $L^{-2}$, where $L$ is the supercell dimension (see, e.g., Sec. VI C). The problem essentially arises from the use of the simplest finite-difference representation of $\nabla_{\mathbf{k}}$, involving only nearest-neighbor k points (see Appendix B). If higher accuracy is needed, this problem can be overcome in either of two ways: (i) by using the solution at $\mathbf{k}_0$ to construct solutions on a denser mesh of k points, $u_{\mathbf{k}}(\mathbf{r}) = u_{\mathbf{k}_0}(\mathbf{r})\exp[i(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}]$, being sure to take the discontinuity of $(\mathbf{k}_0 - \mathbf{k})\cdot\mathbf{r}$ near the supercell boundary where $u_{\mathbf{k}_0}(\mathbf{r})$ is negligible; or (ii), construct periodic functions $\tilde{x}(\mathbf{r})$, $\tilde{y}(\mathbf{r})$, $\tilde{z}(\mathbf{r})$ such that $\tilde{x} = x$, $\tilde{y} = y$, $\tilde{z} = z$ in the molecular region, with (possibly smoothed) discontinuities at the supercell boundaries, and then apply the theory of Appendix A to the matrices $X$, $Y$, $Z$ computed as $X_{mn} = \langle u_{m\mathbf{k}_0}|\tilde{x}|u_{n\mathbf{k}_0}\rangle$, etc. Approach (i) is a "quick fix" requiring very little reprogramming, while approach (ii) is preferable in principle.

It is also common practice to use single k-point sampling for supercell calculations on extended systems, provided that the supercell is sufficiently large in all three dimensions. In such cases, our procedure can again be applied, but it should be kept in mind that the convergence of $\Omega$ with supercell size should be expected to be slower than the convergence of total energies and forces. Moreover, the electronic polarization that would be computed from the sum of our Wannier centers is not guaranteed to be exactly identical to the one that would be computed from the Berry-phase formula,$^{15}$

$$\mathbf{P}_{\rm el}\cdot\mathbf{G} = -\frac{2e}{V}\,\mathrm{Im}\ln\det\langle u_{m\mathbf{k}_0}|e^{-i\mathbf{G}\cdot\mathbf{r}}|u_{n\mathbf{k}_0}\rangle, \tag{54}$$

used in recent molecular-dynamics simulations of infrared absorption spectra.$^{51}$ However, the two should be very close, and should become identical in the limit of large supercell size.

D. Steepest-descent minimization

1. Algorithm

In order to minimize the spread functional $\Omega$ by steepest descents, we make small updates to the unitary matrices, as in Eq. (37), choosing

$$dW^{(\mathbf{k})} = \epsilon\, G^{(\mathbf{k})}, \tag{55}$$

where $\epsilon$ is a positive infinitesimal. We then have, to first order in $\epsilon$,

$$d\Omega = \sum_{\mathbf{k}} \mathrm{tr}\,[G^{(\mathbf{k})}\, dW^{(\mathbf{k})}] = -\epsilon \sum_{\mathbf{k}} \|G^{(\mathbf{k})}\|^2, \tag{56}$$

where $\|A\|^2 = \sum_{mn}|A_{mn}|^2$ and we have made use of $G^\dagger = -G$. Thus, the use of Eq. (55) is guaranteed to make $d\Omega < 0$, i.e., to reduce $\Omega$.

In practice, we take a fixed finite step with $\epsilon = \alpha/4w$, where $w = \sum_{\mathbf{b}} w_b$, so that

$$\Delta W^{(\mathbf{k})} = \frac{\alpha}{w}\sum_{\mathbf{b}} w_b\left(\mathcal{A}[R^{(\mathbf{k},\mathbf{b})}] - \mathcal{S}[T^{(\mathbf{k},\mathbf{b})}]\right). \tag{57}$$

The wave functions are then updated according to the matrix $\exp[\Delta W^{(\mathbf{k})}]$, which is unitary because $\Delta W$ is anti-Hermitian.

The choice of prefactor above is designed so that in the single-band case, and for simple k meshes (e.g., simple cubic), the "highest-frequency mode" associated with phase rotations is just marginally stable with the choice $\alpha = 1$. That is, if one starts with the true solution and rotates the phases of the wave functions on all k points simultaneously by an angle $\pm\gamma$, with the opposite sense of rotation on nearest-neighbor k points, then from Eq. (47) $\Delta q^{(\mathbf{k},\mathbf{b})} = \pm 2\gamma$ on every link, and the above choice of $\Delta W$ exactly returns the system to the solution if $\alpha = 1/2$, and is marginally unstable at $\alpha = 1$. We find that $\alpha = 1$ is still a safe choice for all the

systems studied; more efficient strategies become useful when dealing with large systems, or very fine k-point meshes. In those cases, it is advantageous to choose at each step the optimal $\alpha$ in a line minimization (usually with a parabolic interpolation, using the functional at $\alpha = 0$, 1, and its derivative at $\alpha = 0$) or to introduce a conjugate-gradient approach in composing subsequent descent directions.

It should be stressed that the evolution towards the minimum requires only the relatively inexpensive updating of the unitary matrices, and not of the wave functions, as follows. We choose a reference set of Bloch orbitals $|u^{(0)}_{n\mathbf{k}}\rangle$ and compute once and for all the inner-product matrices

$$M^{(0)(\mathbf{k},\mathbf{b})}_{mn} = \langle u^{(0)}_{m\mathbf{k}}|u^{(0)}_{n,\mathbf{k}+\mathbf{b}}\rangle. \tag{58}$$

We then represent the $|u_{n\mathbf{k}}\rangle$ (and thus, indirectly, the Wannier functions) in terms of the $|u^{(0)}_{n\mathbf{k}}\rangle$ and a set of unitary matrices $U^{(\mathbf{k})}_{mn}$,

$$|u_{n\mathbf{k}}\rangle = \sum_m U^{(\mathbf{k})}_{mn}\,|u^{(0)}_{m\mathbf{k}}\rangle. \tag{59}$$

We begin with all the $U^{(\mathbf{k})}_{mn}$ initialized to $\delta_{mn}$. Then, each step of the steepest-descent procedure involves calculating $\Delta W$ from Eq. (57), updating the unitary matrices according to

$$U^{(\mathbf{k})} \rightarrow U^{(\mathbf{k})}\exp[\Delta W^{(\mathbf{k})}], \tag{60}$$

and then computing a new set of $M$ matrices according to

$$M^{(\mathbf{k},\mathbf{b})} = U^{(\mathbf{k})\dagger}\, M^{(0)(\mathbf{k},\mathbf{b})}\, U^{(\mathbf{k}+\mathbf{b})}. \tag{61}$$

The cycle is then repeated until convergence is obtained.

a starting point for the steepest-descent procedure. In practice, we find that this starting guess is usually quite good, as will be shown for the cases of Si and GaAs in Sec. VI.

2. False local minima

We have also carried out tests initializing the minimization procedure with more arbitrary starting guesses. For example, we have let the starting $u^{(0)}_{n\mathbf{k}}$ consist of energy-ordered Hamiltonian eigenstates with quasi-random phases, as in the typical output of a band-structure code. We have also tried superimposing a completely random phase rotation to each $u^{(0)}_{n\mathbf{k}}$ individually, or a random $J\times J$ unitary rotation to the set of $u^{(0)}_{n\mathbf{k}}$ at each and every $\mathbf{k}$. With such starting guesses, we find that the minimization procedure can occasionally get trapped in a local minimum. That is, we find that the spread functional $\Omega$, viewed as a function of the set of $U^{(\mathbf{k})}_{mn}$, does have false local minima that must be avoided.

We find that this problem is not associated with the presence of a large number of bands, but instead with the use of fine k-point meshes. In fact, rather counter-intuitively, we have experienced it so far only when treating isolated bands. The Wannier functions associated with the false local minima are found to display erratic and unphysical oscillations.

The problem appears to lie in the possibility of making inconsistent choices in the branch cuts when evaluating the logarithms of complex argument in Eq. (47). In a naive implementation, the branch cuts are simply chosen so that $|q^{(\mathbf{k},\mathbf{b})}_n| \le \pi$. At a good global minimum, all of the $|q^{(\mathbf{k},\mathbf{b})}_n| \ll \pi$, while at a false local minimum some of the $|q^{(\mathbf{k},\mathbf{b})}_n|$ approach $\pi$.
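One cycle of the update in Eqs. (60) and (61) can be sketched as below. The matrix exponential is formed by diagonalizing the Hermitian matrix $i\,\Delta W$, which is one way of "transforming to a diagonal representation of $\Delta W$ and back again." The array layout, the `neighbor` index table, and the random inputs are hypothetical choices made for this illustration; in an actual calculation $\Delta W$ would come from Eq. (57), which is not evaluated here.

```python
import numpy as np

def exp_antihermitian(dW):
    """exp(dW) for anti-Hermitian dW via the eigendecomposition of i*dW."""
    h, V = np.linalg.eigh(1j * dW)        # i*dW is Hermitian: real h, unitary V
    # dW = V diag(-i h) V^dagger, hence exp(dW) = V diag(exp(-i h)) V^dagger
    return (V * np.exp(-1j * h)) @ V.conj().T

def steepest_descent_update(U, M0, dW, neighbor):
    """Apply Eq. (60) to every U(k), then rebuild every M(k,b) from Eq. (61)."""
    U_new = np.array([U[k] @ exp_antihermitian(dW[k]) for k in range(len(U))])
    M = np.empty_like(M0)
    for k in range(M0.shape[0]):
        for ib in range(M0.shape[1]):
            # neighbor[k, ib] is the index of the mesh point k + b
            M[k, ib] = U_new[k].conj().T @ M0[k, ib] @ U_new[neighbor[k, ib]]
    return U_new, M

# Hypothetical toy data: a ring of 4 k points, 2 neighbors each, 3 bands.
rng = np.random.default_rng(1)
Nk, Nb, J = 4, 2, 3
neighbor = np.array([[(k + 1) % Nk, (k - 1) % Nk] for k in range(Nk)])
M0 = rng.normal(size=(Nk, Nb, J, J)) + 1j * rng.normal(size=(Nk, Nb, J, J))
U = np.array([np.eye(J, dtype=complex) for _ in range(Nk)])   # U_mn = delta_mn
X = rng.normal(size=(Nk, J, J)) + 1j * rng.normal(size=(Nk, J, J))
dW = 0.05 * (X - np.conj(np.swapaxes(X, 1, 2)))               # anti-Hermitian stand-in

U, M = steepest_descent_update(U, M0, dW, neighbor)
```

Only the small $J\times J$ matrices are touched in each cycle; the reference overlaps `M0` are computed once and never recomputed, which is why the evolution is inexpensive.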
Note that the exponential in Eq. (60) is a matrix operation, which we perform by transforming to a diagonal representation of $\Delta W$ and back again.

Typically, we prepare a set of reference Bloch orbitals $|u^{(0)}_{n\mathbf{k}}\rangle$ by projecting from a set of initial trial orbitals $g_n(\mathbf{r})$ corresponding to some rough initial guess at the Wannier functions. For example, for these $g_n(\mathbf{r})$ we have used Gaussian functions centered at or near midbond positions. The initialization procedure involves first projecting onto Bloch states of the set of bands at wave vector $\mathbf{k}$,

$$|\phi_{n\mathbf{k}}\rangle = \sum_m |\psi_{m\mathbf{k}}\rangle\langle\psi_{m\mathbf{k}}|g_n\rangle. \tag{62}$$

Since these are not orthonormal, we then perform a symmetric orthonormalization to form a set of

$$|\tilde{\phi}_{n\mathbf{k}}\rangle = \sum_m (S^{-1/2})_{mn}\,|\phi_{m\mathbf{k}}\rangle \tag{63}$$

(where $S_{mn} = \langle\phi_{m\mathbf{k}}|\phi_{n\mathbf{k}}\rangle$), and finally convert to cell-periodic functions via

On the other hand, we have never observed the system to become trapped in a false local minimum when starting from reasonable trial projection functions, Eqs. (62)-(64). We also find that at the true global minimum the Wannier functions always turn out to be real, apart from a trivial overall phase; while at the false local minima, they are typically complex, only being real if the initial conditions described in Sec. V B have been used.

In summary, while false local minima can occur in our minimization scheme, they do not seem to pose any foreseeable problem in actual calculations.

V. PROPERTIES OF OPTIMALLY LOCALIZED WANNIER FUNCTIONS

A. Asymptotic localization properties

Following from the early work of Kohn,$^{2}$ it is generally expected that Wannier functions can be chosen to have exponential localization. While it is not the purpose of the present work to study questions of exponential decay in the tails of the Wannier functions, we nevertheless give a brief discussion of these issues here.
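The projection and symmetric orthonormalization of Eqs. (62) and (63) reduce, at each $\mathbf{k}$, to a small matrix operation on the projection amplitudes $A_{mn} = \langle\psi_{m\mathbf{k}}|g_n\rangle$. The sketch below assumes these amplitudes are already available as a square matrix; the name `lowdin_rotation` and the random test matrix are illustrative only and are not taken from the paper.

```python
import numpy as np

def lowdin_rotation(A):
    """Return A S^{-1/2}, with S = A^dagger A the overlap matrix of Eq. (63).

    A[m, n] = <psi_mk | g_n> holds the projections of trial orbital n onto
    band m at one k point, so that |phi_n> = sum_m |psi_m> A[m, n] as in Eq. (62).
    """
    S = A.conj().T @ A                          # S_mn = <phi_m | phi_n>
    s, X = np.linalg.eigh(S)                    # S is Hermitian and positive definite
    S_inv_sqrt = (X / np.sqrt(s)) @ X.conj().T  # X diag(s^{-1/2}) X^dagger
    return A @ S_inv_sqrt

# Hypothetical projection amplitudes for J = 4 bands and 4 trial orbitals.
rng = np.random.default_rng(2)
J = 4
A = rng.normal(size=(J, J)) + 1j * rng.normal(size=(J, J))
U0 = lowdin_rotation(A)
print(np.allclose(U0.conj().T @ U0, np.eye(J)))
```

The resulting matrix is unitary, so the rotated states stay within the chosen set of bands while resembling the trial orbitals as closely as possible. The construction fails if $S$ is singular, that is, if some trial orbital has no weight on the bands at that $\mathbf{k}$.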
$$u^{(0)}_{n\mathbf{k}}(\mathbf{r}) = e^{-i\mathbf{k}\cdot\mathbf{r}}\,\tilde{\phi}_{n\mathbf{k}}(\mathbf{r}). \tag{64}$$

(In practice, the above steps are combined.) This procedure is similar in principle to the one mentioned by Teichler$^{33}$ (following Ref. 54), or Satpathy and Pawlowska,$^{35}$ although it differs from the latter in that we do the orthonormalization in k space. We then use this set of reference Bloch orbitals as

Kohn$^{2}$ proved the existence of exponentially localized Wannier functions for the case of an isolated band in 1D, for a crystal with inversion symmetry. However, the method does not easily generalize. Blount demonstrated the analyticity of the Bloch functions for the single-band case in 3D,$^{3}$ and claimed (end of Sec. 5 of Ref. 3) that this would imply the exponential localization of the Wannier functions (see

also Ref. 49); but this claim was later shown to be faulty by Nenciu (footnote on first page of Ref. 52), who pointed out the global topological aspects of the problem. Des Cloizeaux proved the exponential localization of the band projection operator $P$ of Eq. (12) for an arbitrary set of composite bands in 3D.$^{53}$ Unfortunately, this does not immediately imply that the Wannier functions are exponentially localized (although the converse would follow). In a following paper, des Cloizeaux was able to prove the possibility of choosing exponentially localized Wannier functions for an isolated band (i) in 1D generally, or (ii) in the centrosymmetric 3D case.$^{54}$ The summary (Sec. V) of Ref. 54 gives a good discussion of the difficulties and partial progress towards a solution of the general composite-band problem. More recently, Nenciu completed a proof for the case of an isolated band in 3D without centrosymmetry.$^{52}$ To our knowledge, however, the problem remains unsolved for the general case of composite bands in 3D. Finally, note that some discussion of the exponential localization of the "generalized Wannier functions" defined for the cases of surfaces and defects has been given in Refs. 27 and 55-57.

It is natural to speculate that the "optimally localized" Wannier functions that are obtained by minimizing the spread functional of Eq. (11) are exponentially localized. Actually, one should distinguish between a "weak conjecture" that the optimally localized Wannier functions have exponential decay, and a "strong conjecture" that they have the same exponential decay as that of the band projection operator $P$. At the present time, we can only speculate that in 3D, the weak conjecture, at least, will hold.

In 1D, we are on firmer footing. As shown in Secs. III and IV C 1, the functions that are obtained by minimizing Eq. (11) correspond, in principle, with those considered by previous authors, and for which exponential localization has been demonstrated.$^{2,3,39,40}$ In particular, we have shown in Sec. III that these will be eigenfunctions of the band-projected position operator $PxP$; Niu has given a simple and elegant argument, based on this fact alone, from which one may conclude that the Wannier functions decay faster than any power.$^{40}$ From this point of view, the essential difficulty in 3D is that the Wannier functions can no longer generally be chosen to be eigenfunctions of all three band-projected position operators simultaneously.

Returning to the general three-dimensional case, we find that it is not easy to carry out numerical tests of exponential localization using the present method, which is based on discretization in k space. The Wannier functions that we obtain are thus not truly localized, being instead artificially periodic with a periodicity inversely proportional to the mesh spacing.

B. Conjecture: optimally localized Wannier functions are real

It seems not to be widely appreciated that the Wannier functions $w_n(\mathbf{r})$ can always be chosen real. This depends only on the Hamiltonian $H = p^2/2m + V(\mathbf{r})$ being Hermitian, and not on any symmetry of the (real) potential $V(\mathbf{r})$. Indeed, from Eq. (1) it is clear that one only needs to choose

$$u_{n\mathbf{k}}(\mathbf{r}) = u^{*}_{n,-\mathbf{k}}(\mathbf{r}) \tag{65}$$

to ensure that the Wannier functions $w_n(\mathbf{r})$ are real. This condition is automatically satisfied if one starts with initial Wannier functions projected from real trial functions, as discussed in Sec. IV D; alternatively, it can be imposed by hand. From Eq. (25), condition (65) implies that $M^{(\mathbf{k},\mathbf{b})}_{mn}$ is equal to $M^{(-\mathbf{k},-\mathbf{b})*}_{mn}$, which in turn implies that $G^{(\mathbf{k})}_{mn}$ is equal to $G^{(-\mathbf{k})*}_{mn}$, so that Eq. (65) continues to be satisfied during the steepest-descent update procedure. In this way, one will eventually arrive at a set of maximally localized real Wannier functions. (Similarly, working in real space, it is easy to see from Appendix A that a real initial guess will result in a set of real optimally localized solutions.)

We conjecture that a stronger result is true: namely, that the optimally localized Wannier functions are always real (apart from a trivial overall phase of each Wannier function). We have not found a proof of this conjecture, but it is supported by our empirical experience. More precisely, in the tests to be reported in Sec. VI, we find that whenever we arrive at the global minimum, the Wannier functions always turn out to be real, apart from a trivial overall phase. [However, we do find that the Wannier functions are typically complex at false local minima, as discussed in Sec. IV D 2, and also that imposing the initial condition (65) does not eliminate false local minima, even if in this latter case the local minima are necessarily real.]

VI. RESULTS

A. Si

For Si, the four occupied valence bands have to be taken together as a single composite group, because of degeneracies between the bottom two bands at X, and between the top three bands at $\Gamma$. Thus, we take $J = 4$ and look for a set of four Wannier functions per primitive unit cell. These are expected to be centered on the bond centers, and to have roughly the character of $\sigma$-bond orbitals, i.e., even linear combinations of the two $sp^3$ hybrids projecting toward the bond center from the two neighboring atoms.$^{4}$ Wannier functions of this type have been computed previously by a variety of methods.$^{35,58-61,31}$ It is tempting to imagine that the requirement of spanning the given set of valence bands, together with the symmetry requirement that each Wannier function has the expected inversion, mirror, and threefold rotational symmetries about its corresponding bond center, might be enough to uniquely determine the Wannier functions. We emphasize that this is not the case, and we proceed to determine the particular set of Wannier functions that minimize the spread functional $\Omega$.

Our calculations are carried out within the local-density approximation to Kohn-Sham density-functional theory,$^{62}$ using a standard plane-wave pseudopotential approach and an all-bands conjugate-gradient minimization.$^{63}$ We have used norm-conserving pseudopotentials$^{64}$ in the Kleinman-Bylander representation, with plane-wave cutoffs ranging from 200 eV to 650 eV, depending on the systems studied. The sampling of the Brillouin zone is performed with equispaced Monkhorst-Pack grids$^{65}$ that have been offset in order to include $\Gamma$. Since the crystal is fcc in real space, the grid is bcc in k space, and we use the simplest possible finite-difference representation of $\nabla_{\mathbf{k}}$ using only the $Z = 8$ nearest neighbors of each k point (see Appendix B). The computed

TABLE I. Minimized localization functional $\Omega$ in Si, and its decomposition into invariant, off-diagonal, and diagonal parts, for different k-point meshes (see text). Units are Å$^2$.

k set      $\Omega$    $\Omega_{\rm I}$    $\Omega_{\rm OD}$    $\Omega_{\rm D}$
1×1×1      2.024       1.999               0.025                0
2×2×2      4.108       3.707               0.401                0
4×4×4      6.447       5.870               0.577                0
6×6×6      7.611       7.048               0.563                0
8×8×8      8.192       7.671               0.520                0
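As a quick consistency check on Table I, the three contributions should add up to the total spread, $\Omega = \Omega_{\rm I} + \Omega_{\rm OD} + \Omega_{\rm D}$, to within the rounding of the tabulated digits. The short script below transcribes the table and evaluates the residual and the share of the invariant part for each mesh; the variable names are chosen here for illustration.

```python
# Rows of Table I: mesh -> (Omega, Omega_I, Omega_OD, Omega_D), in square angstroms.
table_I = {
    "1x1x1": (2.024, 1.999, 0.025, 0.0),
    "2x2x2": (4.108, 3.707, 0.401, 0.0),
    "4x4x4": (6.447, 5.870, 0.577, 0.0),
    "6x6x6": (7.611, 7.048, 0.563, 0.0),
    "8x8x8": (8.192, 7.671, 0.520, 0.0),
}

residuals = {}
invariant_share = {}
for mesh, (omega, omega_i, omega_od, omega_d) in table_I.items():
    # Difference between the tabulated total and the sum of its parts.
    residuals[mesh] = omega - (omega_i + omega_od + omega_d)
    # Fraction of the total spread carried by the gauge-invariant part.
    invariant_share[mesh] = omega_i / omega

for mesh in table_I:
    print(f"{mesh}: residual = {residuals[mesh]:+.3f}, Omega_I/Omega = {invariant_share[mesh]:.3f}")
```

The residuals stay at the level of the last tabulated digit, and the invariant part accounts for the bulk of the spread on every mesh.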

Bloch functions are stored to disk, and the construction of the Wannier functions is carried out as a separate, post-processing operation.

Table I shows the convergence of the spread functional and its various contributions as a function of the density of the k-point mesh used. We confirm that $\Omega_{\rm D}$ does vanish (to machine precision) as expected from the presence of inversion symmetry, as discussed in Sec. IV C 3. Since $\Omega_{\rm I}$ is invariant, the minimization of $\Omega$ reduces to the minimization of $\Omega_{\rm OD}$. For each k-point set, the minimization was initialized by starting with trial Gaussians of width (standard deviation) 1 Å located at the bond centers. We find that for the case of crystalline Si, these provide an excellent starting guess; for the $8\times8\times8$ case, for example, we find an initial $\Omega_{\rm D} = 0$ and $\Omega_{\rm OD} = 0.565$, whereas at the minimum $\Omega_{\rm OD}$ is 0.520. Had we started with the random phases provided by the ab initio code, we would have obtained an initial $\Omega_{\rm D} = 622.1$ and $\Omega_{\rm OD} = 42.3$. We find that typically 20 iterations are needed to converge to the minimum with good accuracy, starting with the initial choice of phases given by the Gaussians, and using a simple fixed-step steepest-descent procedure. Starting with a set of randomized phases requires roughly one order of magnitude more iterations. As previously pointed out, the evolution does not require additional scalar products between Bloch orbitals, and so it is in any case pretty fast. Because of symmetry, the Wannier centers do not move during the minimization procedure, and the spreads of the four Wannier functions remain identical with each other.

What is perhaps most striking about Table I is that $\Omega_{\rm I} \gg \Omega_{\rm OD}$; and while $\Omega$ converges fairly slowly with k-point density, this poor convergence is almost entirely due to the $\Omega_{\rm I}$ contribution. Incidentally, since the $\Omega_{\rm I}$ contribution is gauge invariant, it can be calculated once and for all at the starting configuration, for any given k-point set; the quantities that are actually minimized are $\Omega_{\rm D}$ and $\Omega_{\rm OD}$. The former vanishes at the minimum, and the latter is found to converge quite rapidly with k-point sampling. It would be interesting to explore whether use of a higher-order finite-difference representation of $\nabla_{\mathbf{k}}$ might improve this convergence, especially that of $\Omega_{\rm I}$, but we have not investigated this possibility.

In Fig. 1, we present plots showing one of these maximally localized Wannier functions in Si, for the $8\times8\times8$ k-point sampling. The other three are identical (related to the first by the tetrahedral symmetry operations) and are located on the other three tetrahedral bonds. Each displays inversion symmetry about its own bond center, and it is real apart from an overall complex phase. Again, all these properties are not trivial, and would not be satisfied by a generic choice of phases. (Our initial guess based on Gaussians centered in the middle of the bonds does ensure all these properties, but without optimizing the localization.)

FIG. 1. Maximally localized Wannier function in Si, for the $8\times8\times8$ k-point sampling. (a) Profile along the Si-Si bond. (b) Contour plot in the (110) plane of the bond chains. The other Wannier functions lie on the other three tetrahedral bonds and are related by tetrahedral symmetries to the one shown.

From an inspection of the contour plot it becomes readily apparent that the Wannier functions are essentially confined to the first unit cell, with very small (and decreasing) components in further-neighbor shells. The general shape corresponds to a chemically intuitive view of $sp^3$ hybrids overlapping along the Si-Si bond to form a $\sigma$ bond orbital, with the smaller lobes of negative amplitude clearly visible in the back-bond regions. These results clearly illustrate how the Wannier functions can provide useful intuitive understanding about the formation of chemical bonds.

B. GaAs

In GaAs the lower valence band is never degenerate with the other (top) three valence bands, and thus several possibilities arise: (a) We can treat the four bands as a group, as was done for silicon, obtaining solutions that are very similar to the Si case, except for the loss of inversion symmetry

TABLE II. Minimized localization functional $\Omega$ in GaAs, and its decomposition into invariant, off-diagonal, and diagonal parts, for different k-point meshes, together with the relative position $\beta$ of the centers along the Ga-As bond (see text). Units for the $\Omega$'s are Å$^2$.

k set        $\Omega$    $\Omega_{\rm I}$    $\Omega_{\rm OD}$    $\Omega_{\rm D}$    $\beta$
1×1×1        2.217       2.088               0.125                0.0035              0.593
2×2×2        4.409       3.898               0.503                0.0078              0.602
4×4×4        6.785       6.170               0.610                0.0055              0.613
6×6×6        7.982       7.386               0.590                0.0058              0.616
8×8×8        8.599       8.038               0.555                0.0059              0.617
12×12×12     9.146       8.635               0.504                0.0061              0.617

about the bond centers. (b) We can deal separately with the bottom band and the top three bands; the latter would be considered as a group, while the former is a single isolated band. The solution at the minimum should resemble atomic orbitals for the more electronegative species (the As anion), in the form of three $p$ orbitals and one $s$ orbital, respectively. (c) Finally, it might be interesting to consider the case in which the four bands are treated together, but using the solution of the $\Omega$ minimization for the one-band and three-band cases, without proceeding further with the minimization. This does not correspond to a true minimum for the four-band $\Omega$ surface, but just to a stationary (saddle) point.

Starting with the case in which all the four bands are treated as a group, we show in Table II the convergence of the spread functional and its various contributions as a function of the density of the k-point sampling. In analogy with the case of Si, the procedure is initialized using trial Gaussians of width 1 Å, centered in the middle of the bonds; this is again a very good starting guess, and (for the $8\times8\times8$ mesh) gives an initial $\Omega_{\rm D} = 0.1164$ and $\Omega_{\rm OD} = 0.593$, that are reduced to 0.0059 and 0.555, respectively, by the minimization procedure. As was the case for Si, k-point convergence is fairly slow, even though most of it is due to the slow convergence of the invariant part. On the other hand, the general shape of the Wannier functions at the minimum is already given rather accurately with coarser samplings (although the tails are then not so easy to characterize, since in practice the Wannier functions are periodically repeated in a supercell conjugate to the k-point mesh). In particular, the k-point convergence of the Wannier centers is quite rapid, as is evident from the last column of Table II, where we show the relative position of the centers along the Ga-As bonds. Here $\beta$ is the distance between the Ga atom and the Wannier center, given as a fraction of the bond length (in Si the centers were fixed by symmetry to be in the middle of the bond, $\beta = 0.5$, irrespective of the sampling).

In Fig. 2, we present plots showing one of these maximally localized Wannier functions in GaAs, for the $8\times8\times8$ k-point sampling. Again, at the minimum $\Omega$, all four Wannier functions have become identical (under the symmetry operations of the tetrahedral group), and they are real, except for an overall complex phase. The shape of the Wannier functions is again that of $sp^3$ hybrids combining to form $\sigma$-bond orbitals; inversion symmetry is now lost, but the overall shape is otherwise closely similar to what was found in Si. The Wannier centers are still found along the bonds, but they have moved towards the As, at a position that is 0.617 times the Ga-As bond distance. It should be noted that these Wannier functions are also very similar to the localized orbitals that are found in linear-scaling approaches,$^{61}$ where orthonormality, although not imposed, becomes exactly enforced in the limit of an increasingly large localization region. This example highlights the connections between the two approaches. The characterization of the maximally localized Wannier functions indicates the typical localization of the orbitals that can be expected in the linear-scaling approach. Moreover, such information ought to be extremely valuable in constructing an intelligent initial guess at the solution of the electronic structure problem in the case of complex or disordered systems.

FIG. 2. Maximally localized Wannier function in GaAs, for the $8\times8\times8$ k-point sampling. (a) Profile along the Ga-As bond. (b) Contour plot in the (110) plane of the bond chains. The other Wannier functions lie on the other three tetrahedral bonds and are related by tetrahedral symmetries to the one shown.

As pointed out before, in GaAs we can have different choices for the Hilbert spaces that can be considered, so we also studied the case in which only the bottom band, or the top three, are used as an input for the minimization procedure. Table III shows the spread functional and its various contributions for these different choices, where the bottom band is first treated as isolated; next the three $p$ bands are treated as a separate group; then these two solutions are used

to construct a four-band solution, without further minimization; and finally, this is compared with the full four-band minimization. In composing the results for the one-band and three-band cases, we take the $1 \times 1$ and $3 \times 3$ unitary matrices that would give the minimum solution for the one- and three-band cases, and build from them a set of $4 \times 4$ block-diagonal unitary matrices. The four-band $\Omega$ that is obtained is exactly the sum of the two initial $\Omega$'s. Nevertheless, the bookkeeping changes: $\Omega_{\rm I}$ is reduced, with an equal and opposite contribution reappearing in $\Omega_{\rm OD}$. (The $\Omega_{\rm D}$'s sum up exactly, as they must.) If we then minimize this (saddle-point) solution, we recover the four-band minimum: the invariant part (obviously) does not change, $\Omega_{\rm D}$ slightly decreases, with a larger reduction in $\Omega_{\rm OD}$, in correspondence to an increased interband mixing.

TABLE III. Localization functional $\Omega$ and its decomposition into invariant, off-diagonal, and diagonal parts, for the case of GaAs (units are Å$^2$). The bottom valence band, the top three valence bands, and all four bands are separately included in the minimization. The star (*) refers to the case in which the minimization is not actually performed, and the solution for the one-band and three-band cases is used. Sampling is performed with an $8 \times 8 \times 8$ mesh of k points.

k set           $\Omega$    $\Omega_{\rm I}$    $\Omega_{\rm OD}$    $\Omega_{\rm D}$
one band        1.968       1.944               0                    0.0238
three bands     10.428      9.844               0.560                0.0245
four bands *    12.396      8.038               4.309                0.0483
four bands      8.599       8.038               0.555                0.0059

In Fig. 3, we show the contour plot for the maximally localized one-band Wannier function in GaAs, for the $8 \times 8 \times 8$ k-point sampling. The function is again real, and it shows the typical characteristics of an s orbital centered around the anion; the tetrahedral symmetry of the lattice deforms the spherical orbital, introducing contributions that point along the two bond chains [one in the (110) plane plotted, and one perpendicular to that plane]. In the three-band case, on the other hand, the Wannier functions resemble three orthogonal atomic p orbitals. It should be stressed that only when all the four bands are treated simultaneously do we achieve the overall maximum localization. This reinforces the picture in which the maximally localized orbitals correspond to the most natural "chemical bonds" in the system.

FIG. 3. Contour plot, in the (110) plane, of the maximally localized Wannier function in GaAs for the $8 \times 8 \times 8$ k-point sampling when only the bottom valence band is considered.

C. Molecular C$_2$H$_4$

We have also studied the case of the ethylene molecule (C$_2$H$_4$), in order to make the connection with some standard chemistry concepts, and to highlight the relation of our formalism (derived from a k-space representation of extended Bloch orbitals) to the case of an isolated system as discussed in Sec. IV C 4. First of all, the molecule is modeled in periodic boundary conditions, in a supercell that is large enough to make the interaction with the periodic images negligible. Consequently, the band dispersion also becomes negligible, and $\Gamma$ sampling is all that is needed for total energies, forces, and densities. However, the spread functional is expected to converge slightly more slowly with k-point sampling, as discussed in Sec. IV C 4. We thus tested several k-point meshes. For the single k-point case, the mesh in reciprocal space is that formed by the $\Gamma$ point and all its periodic images, i.e., the reciprocal lattice vectors; our formalism remains equally applicable to such a case. One should bear in mind that if the supercell is not cubic, appropriate weight factors have to be added in the calculation of the derivatives (see Appendix B).

TABLE IV. Coordinates (in Å) of the atoms and of the six Wannier centers in the ethylene molecule.

Species    x         y         z
H          -1.235    0.936     0.000
H          1.235     -0.936    0.000
H          1.235     0.936     0.000
H          -1.235    -0.936    0.000
C          0.660     0.000     0.000
C          -0.660    0.000     0.000

WF         $\bar{r}_x$    $\bar{r}_y$    $\bar{r}_z$
1          -1.049         0.622          0.000
2          1.049          -0.622         0.000
3          1.049          0.622          0.000
4          -1.049         -0.622         0.000
5          0.000          0.000          0.327
6          0.000          0.000          -0.327

We show in Table IV the coordinates for the C and H atoms at the structural minimum, together with the Wannier centers. In this molecule, there are six occupied valence eigenstates, the lowest five being of C-H or C-C $\sigma$-bonding character, and the top (frontier) orbital being of C-C $\pi$-bonding character. If we treat the lowest five bands as a composite group, we find as expected that the minimization of $\Omega$ leads to $\sigma$-bond orbitals located on each of the C-H or C-C bonds. However, treating all six bands together, we find that the C-C $\pi$-bonding orbital mixes strongly with the C-C $\sigma$-bonding orbital to give two Wannier functions that are symmetrically disposed above and below the x-y plane. Contour plots for the resulting C-H and

C=C Wannier functions are shown in Fig. 4, and the locations of the Wannier centers are reported in Table IV. The picture that emerges from this "natural" symmetry breaking of the planar geometry is just the Lewis picture of the C=C double bond.

FIG. 4. Contour plots for the maximally localized Wannier functions in ethylene, C$_2$H$_4$. (a) One of the four C-H Wannier functions, shown in the x-y plane. (b) One of the two C=C Wannier functions, shown in the x-z plane.

In our calculations we have used a cubic supercell of side 7 Å; this gives to each band a dispersion that is always smaller than 0.02 eV, and that originates from the interaction with the superperiodic images. Increasing the k-point sampling has negligible effects on the equilibrium positions of the C and H atoms and on the location of the Wannier centers. But it does still affect the localization functional, which displays a slower convergence with respect to the number of k points used (although much faster than was the case for Si or GaAs). The results are summarized in Table V, where we show the $\Omega$ contributions for the maximally localized Wannier functions with increasing k-point sampling. It is readily seen that the slow convergence is coming mostly from the invariant part of the functional; a finer k-point mesh provides both a more detailed sampling of the Brillouin zone and a more accurate calculation of the gradients.

TABLE V. The functional $\Omega$ and its decomposition, with increasing k-point sampling, for ethylene (units are Å$^2$).

k set                    $\Omega$    $\Omega_{\rm I}$    $\Omega_{\rm OD}$    $\Omega_{\rm D}$
$1 \times 1 \times 1$    4.041       3.657               0.384                0
$2 \times 2 \times 2$    4.503       4.124               0.380                $6 \times 10^{-7}$
$3 \times 3 \times 3$    4.600       4.222               0.377                $3 \times 10^{-7}$

D. LiCl

It is also interesting to look at a more ionic system, to understand the effect of electronegativity and band gap on the location and localization of the Wannier functions. We have studied rocksalt LiCl, treating all four valence bands (roughly Cl 3s and 3p) as a unit, and again using an $8 \times 8 \times 8$ k-point sampling.

One could expect the Wannier functions to localize much more strongly around the anion than was the case for GaAs, and indeed this is what we find. However, we also find that the Wannier functions can reduce $\Omega$ further by mixing to form $sp^3$ hybrids, sitting on the vertices of a tetrahedron centered around the Cl atom, with each center at a distance of 0.449 Å from the Cl (the Li-Cl distance being 2.57 Å). We anticipated that these hybrids might prefer to align along the $\{111, \bar{1}\bar{1}1, \bar{1}1\bar{1}, 1\bar{1}\bar{1}\}$ or $\{11\bar{1}, 1\bar{1}1, \bar{1}11, \bar{1}\bar{1}\bar{1}\}$ sets of directions; if this were the case, the choice between the two sets (two degenerate global minima of $\Omega$) would constitute a kind of unphysical or "anomalous" symmetry breaking from cubic to tetrahedral. Instead, we find that $\Omega$ is, at least to our machine precision, rotationally invariant with respect to the orientation of the $sp^3$ hybrids, just as would be the case for an isolated Cl$^-$ ion in free space. This implies that the tetrahedron of the Wannier centers around each Cl atom is free to rotate without any discernible decrease of localization.

Finally, consistent with the idea that a larger gap is linked to a higher degree of localization, we find a total $\Omega = 4.159$ Å$^2$, with $\Omega_{\rm I} = 3.354$, $\Omega_{\rm OD} = 0.805$, and $\Omega_{\rm D} = 1.2 \times 10^{-5}$ Å$^2$.

VII. DISCUSSION

We have discussed a technique for obtaining a set of well-localized Wannier functions for a given band or composite set of bands in a crystalline solid. We have in mind several kinds of applications for this method.

First, we believe that this approach may help to obtain chemical intuition about the nature of chemical bonds in solids, and to characterize trends in bonding properties within classes of solids. As emphasized in the introduction, the Wannier functions defined here are the natural generalization of the concept of "localized molecular orbitals"21–26 to the case of solids. As illustrated in the examples of GaAs and ethylene (C$_2$H$_4$) above, the determination of the Wannier functions can give chemical intuition into the nature of the bond orbitals of the material, including the spontaneous symmetry breaking that occurs in the Lewis picture of a double or triple bond. We also suspect that it may be instructive to generate, characterize, and plot the Wannier functions across a series of compounds, e.g., for II–VI semiconductors as one varies from wide- to narrow-gap members, or in cubic per-

ovskites of varying composition. Moreover, as emphasized by Hierse and Stechel,10 the Wannier functions may be transferable to a considerable degree for similar bonds in different chemical systems (for example, for C-H or C-C bonds in a variety of hydrocarbons). It should be noted, however, that this is even more likely to be true for nonorthogonal Wannier-like functions,10 as opposed to the orthogonal ones studied here.

Second, it is possible that the Wannier functions may prove suitable as a basis for use in constructing theories of interacting or strongly correlated electron systems. For example, it might be possible to build good approximate correlated wave functions from sums of Slater determinants of the Wannier functions. For this purpose, one would clearly need to choose a set of bands that includes some low-lying unoccupied states of the one-particle mean-field Hamiltonian. Similarly, it might be possible to build accurate model Hamiltonians for magnetic systems, or for transport properties of metals. (Again, for metals it would appear necessary to choose a composite group of bands that brackets the Fermi level, and to specify the occupation as a kind of density matrix in the Wannier indices.)

Third, the present scheme might prove useful for predicting the suitability of linear-scaling methods for different kinds of insulating materials. Since the linear-scaling methods5 depend strongly on the localization properties of the Wannier functions (or, closely related, the density matrix), the present scheme might be a simple and useful way to characterize the degree of localization for a given target material. This information might then help predict whether the material is a good candidate for a linear-scaling method; and if so, what type of linear-scaling method is likely to work best, and what real-space cutoff parameter is likely to be required.

Finally, an important feature of the present approach is that it generates a list of the locations of the Wannier centers. This information alone can often be of crucial importance. In fact, we envisage a number of interesting applications in which one essentially throws away all other information about the Wannier functions, keeping only their locations. For example, the shift of the Wannier center away from the bond center might serve as a kind of measure of bond ionicity. Also, the vector sum of the Wannier centers immediately gives the bulk electronic polarization $\mathbf{P}$; all three Cartesian components of $\mathbf{P}$ can thus be determined simultaneously using a conventional k mesh, instead of constructing separate special k-point strings to compute each separate Cartesian component of $\mathbf{P}$ as is needed otherwise.15

But more importantly, the information on the locations of the Wannier functions may open the possibility of calculating properties that cannot otherwise be obtained, especially for distorted, defective, or disordered systems. For example, it becomes possible not only to calculate the Born (dynamical) effective charge $Z^*$, but also to decompose it into displacements of individual neighboring Wannier centers. To illustrate this idea, we have carried out a calculation on a cubic supercell of GaAs containing 64 atoms ($\Gamma$-only k-point sampling), in which all atoms are in their equilibrium positions except for one Ga atom that is displaced by 0.1 Å along the [111] direction. Observing the consequent displacement of the Wannier centers from their bulk crystalline positions, we find a total $Z^*_{\rm Ga}$ of 2.04, in good agreement with the established theoretical value of 1.99 as calculated by linear-response methods.66 Moreover, in arriving at the total electronic $Z^*_{\rm Ga,el} = -0.96$, we find contributions of $-1.91$, $+0.65$, and $+0.30$ from the groups of four first-neighbor, 12 second-neighbor, and remaining further-neighbor Wannier centers, respectively. It is interesting to note that inclusion of nearest-neighbor contributions alone would thus significantly overestimate the magnitude of $Z^*_{\rm Ga,el}$, and that the second-neighbor Wannier centers move in the opposite direction to the Ga atom motion. If we repeat the calculation displacing one As atom, we obtain a total $Z^*_{\rm As}$ of $-2.07$ (the acoustic sum rule67 is only approximately satisfied with a finite k-point sampling). The total electronic $Z^*_{\rm As,el} = -7.07$ has now contributions of $-1.74$, $-4.63$, and $-0.71$ from the groups of four first-neighbor, 12 second-neighbor, and remaining further-neighbor Wannier centers, respectively.

In fact, the pattern of displacements of the Wannier centers can be regarded as defining a kind of coarse-grained representation of the polarization field, $\mathbf{P}(\mathbf{r})$. To illustrate this idea more directly, we have carried out a calculation for bulk GaAs in which a long-wavelength transverse optical (TO) phonon has been frozen in. We take the wave vector $\mathbf{q} = (\pi/4a)(\hat{x} + \hat{y})$ ($a$ is the lattice parameter) and relative displacements $\xi(\mathbf{r}) = \xi_0 \sin(\mathbf{q} \cdot \mathbf{r})\,\hat{z}$ in a 16-atom supercell, composed of eight unit cells repeated in the (110) direction. We assign a displacement amplitude $\xi_0 = 0.01a$ to the Ga sublattice, and $-\xi_0 M_{\rm Ga}/M_{\rm As}$ to the As sublattice ($M_{\rm Ga}$ and $M_{\rm As}$ are the masses of the two species; the center of mass does not move). Observing the resulting displacements of the Wannier centers, we can obtain a picture of how the local polarization changes from cell to cell (say, by summing all the four Wannier centers surrounding one As atom); fitting these to the same form $P(\mathbf{r}) = P_0 \sin(\mathbf{q} \cdot \mathbf{r})\,\hat{z}$, we obtain $P_0 = 0.249$, and, via the acoustic sum rule ($Z^*_{\rm Ga,el} + Z^*_{\rm As,el} = -8$), we get $Z^*_{\rm Ga,el} = -1.52$ and $Z^*_{\rm As,el} = -6.48$. These results are only in fair agreement with the bulk values; the discrepancies might be due to the finite size of our supercell, or to not having used the proper eigenvector for the phonon mode considered. However, the main point of this demonstration is that, given the calculation on the supercell containing the frozen TO phonon, there is no other way that the transverse component of the polarization field could have been obtained. Since the mode is transverse, $\mathbf{P}(\mathbf{r})$ cannot be determined from the charge density; since $\mathbf{q} \neq 0$, the Berry-phase approach does not apply; and since the displacement is finite, the linear-response approach is not directly applicable. However, the present scheme allows a direct finite-difference calculation of the transverse polarization field, a quantity that was previously unavailable.

It would be interesting to apply this kind of analysis to supercell simulations of amorphous systems such as a-H$_2$O or a-GaAs. Once again, while only the longitudinal part of $\mathbf{P}(\mathbf{r})$ can be determined from the charge density, a similar determination of both the longitudinal and transverse components is possible with access to the displacements of the Wannier centers, thus leading to a more complete theory of the dielectric properties of such systems. This information might be used to assist the approach of Ref. 51, in which the infrared absorption spectrum of an amorphous system is ex-

tracted from a molecular-dynamics simulation. As a limited test, we have carried out calculations for a 64-atom supercell of crystalline Si with random displacements typical of $\sim 1000$ K, and find that the calculation of the displaced Wannier centers is straightforward.

Finally, we conclude by pointing out that our work opens numerous possibilities for further development and future study. On a practical level, it might be useful to explore the use of more accurate, higher-order finite-difference formulas for $\nabla_{\mathbf{k}}$ (see Appendix B) to see whether convergence with respect to k-point sampling can be improved. It might be interesting to apply our analysis within the semiempirical tight-binding context, although it should be noted that matrix elements of $x$, $y$, and $z$ (and, for $\Omega_{\rm I}$, also of $r^2$) would be needed, in addition to the Hamiltonian and overlap matrix elements. Going beyond the scope of the present work, it might be interesting to explore other localization criteria, e.g., the maximization of the Coulomb self-interaction of the Wannier functions. It would also be of great interest to develop a corresponding theory of maximally localized nonorthogonal Wannier-like functions. (While the direct connection to the polarization properties would be lost, there would be important implications for some linear-scaling algorithms.) Finally, there are many questions of a mathematical character that deserve further study. For example, is it possible to prove that our Wannier functions (those that minimize $\Omega$) have exponential decay, even in the general noncentrosymmetric multiband case? Are they always real, as conjectured in Sec. V B? And are there further results that can be derived regarding the interrelations between the metric tensor, the Berry connection, and the Berry curvature, as discussed in Appendix C? We hope that our work will stimulate some investigations of these questions.

ACKNOWLEDGMENTS

This work was supported by NSF Grants Nos. DMR-96-13648 and ASC-96-25885. We would like to thank R. Resta for calling our attention to Refs. 42–44; E. Stechel for pointing out the connection to Refs. 21–26; and W. Kohn, Q. Niu, and R. Resta for illuminating discussions.

APPENDIX A: MINIMIZATION OF SPREAD FUNCTIONAL IN REAL SPACE

In Sec. III above, the problem of finding the optimally localized Wannier functions for a periodic system was formulated directly in real space. In this Appendix, we briefly reformulate the problem for the case of a finite system (cluster, molecule, etc.), and sketch how the minimization of the functional can be performed in this case. This provides a complementary perspective to the k-space procedure discussed in the main text.

We change notation $|\mathbf{R}n\rangle \to |i\rangle$ and now refer to the $|i\rangle$ as "localized orbitals" rather than "Wannier functions," but their meaning is the same: they are a set of orthonormal orbitals spanning the Hamiltonian eigenstates in an energy range of interest (e.g., for the occupied valence states of a molecule or cluster).

Following the approach of Sec. III, we decompose $\Omega = \sum_i [\langle r^2 \rangle_i - \bar{\mathbf{r}}_i^2]$ into an invariant part $\Omega_{\rm I} = \sum_\alpha \mathrm{tr}[P r_\alpha Q r_\alpha]$ (where $P = \sum_i |i\rangle\langle i|$ and $Q = 1 - P$) and a remainder $\widetilde{\Omega} = \sum_\alpha \sum_{i \neq j} |\langle i | r_\alpha | j \rangle|^2$. Defining matrices $X_{ij} = \langle i | x | j \rangle$, $X_{{\rm D},ij} = X_{ij}\,\delta_{ij}$, $X' = X - X_{\rm D}$, and similarly for $Y$ and $Z$, this can be rewritten

$$\widetilde{\Omega} = \mathrm{tr}[X'^2 + Y'^2 + Z'^2]. \tag{A1}$$

Thus if $X$, $Y$, and $Z$ could be simultaneously diagonalized, then $\widetilde{\Omega}$ could be minimized to zero, but for noncommuting matrices this is not possible. In a sense, our job is to perform the optimal approximate simultaneous codiagonalization of the three Hermitian matrices $X$, $Y$, and $Z$ by a single unitary transformation. We are not aware of a formal solution for this problem, but a steepest-descent numerical solution is fairly straightforward. Since $\mathrm{tr}[X' X_{\rm D}] = 0$, etc.,

$$d\Omega = 2\,\mathrm{tr}[X'\,dX + Y'\,dY + Z'\,dZ]. \tag{A2}$$

We consider an infinitesimal unitary transformation $|i\rangle \to |i\rangle + \sum_j dW_{ji} |j\rangle$ (where $dW$ is anti-Hermitian), from which $dX = [X, dW]$, etc. Inserting in Eq. (A2) and using $\mathrm{tr}[A[B,C]] = \mathrm{tr}[C[A,B]]$ and $[X', X] = [X', X_{\rm D}]$, we obtain $d\Omega = \mathrm{tr}[dW\,G]$, where

$$G = 2\,\{[X', X_{\rm D}] + [Y', Y_{\rm D}] + [Z', Z_{\rm D}]\}, \tag{A3}$$

so that the desired gradient is $d\Omega/dW = G$ as given above. The minimization can then be carried out using steepest descents following the general approach outlined in Sec. IV D. More sophisticated but related methods are discussed in Ref. 26.

If this approach is applied to a finite system having a crystalline interior, the solutions in the interior are expected to correspond precisely with the maximally localized Wannier functions as determined using the k-space methods of the main text. In the vicinity of surfaces or defects, or for disordered materials, the solutions will essentially correspond to the "generalized Wannier functions" discussed by previous authors.27,55–57

APPENDIX B: FINITE-DIFFERENCE FORMULAS FOR k-SPACE GRIDS

We assume that the Brillouin zone has been discretized into a uniform Monkhorst-Pack mesh.65 Let $\mathbf{b}$ be a vector connecting a k point to one of its near neighbors, and let $Z$ be the number of such neighbors to be included in the finite-difference formulas. We seek the simplest possible finite-difference formula for $\nabla_{\mathbf{k}}$, i.e., the one involving the smallest possible $Z$. When the Bravais lattice point group is cubic, it will only be necessary to include the first shell of $Z = 6$, 8, or 12 k neighbors for simple cubic, bcc, or fcc k-space meshes, respectively. Otherwise, further shells must be included until it is possible to satisfy the condition

$$\sum_{\mathbf{b}} w_b\, b_\alpha b_\beta = \delta_{\alpha\beta} \tag{B1}$$

by an appropriate choice of a weight $w_b$ associated with each shell $|\mathbf{b}| = b$. For the three kinds of cubic mesh, Eq. (B1) is satisfied with $w_b = 3/(Z b^2)$ (single shell). Taking next the slightly more complicated case of an orthorhombic lattice, one can let $\mathbf{b}$ run over the two nearest neighbors in each

Cartesian direction ($Z = 6$), with $w_b = 1/(2 b_x^2)$ for the two neighbors at $\pm b_x \hat{x}$, etc. Even in the worst case of minimal (triclinic) symmetry, only six pairs of neighbors ($Z = 12$) should be needed, as the freedom to choose six weights should allow one to satisfy the six independent conditions comprising Eq. (B1).

Now, if $f(\mathbf{k})$ is a smooth function of $\mathbf{k}$, its gradient can be expressed as

$$\nabla f(\mathbf{k}) = \sum_{\mathbf{b}} w_b\, \mathbf{b}\, [f(\mathbf{k}+\mathbf{b}) - f(\mathbf{k})]. \tag{B2}$$

We can check the correctness of this finite-difference formula by applying it to the case of a linear function $f(\mathbf{k}) = f_0 + \mathbf{g} \cdot \mathbf{k}$, for which we find $\nabla_\alpha f(\mathbf{k}) = \sum_{\mathbf{b}} w_b \sum_\beta b_\alpha g_\beta b_\beta = g_\alpha$. In a similar way,

$$|\nabla f(\mathbf{k})|^2 = \sum_{\mathbf{b}} w_b\, [f(\mathbf{k}+\mathbf{b}) - f(\mathbf{k})]^2. \tag{B3}$$

We note that improved accuracy and k-set convergence might be obtained by utilizing improved, higher-order finite-difference formulas involving more shells of neighboring k points, but we have not explored this possibility here.

APPENDIX C: GEOMETRIC PROPERTIES AND COMPLEXITY OF ELECTRON BANDS

Consider a manifold of $J$ orthonormal states $|\psi_n(\lambda)\rangle$, $n = 1, \ldots, J$, depending on a continuous $d$-dimensional parameter $\lambda$. Alternatively, one can view these as representing the projection $P(\lambda) = \sum_n |\psi_n(\lambda)\rangle\langle\psi_n(\lambda)|$. For the application to electron bands in crystals, we identify $\lambda \to \mathbf{k}$ and $\psi_n(\lambda) \to u_{n\mathbf{k}}$. Here, we investigate the geometric properties of such a manifold, generalizing the single-state ($J = 1$) results of Refs. 42–44 to the multistate case.

One can define two kinds of intrinsic geometric properties: a geometric distance and a geometric phase. We consider the former first. The geometric distance $D_{12}$ between two points $\lambda_1$ and $\lambda_2$ is here taken to be

$$D_{12}^2 = \mathrm{tr}[P_1 Q_2] = \tfrac{1}{2}\,\| P_1 - P_2 \|^2, \tag{C1}$$

where $Q(\lambda) = 1 - P(\lambda)$. In the case of a single state, this becomes $D_{12}^2 = 1 - |\langle\psi_1|\psi_2\rangle|^2$, which for small separations is consistent with the slightly different definition $D_{12}^2 = 2 - 2|\langle\psi_1|\psi_2\rangle|$ of Ref. 42. Considering the distance for infinitesimal separations, one can define a Riemannian metric,42

$$D^2_{\lambda,\lambda+d\lambda} = \sum_{\alpha\beta} g_{\alpha\beta}\, d\lambda_\alpha\, d\lambda_\beta. \tag{C2}$$

Introducing the notation $\psi_{n,\alpha} = d\psi_n/d\lambda_\alpha$, etc., and making use of

$$0 = \langle\psi_n|\psi_{m,\alpha}\rangle + \langle\psi_{n,\alpha}|\psi_m\rangle, \tag{C3}$$

$$0 = \langle\psi_n|\psi_{m,\alpha\beta}\rangle + \langle\psi_{n,\alpha\beta}|\psi_m\rangle + 2\,\mathrm{Re}\,\langle\psi_{n,\alpha}|\psi_{m,\beta}\rangle, \tag{C4}$$

which follow from the fact that the $\psi_n$ remain orthonormal at first and second order in $d\lambda$, the metric $g_{\alpha\beta}$ becomes, after some manipulation,

$$g_{\alpha\beta} = \mathrm{Re} \sum_n \langle\psi_{n,\alpha}|\psi_{n,\beta}\rangle - \sum_{mn} \langle\psi_{n,\alpha}|\psi_m\rangle\langle\psi_m|\psi_{n,\beta}\rangle, \tag{C5}$$

which reduces in the single-band case to the expression of Pati.42

From Eq. (C1) it is obvious that the distance, and thus the metric, are gauge-invariant quantities. These are therefore intrinsic properties of the manifold. One way of thinking about the metric is to observe that for any given path in $\lambda$ space, the line integral of $g^{1/2}$ along the path provides a measure of the total "quantum distance" along the path; intuitively, it is a measure of the amount of change of character of the states as one traverses the path. The physical meaning of this distance for the case of temporal evolution of quantum states is discussed in Refs. 42–44.

The second type of geometric object that can be defined is a "geometric phase" or "Berry phase."45 Here, one is interested in considering closed paths in $\lambda$ space, and relating the phase (or, for the multistate case, the unitary rotation) induced by adiabatic ("parallel") transport along the path. The multistate ("non-Abelian") case has been discussed by Wilczek and Zee,46 Mead,47 and Resta.20 One can define a (non-gauge-invariant) Berry connection

$$A_{\alpha,nm} = i\,\langle\psi_n|\psi_{m,\alpha}\rangle \tag{C6}$$

and a (gauge-covariant) Berry curvature

$$B^{nm}_{\alpha\beta} = -\partial_\alpha A_{\beta,nm} + \partial_\beta A_{\alpha,nm} + i\,[A_\alpha, A_\beta]_{nm}. \tag{C7}$$

The invariants of the latter, such as

$$\mathrm{tr}\,B_{\alpha\beta} = 2\,\mathrm{Im} \sum_n \langle\psi_{n,\alpha}|\psi_{n,\beta}\rangle, \tag{C8}$$

[see Eq. (3.29) of Ref. 20] are thus gauge invariant. (We shall use the notation "tr" and "Tr" to denote electronic and Cartesian traces, respectively.)

There is a tantalizing similarity between the metric $g_{\alpha\beta}$, Eq. (C5), and the quantum trace of the Berry curvature, Eq. (C8). In fact, defining the gauge-invariant quantity

$$F_{\alpha\beta} = \sum_n \langle\psi_{n,\alpha}|Q|\psi_{n,\beta}\rangle \tag{C9}$$

where again $Q = 1 - P$, and using Eq. (C3) to show that the second term in Eq. (C5) is intrinsically real, we obtain simply $g_{\alpha\beta} = \mathrm{Re}\,F_{\alpha\beta}$ and $\mathrm{tr}\,B_{\alpha\beta} = 2\,\mathrm{Im}\,F_{\alpha\beta}$. This suggests that there may be some deep connections between the two quantities.42–44 In the case where the states $\psi_n$ are eigenstates of a Hamiltonian $H(\lambda)$, one moreover has20

$$F_{\alpha\beta} = \sum_{n=1}^{J} \sum_{m=J+1}^{\infty} \frac{\langle\psi_n|H_\alpha|\psi_m\rangle\langle\psi_m|H_\beta|\psi_n\rangle}{(E_n - E_m)^2}, \tag{C10}$$

where $H_\alpha = dH(\lambda)/d\lambda_\alpha$.

We now return to the case of electron bands in crystals, $\lambda \to \mathbf{k}$ and $\psi_n(\lambda) \to u_{n\mathbf{k}}$, and discuss the geometric properties induced by the band projection operator $P(\mathbf{k})$. Note that $g$, $A$, and $B$ have units of $l^2$, $l$, and $l^2$, respectively (with $l$ a length). Again focusing first on the metric, and comparing Eq. (34) with the definitions (C1) and (C2), we find

$$\Omega_{\rm I} = \frac{1}{N} \sum_{\mathbf{k},\mathbf{b}} \sum_{\alpha\beta} w_b\, g_{\alpha\beta}\, b_\alpha b_\beta \tag{C11}$$

or, using Eq. (B1) and restoring the continuum limit,

$$\Omega_{\rm I} = \frac{V}{(2\pi)^3} \int_{\rm BZ} d\mathbf{k}\; \mathrm{Tr}\, g(\mathbf{k}), \tag{C12}$$

where the integral is over the Brillouin zone. Thus, the invariant part of the spread functional is nothing other than the Brillouin-zone average of the trace of the metric!

It may be interesting to see whether other global properties of the metric might be given some physical interpretation. In particular, we define a dimensionless and gauge-invariant quantity,

$$C = \int_{\rm BZ} d\mathbf{k}\; {\det}^{1/2}\, g(\mathbf{k}). \tag{C13}$$

We shall call this the "complexity" of the bands. Mathematically, it is really nothing other than the volume of the Brillouin zone as measured according to the metric $g$. However, we have called it the "complexity" because it measures the variation of the character of the band projection operator $P(\mathbf{k})$ throughout the Brillouin zone. Everything said here applies to any isolated band or composite group of bands, but we have in mind primarily the case where all the occupied valence bands in an insulator are considered as a composite group. In this case, and assuming that one is only interested in quantities (such as total energies and forces) that can be expressed as a trace over the bands, the complexity might thus be expected to reflect (and even predict) the number of k points needed for an accurate sampling of the Brillouin zone. We have not tested this idea numerically, but this would clearly be an interesting avenue for future exploration.

Turning now to phase properties, we note that a finite-difference representation of the Berry connection is

$$A_{\alpha,mn} = i \sum_{\mathbf{b}} w_b\, b_\alpha\, [M^{(\mathbf{k},\mathbf{b})}_{mn} - \delta_{mn}]. \tag{C14}$$

Restoring the continuum limit in k space, we can write

$$\bar{\mathbf{r}}_n = \frac{V}{(2\pi)^3} \int_{\rm BZ} d\mathbf{k}\; \mathbf{A}_{nn}(\mathbf{k}), \tag{C15}$$

and more generally,

$$\langle \mathbf{0}m | \mathbf{r} | \mathbf{R}n \rangle = \frac{V}{(2\pi)^3} \int_{\rm BZ} d\mathbf{k}\; \mathbf{A}_{mn}(\mathbf{k})\, e^{-i\mathbf{k} \cdot \mathbf{R}}. \tag{C16}$$

The right-hand side is just $\mathbf{A}_{mn}(\mathbf{R})$, the Fourier coefficient of the Berry connection. Eq. (C15) is just the expression for the position of the Wannier center, which contributes to the electronic polarization.3,15,17,20 Moreover,

$$\widetilde{\Omega}_{\rm D} = \sum_n \frac{V}{(2\pi)^3} \int_{\rm BZ} d\mathbf{k}\; |\mathbf{A}_{nn}(\mathbf{k}) - \bar{\mathbf{r}}_n|^2, \tag{C17}$$

$$\widetilde{\Omega}_{\rm OD} = \sum_{m \neq n} \frac{V}{(2\pi)^3} \int_{\rm BZ} d\mathbf{k}\; |\mathbf{A}_{mn}(\mathbf{k})|^2. \tag{C18}$$

Eqs. (C17)–(C18) show that the noninvariant parts of the spread functional are also conveniently written in terms of the Berry connection. If the above equations are reexpressed in terms of the Fourier coefficients $\mathbf{A}_{mn}(\mathbf{R})$, Eqs. (19) and (20) are immediately recovered.

In the single-band case, we showed in Sec. IV C 2 that the minimum value of $\widetilde{\Omega}$ could be related to the transverse part of the Berry connection, which in turn is determined by the gauge-invariant Berry curvature. In the multiband case, the Berry curvature $B^{mn}_{\alpha\beta}(\mathbf{k})$ is no longer gauge invariant, and it is not obvious whether it is possible to make a corresponding decomposition. Nevertheless, one can derive similar correspondences as those above for $\mathbf{A}$. So,

$$B^{mn}_{\alpha\beta}(\mathbf{k}) = -i\,\langle u_{m,\alpha}|Q|u_{n,\beta}\rangle + i\,\langle u_{m,\beta}|Q|u_{n,\alpha}\rangle, \tag{C19}$$

$$B^{mn}_{\alpha\beta}(\mathbf{R}) = -i\,\langle u_m | r_\alpha Q r_\beta - r_\beta Q r_\alpha | u_n \rangle. \tag{C20}$$

Making use of $r_\alpha Q r_\beta - r_\beta Q r_\alpha = [P r_\alpha P, P r_\beta P]$, one finds

$$\| [P r_\alpha P, P r_\beta P] \|_c^2 = \sum_{\mathbf{R}} \sum_{mn} |B^{mn}_{\alpha\beta}(\mathbf{R})|^2 = \frac{V}{(2\pi)^3} \int_{\rm BZ} d\mathbf{k}\; \| B_{\alpha\beta}(\mathbf{k}) \|^2. \tag{C21}$$

Each form above is manifestly gauge invariant and positive definite. Thus, it can be seen that the Berry curvature will vanish if and only if the band-projected position operators $PxP$, $PyP$, and $PzP$ commute with one another; as discussed following Eq. (17), this is also just the condition that $\widetilde{\Omega}$ vanishes at the minimum.

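The two forms of the geometric distance in Eq. (C1) of Appendix C, and its single-state limit $D_{12}^2 = 1 - |\langle\psi_1|\psi_2\rangle|^2$, can be compared on small explicit examples. The Python sketch below is a minimal illustration, not part of the original work: the helper names and the two example manifolds (`spinor`, a normalized two-component state, and `two_state_manifold`, a $J = 2$ manifold in a three-dimensional space) are hypothetical choices made only for this check.

```python
import cmath
import math


def projector(states):
    """P = sum_n |psi_n><psi_n| for a list of orthonormal state vectors."""
    dim = len(states[0])
    return [[sum(psi[i] * complex(psi[j]).conjugate() for psi in states)
             for j in range(dim)] for i in range(dim)]


def distance_sq_trace(p1, p2):
    """First form of Eq. (C1): D_12^2 = tr[P_1 Q_2], with Q_2 = 1 - P_2."""
    dim = len(p1)
    total = 0.0
    for i in range(dim):
        for j in range(dim):
            q2_ji = (1.0 if i == j else 0.0) - p2[j][i]
            total += (p1[i][j] * q2_ji).real
    return total


def distance_sq_norm(p1, p2):
    """Second form of Eq. (C1): D_12^2 = (1/2) ||P_1 - P_2||^2 (Frobenius norm)."""
    return 0.5 * sum(abs(a - b) ** 2
                     for row1, row2 in zip(p1, p2)
                     for a, b in zip(row1, row2))


def overlap(psi1, psi2):
    """Inner product <psi1|psi2>."""
    return sum(complex(a).conjugate() * b for a, b in zip(psi1, psi2))


def spinor(theta, phi):
    """Hypothetical J = 1 example: a normalized two-component state."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]


def two_state_manifold(t):
    """Hypothetical J = 2 example: two orthonormal states in three dimensions."""
    return [[math.cos(t), math.sin(t), 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    psi1, psi2 = spinor(0.4, 0.3), spinor(1.1, -0.8)
    p1, p2 = projector([psi1]), projector([psi2])
    print(distance_sq_trace(p1, p2), distance_sq_norm(p1, p2),
          1 - abs(overlap(psi1, psi2)) ** 2)
```

In the $J = 2$ example the two projectors differ only by a rotation of one state through an angle $t_1 - t_2$, so both forms of Eq. (C1) should give $\sin^2(t_1 - t_2)$; the distance depends only on the projectors and not on any phase choice for the individual states.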
1 G. H. Wannier, Phys. Rev. 52, 191 (1937).
2 W. Kohn, Phys. Rev. 115, 809 (1959).
3 E. I. Blount, Solid State Phys. 13, 305 (1962).
4 W. Kohn, Phys. Rev. B 7, 4388 (1973).
5 G. Galli, Curr. Opin. Solid State Mater. Sci. 1, 864 (1996).
6 G. Galli and M. Parrinello, Phys. Rev. Lett. 69, 3547 (1992).
7 F. Mauri, G. Galli, and R. Car, Phys. Rev. B 47, 9973 (1993); F. Mauri and G. Galli, ibid. 50, 4316 (1994).
8 P. Ordejón, D. Drabold, M. P. Grumbach, and R. M. Martin, Phys. Rev. B 48, 14 646 (1993); ibid. 51, 1456 (1995).
9 W. Kohn, Chem. Phys. Lett. 208, 167 (1993).
10 W. Hierse and E. B. Stechel, Phys. Rev. B 50, 17 811 (1994).
11 S. Goedecker and L. Colombo, Phys. Rev. Lett. 73, 122 (1994); S. Goedecker and M. Teter, Phys. Rev. B 51, 9455 (1995).
12 J. Kim, F. Mauri, and G. Galli, Phys. Rev. B 52, 1640 (1995).
13 E. Hernández and M. J. Gillan, Phys. Rev. B 51, 10 157 (1995); E. Hernández, M. J. Gillan, and C. M. Goringe, ibid. 53, 7147 (1996).
14 P. Ordejón, E. Artacho, and J. M. Soler, Phys. Rev. B 53, R10 441 (1996).
15 R. D. King-Smith and D. Vanderbilt, Phys. Rev. B 47, 1651 (1993).
16 D. Vanderbilt and R. D. King-Smith, Phys. Rev. B 48, 4442 (1993).

17. R. Resta, Rev. Mod. Phys. 66, 899 (1994).
18. G. Ortiz and R. M. Martin, Phys. Rev. B 49, 14 202 (1994).
19. R. W. Nunes and D. Vanderbilt, Phys. Rev. B 50, 17 611 (1994).
20. R. Resta, Berry Phase in Electronic Wavefunctions, Troisième Cycle Lecture Notes (Ecole Polytechnique Fédérale, Lausanne, Switzerland, 1996); also available at http://ale2ts.ts.infn.it:6163/~resta/publ/notes_trois.ps.gz
21. S. F. Boys, Rev. Mod. Phys. 32, 296 (1960).
22. J. M. Foster and S. F. Boys, Rev. Mod. Phys. 32, 300 (1960).
23. C. Edmiston and K. Ruedenberg, Rev. Mod. Phys. 35, 457 (1963).
24. S. F. Boys, in Quantum Theory of Atoms, Molecules, and the Solid State, edited by P.-O. Löwdin (Academic Press, New York, 1966), p. 253.
25. H. Weinstein, R. Pauncz, and M. Cohen, Adv. At. Mol. Phys. 7, 97 (1971).
26. J. M. Leonard and W. L. Luken, Theor. Chim. Acta 62, 107 (1982); Int. J. Quantum Chem. 25, 355 (1984).
27. W. Kohn and J. Onffroy, Phys. Rev. B 8, 2485 (1973).
28. P. W. Anderson, Phys. Rev. Lett. 21, 13 (1968).
29. D. Bullett, in Solid State Physics: Advances in Research and Applications, edited by H. Ehrenreich and D. Turnbull (Academic, New York, 1980), p. 129.
30. W. M. C. Foulkes and D. M. Edwards, J. Phys.: Condens. Matter 5, 7987 (1993).
31. B. Sporkmann and H. Bross, J. Phys.: Condens. Matter 9, 5593 (1997).
32. We find empirically that the maximally localized Wannier functions are always real, once the arbitrary overall phase is removed. This observation is discussed and justified in Sec. V B.
33. H. Teichler, Phys. Status Solidi B 43, 307 (1971).
34. H. Bross, Z. Phys. 243, 311 (1971).
35. S. Satpathy and Z. Pawlowska, Phys. Status Solidi B 145, 555 (1988).
36. B. Sporkmann and H. Bross, Phys. Rev. B 49, 10 869 (1994).
37. J. Callaway and A. J. Hughes, Phys. Rev. 156, 860 (1967).
38. D. A. Goodings and R. Harris, Phys. Rev. 178, 1189 (1969).
39. S. Kivelson, Phys. Rev. B 26, 4269 (1982).
40. Q. Niu, Phys. Lett. B 5, 923 (1991).
41. A. Nenciu and G. Nenciu, J. Phys. A 15, 3313 (1982); G. Nenciu, Rev. Mod. Phys. 63, 91 (1991).
42. A. K. Pati, Phys. Lett. A 159, 105 (1991).
43. J. Anandan and Y. Aharonov, Phys. Rev. Lett. 65, 1697 (1990).
44. A. K. Pati and A. Joshi, Phys. Rev. A 47, 98 (1993).
45. M. V. Berry, Proc. R. Soc. London, Ser. A 392, 45 (1984).
46. F. Wilczek and A. Zee, Phys. Rev. Lett. 52, 2111 (1984).
47. C. A. Mead, Rev. Mod. Phys. 64, 51 (1992).
48. M.-C. Chang and Q. Niu, Phys. Rev. B 53, 7010 (1996).
49. G. Weinreich, Solids: Elementary Theory for Advanced Students (Wiley, New York, 1965), Chap. 8.
50. This discussion also applies to the k-point sampling in the surface-normal direction for periodic slab supercell calculations.
51. A. Debernardi, M. Bernasconi, M. Cardona, and M. Parrinello, Appl. Phys. Lett. (to be published); P. L. Silvestrelli, M. Bernasconi, and M. Parrinello, Chem. Phys. Lett. (to be published).
52. G. Nenciu, Commun. Math. Phys. 91, 81 (1983).
53. J. des Cloizeaux, Phys. Rev. 135, A685 (1964).
54. J. des Cloizeaux, Phys. Rev. 135, A698 (1964).
55. J. J. Rehr and W. Kohn, Phys. Rev. B 10, 448 (1974).
56. M. R. Geller and W. Kohn, Phys. Rev. B 48, 14 085 (1993).
57. A. Nenciu and G. Nenciu, Phys. Rev. B 47, 10 112 (1993).
58. E. O. Kane and A. B. Kane, Phys. Rev. B 17, 2691 (1978).
59. C. Tejedor and J. A. Verges, Phys. Rev. B 19, 2283 (1979).
60. M. R. Pederson and C. C. Lin, Phys. Rev. B 35, 2273 (1987).
61. P. Fernández, A. Dal Corso, A. Baldereschi, and F. Mauri, Phys. Rev. B 55, R1909 (1997).
62. See, e.g., R. O. Jones and O. Gunnarsson, Rev. Mod. Phys. 61, 689 (1989).
63. M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, and J. D. Joannopoulos, Rev. Mod. Phys. 64, 1045 (1992); D. Vanderbilt, Phys. Rev. B 41, 7892 (1990).
64. J. S. Lin, A. Qteish, M. C. Payne, and V. Heine, Phys. Rev. B 47, 4174 (1993).
65. H. J. Monkhorst and J. D. Pack, Phys. Rev. B 13, 5188 (1976).
66. S. de Gironcoli, S. Baroni, and R. Resta, Phys. Rev. Lett. 62, 2853 (1989).
67. R. Pick, M. H. Cohen, and R. M. Martin, Phys. Rev. B 1, 910 (1970).
