
Density Functional Tight Binding (DFTB):

Application to organic and biological molecules


Michael Gaus,

Qiang Cui,

and Marcus Elstner


Department of Chemistry and Theoretical Chemistry Institute, University of Wisconsin, Madison,
1101 University Avenue, Madison, Wisconsin 53706, USA, and Karlsruhe Institute of Technology,
Physical Chemistry, Kaiserstrasse 12, D-76131 Karlsruhe, Germany
E-mail: marcus.elstner@kit.edu

To whom correspondence should be addressed

University of Wisconsin

Karlsruher Institut für Technologie


Abstract
In this work, we review recent extensions of the Density Functional Tight Binding (DFTB)
methodology and its application to organic and biological molecules. DFTB denotes a class of
computational models derived from Density Functional Theory (DFT) using a Taylor expansion
around a reference density. The first- and second-order models, DFTB1 and DFTB2, have
been reviewed recently (WIREs Comput Mol Sci 2012, 2: 456-465). Here, we discuss the
extension to third order, DFTB3, which, in combination with a modification of the Coulomb
interactions in the second-order formalism and a new parametrization scheme, leads to a
significant improvement of the overall performance. The performance of DFTB2 and DFTB3 for
organic and biological molecules is discussed in detail, as well as problems and limitations
of the underlying approximations.
Introduction
Density Functional Tight Binding (DFTB) is a generic name for a set of computational models
derived from DFT. The starting point of the derivation is the reference density $\rho_0$ of the
molecular system, which is constructed as a superposition of the neutral densities $\rho_0^a$ of the
atoms ($a$) that constitute the system,
\[ \rho_0 = \sum_a \rho_0^a . \tag{1} \]
The different DFTB models are derived by expanding the DFT total energy functional around
this density $\rho_0$ in first, second and third orders, respectively. The first-order terms constitute the
standard DFTB1 model, which originally was called simply DFTB, [1,2] while the model based on
the second-order expansion, DFTB2, was originally called SCC-DFTB. [3] In recent years we have
derived third-order terms, leading to the DFTB3 [4-8] model.
If the ground state density is written in terms of the reference density $\rho_0$ and the density
fluctuation $\delta\rho$,
\[ \rho = \rho_0 + \delta\rho , \tag{2} \]
the DFTB total energy can be expanded in the respective orders as:
\[ E[\rho] = E^{0}[\rho_0] + E^{1}[\rho_0, \delta\rho] + E^{2}[\rho_0, (\delta\rho)^2] + E^{3}[\rho_0, (\delta\rho)^3] . \tag{3} \]
$E^{0}$ and $E^{1}$ constitute the DFTB1 model, including $E^{2}$ defines DFTB2, and the inclusion of $E^{3}$
yields the DFTB3 model.
The different DFTB models have reasonably clear areas of application. DFTB1 is suitable for
systems in which the charge transfer between atoms is small, such as homonuclear systems or
systems with atoms of similar electronegativity. Therefore, DFTB1 is well suited for the
description of hydrocarbons, for which the higher-order terms are small. On the other hand, DFTB1 can
also treat systems where a complete charge transfer between the atoms occurs, as, for example, in
NaCl. [5] DFTB1 is 5-10 times faster than DFTB2/DFTB3 since it does not require a self-consistent
determination of the charge distribution; i.e., it requires a solution of the generalized eigenvalue
problem only once instead of 5-10 times on average for DFTB2/DFTB3. By contrast, the increase
in CPU time from DFTB2 to DFTB3 is negligible. The second-order terms are crucial for polar
molecules, where only partial charge transfer occurs, [5] and the third-order expansion becomes
indispensable for charged molecular species, [6,7] as will be discussed in more detail below. Therefore,
an application to biological molecules requires DFTB2 or DFTB3. Since DFTB3 does not
imply any major increase in computational cost, we recently devised a new DFTB3 model by
deriving a parameter set called 3OB, which we recommend for standard applications of DFTB
to biological molecules. [8]
Three approximations follow the expansion of the total energy in Eq. 3: (i) The energy
contribution $E^{0}[\rho_0]$ is approximated by a sum of pair potentials, which are fitted for a set of
molecules to appropriate reference data. (ii) The Kohn-Sham orbitals $\psi_i$ appearing in the term
$E^{1}[\rho_0, \delta\rho]$ are expanded in a minimal atomic orbital basis $\phi_\mu$, and a two-center approximation
is applied to the evaluation of the resulting integrals. (iii) The electron density fluctuations
appearing in the second- and third-order terms are expanded with a multipole expansion. In the
existing models, this expansion is truncated after the monopole term; thus electron-electron
interaction (Hartree and exchange-correlation terms) is effectively approximated by the interaction
of atomic partial charges. This interaction is described by a Coulomb term, which is damped at short
interatomic distances. A major improvement for non-bonded interactions has been achieved by
identifying a shortcoming of the original interaction term [3] and proposing a simple modification. [5]
This modification is now used as a default in the DFTB3 model. [7,8]
Since DFTB1 and DFTB2 have been reviewed in great detail in Ref. 9, this article focuses on
the extension to DFTB3 as outlined in Refs. 5-8 and on the performance of DFTB2 and DFTB3
for organic and biological molecules. [footnote 1]
Theoretical Approach
Theory of the third-order SCC-DFTB: DFTB3
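The extension of the DFTB approach to include third-order terms (DFTB3) has been introduced
recently [5,7] and will be briefly summarized in the following. The starting point to derive the DFTB3
total energy is the energy expression of Kohn-Sham density functional theory. [11] Instead of
finding the electron density $\rho(\mathbf{r})$ that minimizes the total energy, a reference density $\rho_0$ is assumed
and perturbed by some density fluctuation, $\rho(\mathbf{r}) = \rho_0(\mathbf{r}) + \delta\rho(\mathbf{r})$. The exchange-correlation
energy functional is then expanded in a Taylor series up to third order and the total energy can be
written as
\begin{align}
E^{\mathrm{dftb3}}[\rho_0 + \delta\rho] ={}& -\frac{1}{2} \iint \frac{\rho_0(\mathbf{r})\,\rho_0(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' - \int V^{xc}[\rho_0]\,\rho_0(\mathbf{r})\, d\mathbf{r} + E^{xc}[\rho_0] + E^{nn} \nonumber \\
&+ \sum_i n_i \langle \psi_i | \hat{H}[\rho_0] | \psi_i \rangle \nonumber \\
&+ \frac{1}{2} \iint \left( \frac{1}{|\mathbf{r}-\mathbf{r}'|} + \left. \frac{\delta^2 E^{xc}[\rho]}{\delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')} \right|_{\rho_0} \right) \delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\, d\mathbf{r}\, d\mathbf{r}' \nonumber \\
&+ \frac{1}{6} \iiint \left. \frac{\delta^3 E^{xc}[\rho]}{\delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r}'')} \right|_{\rho_0} \delta\rho(\mathbf{r})\,\delta\rho(\mathbf{r}')\,\delta\rho(\mathbf{r}'')\, d\mathbf{r}\, d\mathbf{r}'\, d\mathbf{r}'' \nonumber \\
={}& E^{0}[\rho_0] + E^{1}[\rho_0, \delta\rho] + E^{2}[\rho_0, (\delta\rho)^2] + E^{3}[\rho_0, (\delta\rho)^3] . \tag{4}
\end{align}
Footnote 1: A very nice introduction to DFTB2 has also been given in Ref. 10.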
Central to the performance of the DFTB models are several approximations following this
expansion, which are:
(i) $E^{0}[\rho_0]$ consists of the DFT double-counting contributions and the nuclear-nuclear
repulsion in the first line of Eq. 4 and depends only on the reference density $\rho_0$, which is given by the
superposition of neutral atomic densities (Eq. 1). In other words, this term is not dependent on
the specific chemical environment; it can be determined for an appropriate reference system and
then applied to other molecules. This is the key to the transferability of the derived parameters. In
DFTB, this term is approximated by a sum of pair interactions referred to as the repulsive potential,
\[ E^{\mathrm{rep}} = \frac{1}{2} \sum_{ab} V^{\mathrm{rep}}_{ab} , \tag{5} \]
(see Ref. 9), which is either determined by comparison to DFT calculations [1] or fitted to empirical
data. [12] This approach neglects three-body contributions, which may become important in certain
cases, such as in condensed phase systems. [13]
(ii) $E^{1}$ consists of the Hamilton matrix elements $\langle \psi_i | \hat{H}^0 | \psi_i \rangle$ in the second line of Eq. 4:
$\hat{H}^0$ is the Kohn-Sham Hamiltonian of the molecular system, but evaluated with the reference density.
$\psi_i$ are the Kohn-Sham orbitals, which are represented in a minimal basis of pseudoatomic orbitals,
$\psi_i = \sum_\mu c_{\mu i} \phi_\mu$. This is a central part of the computational efficiency since it reduces the size of the
eigenvalue problem significantly.
\[ E^{1}[\rho_0, \delta\rho] = \sum_{iab} \sum_{\mu \in a} \sum_{\nu \in b} n_i c_{\mu i} c_{\nu i} H^0_{\mu\nu} \tag{6} \]
where $H^0_{\mu\nu}$ are the Hamilton matrix elements in the atomic orbital (AO) representation. The
diagonal elements of $H^0_{\mu\nu}$ are chosen to be atomic DFT eigenvalues evaluated with the PBE [14]
exchange-correlation functional; the off-diagonal elements are calculated in a two-center approximation. [9,15]
This minimal basis approximation is at the core of the problems when it comes to the calculation of
response properties. [16,17] A single-zeta basis can be tuned quite well to reproduce the bonding
properties of molecules. However, treatment of the more diffuse part of the density, which is relevant
to non-covalent interactions, is more challenging and requires more extended basis sets. Further,
polarization functions may become important for some systems, such as for nitrogen as discussed
below. After introducing the minimal basis, the resulting interaction integrals are approximated.
In particular, this concerns the neglect of two-center integrals for the diagonal terms and three-center
integrals for the off-diagonal contributions; for a detailed discussion, see Ref. 9.
(iii) $E^{2}$, the energy term in the third line of Eq. 4, is approximated using only the monopole
term in an expansion of $\delta\rho$ in spherical harmonics. [3] The charge density fluctuations are then
written as a superposition of atomic contributions, $\delta\rho = \sum_a \delta\rho_a$, in which the spherical atomic
contributions are approximated by a simple Slater function with $\Delta q_a = q_a - q_a^0$ ($q_a$ is the Mulliken
charge of atom $a$ and $q_a^0$ the number of valence electrons of the neutral atom $a$) centered on the
nucleus at $\mathbf{R}_a$:
\[ \delta\rho_a \approx \Delta q_a \frac{\tau_a^3}{8\pi} e^{-\tau_a |\mathbf{r} - \mathbf{R}_a|} \tag{7} \]
With this approximation, the Coulomb interaction of the second-order term with respect to $\delta\rho$ can
be expressed analytically and is abbreviated as $\gamma_{ab}$ in the following. The exponent $\tau_a$ is chosen
such that the on-site value of the $\gamma$-function properly describes the atomic chemical hardness (or
alternatively the Hubbard parameter as calculated from DFT) and, therefore, implicitly takes into
account the exchange-correlation contribution to the second-order term. To improve this
interpolation between long-range Coulomb interaction and the on-site term, further refinements of the
$\gamma$-function have been applied. [7]
As discussed in Refs. 4,5, the derived function $\gamma_{ab}$ assumes a specific inverse relationship of
the chemical hardness $U$ of an element with its atomic size. Although this relation holds well
within one row of the periodic table, it does not for elements of different rows. In particular,
hydrogen turned out to deserve special attention and we therefore proposed a modified $\gamma_{ab}$ function
referred to as $\gamma^h_{ab}$. [5-7] This modified function can be applied within DFTB2, but it is not the default
option. Within DFTB3, it has become the default [7,8] and therefore is a key ingredient of the DFTB3
methodology.
(iv) The $E^{3}$ term consists of diagonal and off-diagonal contributions. Originally, only the
diagonal terms were included. [4-6] As in DFTB2, a monopole approximation is applied and this
term describes the change of the chemical hardness of an atom with its charge state. [5] Specifically,
in third order a new parameter is introduced, the charge derivative of the chemical hardness, $U^d$.
This parameter can be computed from DFT or optimized in order to improve the performance of the
model. In DFTB3, $U^d$ contributes at third order through a $\Gamma$-function, which is the derivative
of $\gamma_{ab}$ with respect to atomic charge. It is interesting to note that Giese et al. [18] showed within
the framework of a rigorous density-functional expansion method that the third-order contribution
does not add significantly to accuracy, in contrast to our finding with calculations based on diverse
sets of molecules. [6-8] Therefore, the third-order terms in DFTB3 can be seen as a systematic way to
introduce the charge dependence to compensate for deficiencies of intrinsic approximations within
the second-order formalism, namely, the small size of the pseudo-atomic orbital basis, the fixed
shape of the initial atomic densities $\rho_0^a$ as well as the simplified density fluctuation scheme.
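As a small numerical illustration of the Slater-type density fluctuation in Eq. 7, the following sketch checks that the prefactor $\tau_a^3 / 8\pi$ normalizes the function, so that the integrated fluctuation on atom $a$ equals $\Delta q_a$. The function names and the integration settings are illustrative assumptions and are not taken from any DFTB code.

```python
import math


def slater_density(r, tau):
    """Unit-charge Slater density tau^3 / (8 pi) * exp(-tau r) at distance r from the nucleus."""
    return tau ** 3 / (8.0 * math.pi) * math.exp(-tau * r)


def integrated_charge(tau, n_steps=20000):
    """Integrate 4 pi r^2 * density over r in [0, 40/tau] with the trapezoidal rule."""
    r_max = 40.0 / tau  # the exponential tail beyond this radius is negligible
    h = r_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        r = k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * 4.0 * math.pi * r * r * slater_density(r, tau)
    return total * h
```

Multiplying the unit-charge density by $\Delta q_a$ therefore places exactly the Mulliken charge fluctuation on the atom, independent of the exponent, which only controls how compact the distribution is.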
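With all these approximations the DFTB3 total energy is given by
\[ E^{\mathrm{dftb3}} = \frac{1}{2} \sum_{ab} V^{\mathrm{rep}}_{ab} + \sum_{iab} \sum_{\mu \in a} \sum_{\nu \in b} n_i c_{\mu i} c_{\nu i} H^0_{\mu\nu} + \frac{1}{2} \sum_{ab} \Delta q_a \Delta q_b \gamma^h_{ab} + \frac{1}{3} \sum_{ab} (\Delta q_a)^2 \Delta q_b \Gamma_{ab} . \tag{8} \]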
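The charge-dependent and repulsive contributions to the DFTB3 total energy can be sketched in a few lines. This is a minimal illustration under stated assumptions: the kernels `gamma` and `big_gamma` are taken as precomputed matrices (their analytical forms are not reproduced here), a single hypothetical pair potential is used for all element pairs, and the pair sum is assumed to run over $a \neq b$. The electronic term with $H^0_{\mu\nu}$ is omitted.

```python
import math


def repulsive_energy(positions, pair_potential):
    """One half of the sum of pair potentials over all ordered pairs a != b."""
    energy = 0.0
    for a, ra in enumerate(positions):
        for b, rb in enumerate(positions):
            if a != b:
                energy += 0.5 * pair_potential(math.dist(ra, rb))
    return energy


def second_order_energy(dq, gamma):
    """1/2 * sum_ab dq_a dq_b gamma_ab for precomputed kernel values gamma[a][b]."""
    n = len(dq)
    return 0.5 * sum(dq[a] * dq[b] * gamma[a][b] for a in range(n) for b in range(n))


def third_order_energy(dq, big_gamma):
    """1/3 * sum_ab dq_a^2 dq_b Gamma_ab; Gamma is in general not symmetric."""
    n = len(dq)
    return sum(dq[a] ** 2 * dq[b] * big_gamma[a][b] for a in range(n) for b in range(n)) / 3.0
```

Because the third-order term is cubic in the charge fluctuations, it changes sign with the sign of the charges, which is how it can distinguish cationic from anionic species where the quadratic second-order term cannot.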
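The derivative of this expression with respect to the molecular orbital coefficients, $c_{\mu i}$, leads to the
corresponding Kohn-Sham equations
\[ \sum_\nu c_{\nu i} \left( H_{\mu\nu} - \varepsilon_i S_{\mu\nu} \right) = 0 \quad \text{with } \nu \in b \text{ and } \forall\, a,\ \mu \in a , \tag{9} \]
\[ H_{\mu\nu} = H^0_{\mu\nu} + S_{\mu\nu} \sum_c \Delta q_c \left( \frac{1}{2} (\gamma^h_{ac} + \gamma^h_{bc}) + \frac{1}{3} (\Delta q_a \Gamma_{ac} + \Delta q_b \Gamma_{bc}) + \frac{\Delta q_c}{6} (\Gamma_{ca} + \Gamma_{cb}) \right) , \tag{10} \]
where $S_{\mu\nu}$ is the overlap matrix. The Hamilton matrix elements depend on the Mulliken charges,
which in turn depend on the molecular orbital coefficients. Thus, these equations have to be solved
self-consistently.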
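The charge-dependent part of the Hamilton matrix element in Eq. 10 can be written as a short function. The sketch below is an illustration only: `gamma` and `big_gamma` are assumed to be precomputed kernel matrices indexed by atom, and the function names are hypothetical. In a self-consistent calculation this shift would be re-evaluated from the current Mulliken charges in every iteration before the generalized eigenvalue problem of Eq. 9 is solved again.

```python
def scc_shift(a, b, dq, gamma, big_gamma):
    """Sum over atoms c of the bracketed term in Eq. 10 for orbitals on atoms a and b."""
    total = 0.0
    for c in range(len(dq)):
        total += dq[c] * (
            0.5 * (gamma[a][c] + gamma[b][c])
            + (dq[a] * big_gamma[a][c] + dq[b] * big_gamma[b][c]) / 3.0
            + dq[c] * (big_gamma[c][a] + big_gamma[c][b]) / 6.0
        )
    return total


def hamiltonian_element(h0, overlap, a, b, dq, gamma, big_gamma):
    """H_mu_nu = H0_mu_nu + S_mu_nu * shift, with mu on atom a and nu on atom b."""
    return h0 + overlap * scc_shift(a, b, dq, gamma, big_gamma)
```

Two limits are easy to read off: for vanishing charge fluctuations the element reduces to $H^0_{\mu\nu}$ (the DFTB1 case), and for a vanishing third-order kernel only the second-order shift remains (the DFTB2 case).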
Dispersion correction
Dispersion interactions play an important role in processes dominated by non-covalent interactions,
such as conformational transitions of biomolecules. In the DFTB framework, the first attempt [19] to
include dispersion was to augment the DFTB2 energy with an empirical dispersion term, following
a similar strategy applied earlier to Hartree-Fock energies; the results were promising [19] and stimulated
similar developments for pure DFT methods. [20-23] However, due to the use of a minimal basis set of
atomic orbitals, which are slightly compressed with respect to the orbitals of the free atoms, the electron
density in DFTB2 is not well described at large distances, especially for the overlap of weakly interacting
densities, which is essential to the description of van der Waals (vdW) interactions. The early
parametrization of DFTB2 for organic and biological molecules [footnote 2] led to an underestimation of
distances in hydrogen and vdW bonded complexes. Therefore, an empirical dispersion correction has
been proposed which also contains a repulsive contribution in order to correct for this artifact. [24]
The new DFTB3 parametrization 3OB corrects for this problem by using a slightly more extended
atomic orbital basis set, leading to a good description of non-covalently bonded complexes using
the original dispersion correction from Ref. 19. Recently, Grimme has parametrized his D3 [23]
correction for DFTB3, leading to an excellent performance of DFTB3-D3 for a large set of hydrogen
and vdW bonded molecules. [25]
Footnote 2: Referred to as the mio set, see www.dftb.org.
Treatment of Electron Spin
Standard DFTB is a closed-shell method and therefore exhibits large errors for open-shell systems.
Köhler et al. have formulated an open-shell DFTB variant that includes spin-polarization effects
either in a collinear [26,27] or a noncollinear fashion. [28] Besides doubling the orbital set for spin-up
and spin-down electrons, an additional term is added to the total energy that takes into account the
Mulliken spin population and atomic spin-polarization constants. The latter are calculated from
DFT as numerical differences of partially spin-polarized states in proximity of the spin-unpolarized
state of an atom. The collinear spin treatment improves the description of radicals of organic
molecules. However, for some systems the direction of spin quantization varies significantly in
space (e.g., antiferromagnetism), for which the noncollinear spin-polarization treatment is
necessary. Note that for the collinear case the amount of computation time doubles with respect to the
nonpolarized calculation, while for the noncollinear one the cost quadruples.
Inherited DFT problems
DFTB is derived from DFT, and usually GGA functionals (PBE) are applied to compute the terms
in $E^{1}$, $E^{2}$ and $E^{3}$. Since $E^{0}$ only affects bond energies but not the electronic spectrum in total,
applying higher-level methods or experimental data for the determination of $E^{0}$ does not compensate
for most of the problems inherent to DFT-GGA, except for overbinding, which could be almost
entirely removed in DFTB3/3OB. The deficiency of DFT-GGA in describing vdW interactions can
be compensated by using empirical dispersion corrections, as described above. All other
phenomena related to the self-interaction problem (SIC) in DFT are retained in the DFTB model. This
is reflected, for example, in the performance of DFTB for the subsets of problematic cases in the
GMTKN24 [29] test set (SIE11, DARC, DC9), as discussed in Ref. 8. The self-interaction problem
shows up in many properties and is contained in the second- and third-order terms in DFTB. A
detailed analysis has been published recently. [30,31] The description of the balance between charge
delocalization and polarization, for example in charge transfer complexes, is also a challenge to
DFT. Rapacioli et al. [32] recently adapted a configuration interaction method, based on constrained
DFT calculations, to the DFTB approach. This allows one to investigate charge resonances in
molecular complexes and describe the proper dissociation behavior.
QM/MM coupling
DFTB has been combined with empirical force field methods in a QM/MM framework as
described in Ref. 33. This scheme has further been extended to include a continuum electrostatics
environment in the DFTB/MM-GSBP scheme, [34] which is useful for the study of chemical reactions
in large macromolecular systems. [35,36]
For the interaction between QM and MM atoms, it is common to include both electrostatic
and van der Waals contributions; [37-39] bonded terms are also included when the partitioning is
across covalent bonds. In most biomolecular applications, electrostatics tend to dominate and
therefore it is essential that electrostatic interactions between QM and MM atoms are properly
described. For DFTB, the QM-MM electrostatic interaction is approximately calculated in the
original implementation [33] as the Coulombic interaction between the QM Mulliken charges ($\Delta q_a$)
and MM point charges ($Q_A$). The error due to this approximation can be significant when QM
and MM atoms approach each other, where the charge penetration effect becomes important. As a
result, reactions that involve highly charged solutes/substrates are difficult to study with the original
DFTB/MM Hamiltonian. [40] The problem can be partially solved by enlarging the QM region, but
this introduces not only additional cost but also technical complications for cases that involve
highly mobile solvents, such as the need to change the QM/MM partitioning on the fly. [41,42]
In our recent work, [43] motivated by the Klopman-Ohno (KO) expression for the two-center
two-electron integrals in semi-empirical QM methods, [44] which also inspired the development of the
$\gamma_{ab}$ kernel in the original DFTB, we have implemented a different Hamiltonian for the DFTB/MM
electrostatics. It takes the form
\[ H^{\mathrm{QM/MM}}_{\mathrm{elec,KO}} = \sum_{a \in \mathrm{QM}} \sum_{A \in \mathrm{MM}} \frac{\Delta q_a Q_A}{\sqrt{R_{aA}^2 + a_a \left( \frac{1}{U_a(\Delta q_a)} + \frac{1}{U_A} \right)^2 e^{-b_a R_{aA}}}} = \sum_{a \in \mathrm{QM}} \sum_{A \in \mathrm{MM}} \gamma^{\mathrm{KO}}_{aA}\, \Delta q_a Q_A \tag{11} \]
in which $a_a$ and $b_a$ are element-type-dependent parameters. Together with the van der Waals
parameters in the QM/MM Hamiltonian, there are 4 QM/MM parameters for each element type,
and they can be determined based on microsolvation clusters. [43] To be consistent with the
third-order formulation of SCC-DFTB, [7] the Hubbard parameter in the KO functional is dependent on
the QM charge. As a result, the effective size of the QM charge distribution naturally adjusts
as the QM region undergoes chemical transformations, making the KO-based QM/MM scheme
particularly attractive for describing chemical reactions in the condensed phase.
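To make the behavior of the Klopman-Ohno kernel in Eq. 11 concrete, the following sketch evaluates it for given parameters. It assumes atomic units, takes the charge-dependent QM Hubbard parameter as an already evaluated number, and, for brevity, uses one pair of hypothetical `a_par` and `b_par` values for all QM atoms rather than element-dependent ones. The numerical values in any usage are placeholders, not fitted parameters.

```python
import math


def gamma_ko(r, u_qm, u_mm, a_par, b_par):
    """Damped Coulomb kernel: 1 / sqrt(r^2 + a * (1/U_qm + 1/U_mm)^2 * exp(-b * r))."""
    damping = a_par * (1.0 / u_qm + 1.0 / u_mm) ** 2 * math.exp(-b_par * r)
    return 1.0 / math.sqrt(r * r + damping)


def qmmm_electrostatics(qm_sites, mm_sites, a_par, b_par):
    """Sum of charge * charge * gamma_ko over all QM/MM pairs.

    Each site is a tuple (charge, hubbard, position) with position as (x, y, z).
    """
    energy = 0.0
    for dq, u_qm, r_qm in qm_sites:
        for big_q, u_mm, r_mm in mm_sites:
            r = math.dist(r_qm, r_mm)
            energy += dq * big_q * gamma_ko(r, u_qm, u_mm, a_par, b_par)
    return energy
```

At large separation the exponential switches the damping off and the kernel approaches the point-charge value $1/R_{aA}$, while at short range it stays finite, which is the mechanism by which the scheme mimics charge penetration.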
Our studies of charged solutes and chemical reactions clearly indicate that the KO scheme
is robust and transferable. For the fitting set clusters, both the point-charge and KO schemes
have comparable errors (relative to full QM results) in solute-solvent interactions, with Mean
Unsigned Errors (MUE) of 3.3 and 4.8 kcal/mol, respectively (note that the errors are for total
solute-solvent interactions, which are often >100 kcal/mol; thus the error is typically less than
5%). However, for 16 stable structures and 24 transition states in the QCRNA database, the MUE
is 4.3 kcal/mol for the KO scheme but 16.2 kcal/mol for the point-charge-based QM/MM model.
As another example, for the hydrolysis of phosphate monoesters in solution, [43] the hydrolysis
barrier is grossly overestimated (by about 11 kcal/mol) with SCC-DFTBPR/MM simulations using the
point-charge-based QM-MM Hamiltonian. [footnote 3] With the KO scheme, the computed barrier is in close
agreement (within 2 kcal/mol) with available experimental data.
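For readers less familiar with the error measure used here, the mean unsigned error is simply the average absolute deviation from the reference values. The helper below is a generic sketch; the function name is an assumption and no data from the cited benchmarks are reproduced.

```python
def mean_unsigned_error(computed, reference):
    """Average of |computed - reference| over paired values (same units as the input)."""
    if len(computed) != len(reference) or not computed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(c - r) for c, r in zip(computed, reference)) / len(computed)


def relative_error_percent(error, magnitude):
    """Error expressed as a percentage of a typical magnitude of the reference quantity."""
    return 100.0 * error / abs(magnitude)
```

With the numbers quoted above, an MUE of 4.8 kcal/mol on interaction energies of about 100 kcal/mol corresponds to `relative_error_percent(4.8, 100.0)`, i.e. just under 5%.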
Footnote 3: SCC-DFTBPR is a DFTB variant including only diagonal third-order terms and a specific
modification and parametrization for phosphate hydrolysis. See Ref. 40.
Parametrization
The parametrization of the DFTB models involves three steps:
(i) The determination of the parameters for $E^{1}$:
This is usually the first step in the parametrization. Here, one has to compute
\[ H^0_{\mu\nu} = \langle \phi_\mu | \hat{H}[\rho_0] | \phi_\nu \rangle \tag{12} \]
and $S_{\mu\nu}$ for setting up the Hamilton matrix elements in Eq. 10. In a first step, one has to determine
the atomic orbital basis set $\phi_\mu$ and the neutral atom densities $\rho_0^a$ by solving the atomic KS
equations, where an additional potential leads to a confinement of the orbitals. [9] For the basis set, the
confinement parameter is usually set to roughly twice the covalent radius of the element, while the
choice of the confinement radius of the initial atomic densities is slightly more empirical. [7,12] The
choice of these two parameters in a reasonable range does not alter molecular properties on a large
scale; however, they can be used for a fine-tuning of the method. This has been discussed recently
for the derivation of the new DFTB3 parameters 3OB. [8] Compared to the older DFTB2 parameters
mio, more diffuse basis functions $\phi_\mu$ lead to an increase in Pauli repulsion, which is relevant for
weak interactions, while a slightly larger compression of the initial densities $\rho_0^a$ leads to a decrease
in the overbinding and therefore better performance for heats of formation and reaction energies.
(ii) The determination of the parameters for $E^{2}$ and $E^{3}$:
For the atomic partial charges $\Delta q_a$ in Eq. 8 a Mulliken partitioning scheme is usually applied,
although other schemes are possible as well. Using CM3 charges has been shown to improve the
electrostatic potential of molecules; [17,45] however, additional parameters would enter the
parametrization procedure, which we have tried to keep as simple and straightforward as possible. The function
$\gamma(U_a, R_{ab})$ in Eq. 8 has been determined by an analytical derivation [3] and the chemical hardness
parameter (or Hubbard parameter) $U_a$ is usually computed from DFT. However, as described above,
this choice of $\gamma(U_a, R_{ab})$ presupposes a particular inverse relationship between the chemical
hardness and the size of an atom, which holds well within one row of the periodic table but by no means
for elements of different rows. [4,5,7] [footnote 4] Therefore, the functional form of $\gamma(U_a, R_{ab})$ should depend
on the row of the periodic table. For the first row, we use the original form, but for hydrogen and
its interaction with other elements a modified function $\gamma^h(U_a, R_{ab})$ is applied. [4,5,7] Other choices
of functions for the 2nd, 3rd etc. rows are ongoing work. For DFTB3, the derivative of $\gamma(U_a, R_{ab})$,
$\Gamma(U_a, U^d_a, R_{ab})$, is needed, [7] where $U^d_a$ is the charge derivative of $U_a$. In the earlier
implementations, [6,40] only the diagonal part of $\Gamma(U_a, U^d_a, R_{ab})$ was implemented. This works well for first-row
molecules, except for the deprotonation of NH$_3$, where the off-diagonal terms seem to be
important. [7] The diagonal version, however, was not able to describe phosphorus-containing molecules,
in particular their (de-)protonation energies, and an ad hoc modification involving a special
parametrization was necessary. [40] This problem could be remedied at full third order, [7] however,
by treating $U^d_a$ as adjustable parameters.
Footnote 4: See in particular Fig. 2 from Ref. 7.
(iii) The determination of the parameters for $E^{0}$:
The determination of $E^{\mathrm{rep}}$ has been greatly simplified by introducing automated parametrization
procedures. [12,46] These schemes not only reduce the effort but also allow one to vary the optimization
targets. In principle, data from any theoretical level and experiment can enter the
parametrization. Since it depends only on $\rho_0$, $E^{\mathrm{rep}}$ could be determined in principle only once and would
be valid for all DFTB models. Indeed, the $E^{\mathrm{rep}}$ parameters originally derived for DFTB2, called
mio, worked rather well with DFTB3. [6,7] However, a fine-tuning can be achieved when $E^{\mathrm{rep}}$ is
specifically optimized for the respective model. Therefore, we have reoptimized the parameters
for DFTB3, now called 3OB (referring to DFTB3 and the main field of application: organic and
biological molecules). Consequently, there are currently two sets of parameters available: the mio
set, which has been derived for DFTB2, [3] and the 3OB set, which has been derived for DFTB3. [8] [footnote 5]
Note that the 3OB set also differs in the electronic parameters, as described above. [footnote 6]
In summary, one first has to determine 4 parameters per atom type: the confinement radius for
the atomic orbitals $\phi_\mu$, which is called $r_0$; the confinement radius for the atomic densities $\rho_0^a$, which
is called $r^d_0$; the atomic Hubbard parameter $U_a$; and its charge derivative $U^d_a$. The determination
of $r_0$ and $r^d_0$ is an empirical procedure and can be quite involved, [8,12] while $U_a$ and $U^d_a$ can be
easily computed in principle. For the modified function $\gamma^h$ one additional parameter appears,
which is fitted to reproduce the water dimer binding energy. The repulsive potentials are two-body
contributions, and their determination is therefore much more involved, although largely automated
procedures have been recently developed. [12]
Footnote 5: These parameter sets can be downloaded from the website www.dftb.org.
Footnote 6: In earlier work applying the diagonal DFTB3 method in combination with the mio set, fitted $U^d$
values compensated for the overbinding of the method. This is no longer needed using 3OB since this
parametrization removes the overbinding by changing the density compression radii. [8] Further, the special
parametrization and modification for phosphorus compounds [40] is no longer required due to the introduction
of the third-order off-diagonal terms.
While for many applications relative energies are the important quantity, sometimes the
calculation of atomization energies and heats of formation is desired, as for example in the case of
fitting the DFTB repulsive potentials. However, the calculation of atomization energies requires
some additional care. [8] The atomization energy is given by the total energy of a system $E^{\mathrm{tot}}$ and the
atomic energies $E^{\mathrm{atom}}$,
\[ E^{\mathrm{At}} = -E^{\mathrm{tot}} + \sum_a E^{\mathrm{atom}}_a . \tag{13} \]
With a closed-shell treatment DFTB gives $E^{\mathrm{atom}}_a$ of rather poor quality. One may use the
spin-polarization formalism, which improves the results. In practice, however, the atomization energies
are usually calculated using spin-polarization energies $E^{\mathrm{spin}}$ that are pre-calculated from DFT for
each atom; i.e., $E^{\mathrm{spin}}$ is the difference of the atomic energy calculated at the spin-unpolarized state
and the spin-polarized state. [footnote 7] With that, $E^{\mathrm{atom}}$ is calculated as the total energy of an atom plus the
spin-polarization energy $E^{\mathrm{spin}}$.
Note that using $E^{\mathrm{spin}}$ gives slightly more accurate results than using atomic energies as
calculated from spin-polarized DFTB, because the spin polarization of the atom (as calculated from
DFT) is added rather than a correction of the atomic energy. The latter uses spin-polarization
constants calculated as derivatives of the atomic eigenvalues in the proximity of the spin-unpolarized
atom.
Footnote 7: In the case of hydrogen the spin-unpolarized state would refer to a hypothetical one where 0.5
electrons are spin-up and 0.5 electrons are spin-down, while the spin-polarized state is the ground state of the
atom with 1.0 spin-up and 0.0 spin-down electrons.
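The bookkeeping for atomization energies described above can be sketched as follows. The sketch assumes the usual sign convention in which the atomization energy is the sum of the atomic energies minus the molecular total energy, and it assumes that the tabulated spin-polarization energies are signed so that adding them lowers the closed-shell atomic energy. All names and numbers are hypothetical placeholders, not DFTB or DFT values.

```python
def atomic_energy(element, e_atom_closed_shell, e_spin):
    """Closed-shell atomic energy plus the pre-calculated spin-polarization energy."""
    return e_atom_closed_shell[element] + e_spin[element]


def atomization_energy(e_tot, elements, e_atom_closed_shell, e_spin):
    """Minus the molecular total energy plus the sum of corrected atomic energies."""
    atoms = sum(atomic_energy(el, e_atom_closed_shell, e_spin) for el in elements)
    return -e_tot + atoms
```

For example, with placeholder values `e_tot = -4.0`, closed-shell atomic energies `{"H": -0.25, "O": -3.0}` and spin-polarization energies `{"H": -0.05, "O": -0.1}`, a water-like composition `["H", "H", "O"]` gives an atomization energy of 0.3 in the same units.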
Performance
Energetics, structure and vibrational frequencies of small molecules
DFTB2 has been tested over the years for a variety of molecular properties. A first thorough test
has been performed by Krüger et al., [47] who benchmarked the accuracy of DFTB2 against G2 and
BLYP for 22 molecules, evaluating 28 reaction energies, geometries and vibrational frequencies.
Reaction energies show a mean error of 4.3 kcal/mol with respect to G2, and geometries are in
excellent agreement with those obtained at the DFT level. Vibrational frequencies, however, show
larger deviations; in particular, the stretch frequencies of several specific modes are significantly
overestimated. Therefore, Małolepsza et al. [48] suggested applying a specific parametrization of
$E^{\mathrm{rep}}$ for vibrational frequencies. With this special parameter set, DFTB2 shows a very good
performance and vibrational frequencies approach the quality of those from full DFT. We investigated
this point in more detail in later publications for DFTB2 [12] and DFTB3. [8] These studies
demonstrate the limited flexibility of the current DFTB approach; i.e., it is not possible to achieve an
accuracy comparable to DFT-GGA for both reaction energies and vibrational frequencies with a
single parametrization. There is an optimization conflict in which one property deteriorates when
the other is improved. The pragmatic solution to this problem is to supply two sets of
parameters, [8] one optimized for energies and geometries (3OB), the other for geometries and vibrational
frequencies (3OB-f).
Two other publications have benchmarked DFTB2 for even larger molecular test sets.
Sattelmeyer et al. [49] benchmarked DFTB2 for 622 closed-shell molecules containing O, N, C and H in
comparison with Hartree-Fock-based semi-empirical methods like AM1, PM3 and PDDG/PM3.
The good performance of DFTB for geometries was confirmed; however, the performance of
DFTB2 for heats of formation, with a mean absolute error of 5.8 kcal/mol, was worse than that
of PM3 and PDDG/PM3, the latter with a mean absolute error of 3.2 kcal/mol that even
outperforms B3LYP/6-31G(d). [footnote 8] Otte et al. [50] confirmed these findings, showing that DFTB2 performs
slightly worse for heats of formation than AM1 and PM3, and in particular worse than the OMx
suite of methods. However, geometries are very well described and DFTB2 is clearly superior for
vibrational frequencies. Further, DFTB2 performs very well for structures and relative energies of
peptide conformations, [footnote 9] as well as for hydrogen-bonded systems.
Footnote 8: This study also indicated errors in the treatment of N-O and S-O bonds, which should be
ameliorated with the new 3OB parametrization.
The 3OB parametrization for DFTB3 has been developed with the aim of improving two particular
limitations of DFTB2: the overbinding of about 5-10 kcal/mol per covalent bond (for O,
N, C, H containing molecules) and the underestimation of binding energies in weakly bonded
complexes. [8] The third-order terms improve the description of localized charges [footnote 10] and the
modified Coulomb interaction $\gamma^h(U_a, R_{ab})$ improves hydrogen bonding interactions. [7] As a result, the
description of energies is greatly improved: DFTB3/3OB approaches the accuracy of DFT-GGA
methods like PBE for heats of formation and atomization energies as well as the accuracy of the
best semi-empirical methods like PDDG/PM3. DFTB3/3OB is even better than DFT-GGA
when only a small, double-zeta type basis set is applied, [8] as is typically done in DFT or DFT/MM
based molecular dynamics simulations. In particular, hydrogen bonding energies, proton
affinities and proton transfer barriers, which are relevant in many biochemical problems, are very well
described.
Recently, Goerigk and Grimme have compiled a general database (GMTKN24) for main group
thermochemistry, kinetics and non-covalent interactions. [29] This set benchmarks a variety of
molecular properties: reaction and atomization energies, reaction barriers, electron affinities (EAs),
ionization potentials (IPs) and proton affinities, hydrogen bonding and vdW interactions,
conformational energies of peptides, hydrocarbons and carbohydrates, isomerization reactions and some
other properties. For this set, the accuracy of DFTB3/3OB is comparable to the newest variant of
the OMx models, [8] OM3, which has been shown recently to approach the accuracy of DFT-GGA
methods for this data set. [53]
Footnote 9: See also Refs. 51,52 for more detail.
Footnote 10: As they appear in small charged molecules, where the charge is located on a few atoms. Large ionic
molecules, where the charge is distributed over a large number of atoms, are unproblematic in DFTB2.
Properties: IPs, EAs, dipole moments and molecular polarizability
IPs and EAs for small molecules are difficult to compute with a minimal basis set method like
DFTB since these properties do not enter the parametrization procedure, in contrast to NDDO
type semi-empirical methods. The adjustment of $E_{\mathrm{rep}}$ only affects bond lengths
(not angles!), bond energies and stretch frequencies. Therefore, properties like IP and EA are
usually less accurately described [8,49,50,53] and deserve careful testing for the specific
problem at hand. This holds as well for dipole moments, which are simply computed from the
Mulliken population analysis. The description of electrostatic properties such as dipole moments
can be easily improved using a parametrized charge scheme like CM3, as has been shown in Ref. 17.
However, IR intensities are not changed since these depend on the derivative of the dipole moment
with respect to the normal coordinates, which is not improved. Unfortunately, this holds similarly
for molecular polarizabilities and Raman intensities. The polarizabilities can be adequately
improved using methods like chemical potential equalization [16] or a variational approach
(VAR) [17], but Raman intensities suffer from the same problem as IR intensities; i.e., although
the properties are improved, their derivatives with respect to normal coordinates are not.
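As a minimal illustration of what "computed from the Mulliken population analysis" means in practice, the following sketch evaluates a point-charge dipole moment, mu = sum_a q_a R_a, from a set of net atomic charges. The function name, the geometry and the charge values are hypothetical and chosen only for illustration; they are not DFTB results.

```python
import numpy as np

# Conversion factor: 1 e*Angstrom corresponds to about 4.8032 Debye
E_ANGSTROM_TO_DEBYE = 4.8032

def point_charge_dipole(charges, coords):
    """Dipole moment vector (Debye) from atomic point charges.

    charges: net atomic charges in units of e (e.g. Mulliken charges)
    coords:  Cartesian coordinates in Angstrom, shape (n_atoms, 3)
    """
    q = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    # Sum of q_a * R_a over all atoms, converted to Debye
    return E_ANGSTROM_TO_DEBYE * (q[:, None] * r).sum(axis=0)

# Hypothetical water-like geometry (Angstrom) and illustrative charges (e)
coords = np.array([[0.0,  0.0,     0.1173],
                   [0.0,  0.7572, -0.4692],
                   [0.0, -0.7572, -0.4692]])
charges = np.array([-0.6, 0.3, 0.3])

mu = point_charge_dipole(charges, coords)
print("dipole vector (D):", mu, " magnitude (D):", np.linalg.norm(mu))
```

Because the dipole in this picture depends only on the atomic charges and positions, replacing the Mulliken charges by a parametrized charge model changes the dipole moment directly; for a neutral molecule the result does not depend on the choice of origin.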
Conformations of complex molecules
Most of the tests described so far benchmark the DFTB performance for covalent bond lengths
and bond angles. The performance for dihedral angles, which is important to the description of
conformations of complex molecules like peptides, proteins, DNA and carbohydrates, has been
less systematically tested. DFTB2 has been extensively benchmarked for the structures and
relative energies of polyalanine conformations [51,52]. Relative energies and structures were
found to be in good agreement with DFT and ab initio predictions, and vibrational spectroscopic
features were also reproduced satisfactorily [54]; for a short review, see Ref. 55. However, low
frequency modes seem to be underestimated [56], which indicates that rotational barriers are too
low in DFTB. QM/MM simulations of di-alanine in water indicated that the free energy minima at
the α-helical and β-sheet regions were more extended than in standard force field methods [57],
a finding confirmed later using a different QM/MM implementation [58]. A deeper analysis indeed
showed that the rotational barriers around the dihedral angles are very low. Furthermore,
DFTB/MM populates the α basin more than the β basin, in contrast to experimental findings. The
energy differences, however, are small and on the order of 0.5 kcal/mol. Therefore, small
changes in the Hamiltonian can lead to a significant change in the populations, and it is
possible that DFTB3 with an improved QM/MM coupling (KO-scheme, see above) leads to an
improvement.
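To make the sensitivity of basin populations to such small energy differences concrete, the sketch below evaluates Boltzmann populations for an idealized two-state model. The two-state picture, the function name and the temperature of 300 K are assumptions made for illustration; real free energy surfaces contain more than two basins.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def two_state_populations(delta_g, temperature=300.0):
    """Populations (p_a, p_b) of two basins, with delta_g = G_b - G_a in kcal/mol."""
    # Boltzmann weight of basin b relative to basin a
    w = math.exp(-delta_g / (R_KCAL * temperature))
    return 1.0 / (1.0 + w), w / (1.0 + w)

# Shift the relative free energy of the two basins by +/- 0.5 kcal/mol
for dg in (-0.5, 0.0, 0.5):
    p_a, p_b = two_state_populations(dg)
    print(f"dG = {dg:+.1f} kcal/mol -> p_a = {p_a:.2f}, p_b = {p_b:.2f}")
```

In this model, a shift of 0.5 kcal/mol at 300 K moves the populations from 50:50 to roughly 70:30, so an error of this size in the Hamiltonian is enough to reverse which basin is preferred.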
DFTB2 has also been tested for carbohydrates, where the properties of interest are again the
dihedral angles, in particular the ring puckering modes. It has been shown that DFTB2 produces
free energy surfaces for conformational transitions similar to those of ab initio methods, in
contrast to various NDDO methods [59], motivating the application of DFTB2 to carbohydrate
reaction dynamics [60,61]. The agreement with high level methods, however, is far from perfect
and leaves ample room for future improvement. For example, potential energy scans for certain
dihedral angles clearly showed that DFTB2 is in qualitative agreement with full DFT but with too
low torsional barriers, while NDDO type methods seem to fail even qualitatively [62].
Nevertheless, there seems to be no compelling reason to use DFTB2 for the description of
structure and dynamics of carbohydrates at the moment, since empirical force fields currently
represent the potential energy surface much more accurately.
Water
Another issue worth mentioning is the description of water by DFTB. Given its fundamental
importance in chemistry and biology, it is desirable to be able to adequately describe water in
both gas and condensed phases, including water in different protonation states (e.g., a solvated
proton/hydroxide). With the standard DFTB2, the hydrogen bonding interaction between water
molecules is too weak; as discussed above and in detail elsewhere [4,5,7], this motivated the
development of the modified $\gamma^{h}$ function for atom pairs involving H. With DFTB3 [7] and
the latest parameterization [8], for example, the water dimer binding energy is well described
and low-energy conformers of small water clusters are also captured. The relative energies of
these low-energy conformers, however, are not yet ideal, suggesting the need to further improve
hydrogen-bonding interactions by, for example, going beyond the monopole approximation for
charge-charge interactions in eq. 8. The imperfection of the water-water interaction is also
manifested in bulk water simulations, which indicated that both DFTB2 and DFTB3 tend to
over-predict the height of the first solvation shell peak in the O-O radial distribution function
while underestimating the second solvation shell [63–65].
For NVT simulations at ambient conditions, one simple but ad hoc approach to improve the
description of bulk water is to adjust the pair-wise repulsive potentials based on a reverse
Monte Carlo protocol such that experimental radial distribution functions are reproduced. This is
found [66] to be somewhat successful in that the resulting repulsive potentials also improved the
description of small protonated water clusters and the structure of a solvated proton. For the
13 low-energy isomers of $\mathrm{H(H_2O)_{22}^{+}}$, for example, the RMSE is only 0.9 kcal/mol
relative to MP2 results [67], as compared to the value of 3.8 kcal/mol for the original DFTB3.
For a solvated proton in the bulk, the integrated coordination number for the first solvation
shell is 3.2, which is close to the value of 3.0 for CPMD (using the HCTH functional); by
comparison, the standard DFTB3 gives a value close to 5.0 [65]. Nevertheless, the enthalpy of
evaporation remains too low by about 1 kcal/mol, and preliminary NPT simulations indicate that
DFTB2/3 tends to substantially overestimate the density of bulk water at ambient conditions, a
situation also observed in some ab initio DFT simulations [68]. Therefore, improving the
description of water remains an important topic for further DFTB developments.
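The integrated coordination number quoted above is commonly obtained by integrating the radial distribution function up to a cutoff at the first minimum, n(r_c) = 4 pi rho times the integral of g(r) r^2 from 0 to r_c. The sketch below shows this integration under stated assumptions: the function name, the grid and the cutoff of 3.5 Angstrom are hypothetical, and a uniform g(r) = 1 is used only as a sanity check against the analytic result, not as a model of water.

```python
import numpy as np

def coordination_number(r, g, number_density, r_cut):
    """Integrate 4*pi*rho*g(r)*r^2 from the first grid point up to r_cut.

    r: ascending radial grid in Angstrom
    g: radial distribution function sampled on that grid
    number_density: particles per cubic Angstrom
    """
    r = np.asarray(r, dtype=float)
    g = np.asarray(g, dtype=float)
    mask = r <= r_cut
    x = r[mask]
    y = 4.0 * np.pi * number_density * g[mask] * x ** 2
    # Trapezoidal rule written out explicitly
    return 0.5 * np.sum((y[1:] + y[:-1]) * np.diff(x))

# Sanity check with a structureless fluid, g(r) = 1 (not a water model)
r = np.linspace(0.0, 6.0, 6001)
g_uniform = np.ones_like(r)
rho_water = 0.0334  # approximate number density of liquid water, molecules per A^3
n_check = coordination_number(r, g_uniform, rho_water, r_cut=3.5)
analytic = 4.0 / 3.0 * np.pi * rho_water * 3.5 ** 3
print(f"numerical: {n_check:.3f}  analytic: {analytic:.3f}")
```

With a simulated g(r) in place of `g_uniform`, the same routine gives the first-shell coordination number once `r_cut` is set to the position of the first minimum; the result is sensitive to that choice, which should be reported alongside the number.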
Conclusions
The extension of DFTB to third order, DFTB3 [7,8], in combination with a new parametrization
procedure [12], has improved the performance significantly for reaction energies, geometries and
hydrogen bonded complexes. DFTB3 even outperforms DFT-GGA with a double zeta (DZ) basis in
special cases, while being 2-3 orders of magnitude faster [69]. However, the computational
efficiency comes at the price of reduced transferability; i.e., not all molecular properties can
be computed at the same accuracy within one parameter set. Such an optimization conflict has been
found in the case of reaction energies and vibrational frequencies; therefore, we have proposed
to use two different parametrizations, 3OB for energies and geometries and 3OB-f for geometries
and vibrational frequencies [8]; geometries are described with similar accuracy in both
parameter sets.
A key to better non-bonded interactions is the use of the third-order term in combination with
the modified Coulomb interaction term, $\gamma^{h}$. The augmentation of the DFTB3 total energy
with the empirical dispersion extension can be advised as a default, because it usually only
improves results.
Despite all these improvements, there are still several limitations of DFTB:
(i) The DFT-GGA framework used for the expansion of the total energy. The DFTB models
inherit the well-known DFT-GGA problems, especially the self-interaction error.
(ii) The use of a minimal basis set. This leads to a reduced molecular polarizability and limits
the application of DFTB for computing IR and Raman spectra. Further, the missing polarization
functions may cause problems in the description of sp$^3$ nitrogen [8]. This shows up in large
errors for proton affinities with acidic nitrogen, for which no satisfactory solution has been
proposed up to now; an ad hoc fix is used by applying a special parameter set (NHmod) for these
special cases.
(iii) The limited flexibility of the scheme (fixed initial density, monopole approximation)
leaves further problems. This shows up in the description of atomization energies of ionic
species [8]; another complication is the need for two different parameter sets for hydrogen. A
special parametrization is needed when the bond breaking of molecular hydrogen is computed [8].
(iv) DFTB describes the general conformational properties of biomolecules quite well: peptides,
DNA bases and sugars can often be computed with good accuracy. However, DFTB underestimates
torsional barriers, which currently limits its applicability in the description of
conformational dynamics of these complex molecules.
Another important direction for DFTB development concerns the treatment of metal ions, such
as Mg$^{2+}$, Zn$^{2+}$ and Cu$^{+/2+}$, which play important structural and catalytic roles in
biomolecules. DFTB2 has been parameterized for several first-row transition metals (e.g., Fe, Ni,
Co, Cu and Zn) [70–72], and it has been shown that DFTB2 generally gives reliable structural
properties for metal sites, including fairly complex bi-metallo zinc sites [73–78], and DFTB/MM
has been successfully applied to a number of metalloenzymes by us and other research
groups [73–79]. Pushing forward the DFTB framework is significant for metalloenzyme applications
because for transition metal ions, despite progress [80,81], a robust semi-empirical method (even
just for structures!) is not yet available. This is particularly true for open-shell cases:
although parameterizations for several open-shell metal ions (e.g., Ni, Cu and Fe) have been
reported in the literature [71,72,82], their application has largely been limited to geometry
optimization of organometallic compounds and only a few metalloenzymes [83]; systematic
development of the methodology to improve energetics remains an important frontier.
References
(1) Porezag, D.; Frauenheim, T.; Köhler, T.; Seifert, G.; Kaschner, R. Phys. Rev. B 1995, 51, 12947–12957.
(2) Seifert, G.; Porezag, D.; Frauenheim, T. Int. J. Quantum Chem. 1996, 58, 185–192.
(3) Elstner, M.; Porezag, D.; Jungnickel, G.; Elsner, J.; Haugk, M.; Frauenheim, T.; Suhai, S.; Seifert, G. Phys. Rev. B 1998, 58, 7260–7268.
(4) Elstner, M. Theor. Chem. Acc. 2006, 116, 316–325.
(5) Elstner, M. J. Phys. Chem. A 2007, 111, 5614–5621.
(6) Yang, Y.; Yu, H.; York, D.; Cui, Q.; Elstner, M. J. Phys. Chem. A 2007, 111, 10861–10873.
(7) Gaus, M.; Cui, Q.; Elstner, M. J. Chem. Theory Comput. 2011, 7, 931–948.
(8) Gaus, M.; Goez, A.; Elstner, M. J. Chem. Theory Comput. 2012, 9, 338.
(9) Seifert, G.; Joswig, J.-O. WIREs Comput Mol Sci 2012, 2, 456–465.
(10) Koskinen, P.; Makinen, V. Comp. Mat. Sci. 2009, 47, 237.
(11) Kohn, W.; Sham, L. J. Phys. Rev. 1965, 140, A1133–A1138.
(12) Gaus, M.; Chou, C.-P.; Witek, H.; Elstner, M. J. Phys. Chem. A 2009, 113, 11866–11881.
(13) Goldman, N.; Fried, L. E. J. Phys. Chem. C 2011, 116, 2198–22044.
(14) Perdew, J. P.; Burke, K.; Ernzerhof, M. Phys. Rev. Lett. 1996, 77, 3865–3868.
(15) Seifert, G. J. Phys. Chem. A 2007, 111, 5609–5613.
(16) Kaminski, S.; Giese, T. J.; Gaus, M.; York, D. M.; Elstner, M. J. Phys. Chem. A 2012, 116, 9131–9141.
(17) Kaminski, S.; Gaus, M.; Elstner, M. J. Phys. Chem. A 2012, 116, 11927–11937.
(18) Giese, T. J.; York, D. M. Theor. Chem. Acc. 2012, 131, 1145.
(19) Elstner, M.; Hobza, P.; Frauenheim, T.; Suhai, S.; Kaxiras, E. J. Chem. Phys. 2001, 114, 5149–5155.
(20) Wu, Q.; Yang, W. J. Chem. Phys. 2002, 116, 515–524.
(21) Grimme, S. J. Comput. Chem. 2004, 25, 1463–1473.
(22) Grimme, S. J. Comput. Chem. 2006, 27, 1787–1799.
(23) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. J. Chem. Phys. 2010, 132, 154104.
(24) Zhechkov, L.; Heine, T.; Patchkovski, S.; Seifert, G.; Duarte, H. J. Chem. Theory Comput. 2005, 1, 841–847.
(25) Risthaus, T.; Grimme, S. J. Chem. Theory Comput., in press.
(26) Köhler, C.; Seifert, G.; Gerstmann, U.; Elstner, M.; Overhof, H.; Frauenheim, T. Phys. Chem. Chem. Phys. 2001, 3, 5109–5114.
(27) Köhler, C.; Seifert, G.; Frauenheim, T. Chem. Phys. 2005, 309, 23–31.
(28) Köhler, C.; Frauenheim, T.; Hourahine, B.; Seifert, G.; Sternberg, M. J. Phys. Chem. A 2007, 111, 5622–5629.
(29) Goerigk, L.; Grimme, S. J. Chem. Theory Comput. 2010, 6, 107–126.
(30) Hourahine, B.; Sanna, S.; Aradi, B.; Köhler, C.; Niehaus, T.; Frauenheim, T. J. Phys. Chem. A 2007, 111, 5671–5677.
(31) Lundberg, M.; Nishimoto, Y.; Irle, S. Int. J. Quantum Chem. 2012, 112, 1701–1711.
(32) Rapacioli, M.; Spiegelman, F.; Scemama, A.; Mirtschink, A. J. Chem. Theory Comput. 2011, 7, 44–55.
(33) Cui, Q.; Elstner, M.; Kaxiras, E.; Frauenheim, T.; Karplus, M. J. Phys. Chem. B 2001, 105, 569–585.
(34) Riccardi, D.; Schaefer, P.; Yang, Y.; Yu, H.; Ghosh, N.; Prat-Resina, X.; König, P.; Li, G.; Xu, D.; Guo, H.; Elstner, M.; Cui, Q. J. Phys. Chem. B 2006, 110, 6458–6469.
(35) Yang, Y.; Yu, H.; Cui, Q. J. Mol. Biol. 2008, 381, 1407–1420.
(36) Ghosh, N.; Xavier, P.-R.; Gunner, M. R.; Cui, Q. Biochemistry 2009, 48, 2468–2485.
(37) Freindorf, M.; Gao, J. L. J. Comput. Chem. 1996, 17, 386–395.
(38) Riccardi, D.; Li, G.; Cui, Q. J. Phys. Chem. B 2004, 108, 6467–6478.
(39) Giese, T. J.; York, D. M. J. Chem. Phys. 2007, 127, 194101.
(40) Yang, Y.; Yu, H.; York, D.; Elstner, M.; Cui, Q. J. Chem. Theory Comput. 2008, 4, 2067–2084.
(41) Nielsen, S. O.; Bulo, R. E.; Moore, P. B.; Ensing, B. Phys. Chem. Chem. Phys. 2010, 12, 12401–12414.
(42) Park, K.; Götz, A. W.; Walker, R. C.; Paesani, F. J. Chem. Theory Comput. 2012, 8, 2868–2877.
(43) Hou, G.; Zhu, X.; Elstner, M.; Cui, Q. J. Chem. Theory Comput. 2012, 8, 4293–4304.
(44) Pople, J. A.; Beveridge, D. L. Approximate Molecular Orbital Theory; McGraw-Hill Companies, 1970.
(45) Kalinowski, J. A.; Lesyng, B.; Thompson, J. D.; Cramer, C. J.; Truhlar, D. G. J. Phys. Chem. A 2004, 108, 2545–2549.
(46) Bodrog, Z.; Aradi, B.; Frauenheim, T. J. Chem. Theory Comput. 2011, 7, 2654–2664.
(47) Krüger, T.; Elstner, M.; Schiffels, P.; Frauenheim, T. J. Chem. Phys. 2005, 122, 114110.
(48) Małolepsza, E.; Witek, H. A.; Morokuma, K. Chem. Phys. Lett. 2005, 412, 237–243.
(49) Sattelmeyer, K. W.; Tirado-Rives, J.; Jorgensen, W. L. J. Phys. Chem. A 2006, 110, 13551–13559.
(50) Otte, N.; Scholten, M.; Thiel, W. J. Phys. Chem. A 2007, 111, 5751–5755.
(51) Elstner, M.; Jalkanen, K. J.; Knapp-Mohammady, M.; Frauenheim, T.; Suhai, S. Chem. Phys. 2001, 263, 203–219.
(52) Elstner, M.; Jalkanen, K. J.; Knapp-Mohammady, M.; Frauenheim, T.; Suhai, S. Chem. Phys. 2000, 256, 15–27.
(53) Korth, M.; Thiel, W. J. Chem. Theory Comput. 2011, 7, 2929–2936.
(54) Bohr, H. G.; Jalkanen, K. J.; Elstner, M.; Frimand, K.; Suhai, S. Chem. Phys. 1999, 246, 13–36.
(55) Elstner, M.; Frauenheim, T.; Suhai, S. J. Mol. Struct.: THEOCHEM 2003, 632, 29–41.
(56) Elstner, M.; Frauenheim, T.; Kaxiras, E.; Seifert, G.; Suhai, S. Phys. Stat. Sol. B 2000, 217, 357–376.
(57) Hu, H.; Elstner, M.; Hermans, J. Proteins: Struct., Funct., Genet. 2003, 50, 451–463.
(58) Seabra, G. D. M.; Walker, R. C.; Elstner, M.; Case, D. A.; Roitberg, A. E. J. Phys. Chem. A 2007, 111, 5655–5664.
(59) Barnett, C.; Naidoo, K. J. Phys. Chem. B 2010, 114, 17142–17154.
(60) Barnett, C.; Wilkinson, K.; Naidoo, K. J. Am. Chem. Soc. 2010, 132, 12800–12803.
(61) Barnett, C.; Wilkinson, K.; Naidoo, K. J. Am. Chem. Soc. 2011, 133, 19474–19482.
(62) Islam, S.; Roy, P.-N. J. Chem. Theory Comput. 2012, 8, 2412–2423.
(63) Hu, H.; Lu, Z.; Elstner, M.; Hermans, J.; Yang, W. J. Phys. Chem. A 2007, 111, 5685–5691.
(64) Maupin, C.; Aradi, B.; Voth, G. J. Phys. Chem. B 2010, 114, 6922–6931.
(65) Goyal, P.; Elstner, M.; Cui, Q. J. Phys. Chem. B 2011, 115, 6790–6805.
(66) Goyal, P.; Hu, J.; Elstner, M.; Irle, S.; Cui, Q. Manuscript in preparation.
(67) Choi, T.; Jordan, K. J. Phys. Chem. B 2010, 114, 6932–6936.
(68) Wang, J.; Roman-Perez, G.; Soler, J. M.; Artacho, E.; Fernandez-Serra, M. V. J. Chem. Phys. 2011, 134, 024516.
(69) Elstner, M.; Gaus, M. In Computational Methods for Large Systems: Electronic Structure Approaches for Biotechnology and Nanotechnology; Reimers, J. R., Ed.; John Wiley and Sons: Hoboken, New Jersey, 2011; pp 287–308.
(70) Elstner, M.; Cui, Q.; Munih, P.; Kaxiras, E.; Frauenheim, T.; Karplus, M. J. Comput. Chem. 2003, 24, 565–581.
(71) Zheng, G. S.; Witek, H. A.; Bobadova-Parvanova, P.; Irle, S.; Musaev, D. G.; Prabhakar, R.; Morokuma, K. J. Chem. Theory Comput. 2007, 3, 1349–1367.
(72) Bruschi, M.; Bertini, L.; Bonacic-Koutecky, V.; De Gioia, L.; Mitric, R.; Zampella, G.; Fantucci, P. J. Phys. Chem. B 2012, 116, 6250–6260.
(73) Hou, G. H.; Cui, Q. J. Am. Chem. Soc. 2012, 134, 229–246.
(74) Riccardi, D.; Yang, S.; Cui, Q. Biochim. Biophys. Acta 2010, 1804, 342–351.
(75) Xu, D.; Guo, H.; Cui, G. J. Am. Chem. Soc. 2007, 129, 10814–10822.
(76) Xu, D. G.; Guo, H. J. Am. Chem. Soc. 2009, 131, 9780–9788.
(77) Xu, D. G.; Xie, D. Q.; Guo, H. J. Biol. Chem. 2006, 281, 8740–8747.
(78) Chakravorty, D. K.; Wang, B.; Lee, C. W.; Giedroc, D. P.; Merz, K. M., Jr. J. Am. Chem. Soc. 2012, 134, 3367–3376.
(79) Yang, Y.; Miao, Y. P.; Wang, B.; Cui, G. L.; Merz, K. M., Jr. Biochemistry 2012, 51, 2606–2618.
(80) Thiel, W. Adv. Chem. Phys. 1996, 93, 703–757.
(81) Thiel, W.; Voityuk, A. A. J. Phys. Chem. 1996, 100, 616–626.
(82) Köhler, C.; Seifert, G.; Frauenheim, T. Chem. Phys. 2005, 309, 23–31.
(83) Lundberg, M.; Sasakura, Y.; Zheng, G. S.; Morokuma, K. J. Chem. Theory Comput. 2010, 6, 1413–1427.