FIG. 1. Schematic view of the Bag of Bonds representation. (a) shows the three-dimensional structure of ethanol (CH3CH2OH)
and (b) specifies the nuclear charges involved in each Coulomb matrix element. In (c), the Coulomb matrix entries
present for ethanol are sorted into bags, and the Bag of Bonds vector (d) is obtained by concatenating these bags
and padding them with zeros so that molecules with larger bags can be handled.
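The construction described in the Fig. 1 caption can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the source: only off-diagonal Coulomb matrix entries Z_i Z_j / |R_i - R_j| are bagged, entries within a bag are sorted in descending order, bags are concatenated in sorted key order, and the function name `bag_of_bonds`, the `bag_sizes` argument and the three-atom toy geometry are all hypothetical.

```python
from itertools import combinations
import math


def bag_of_bonds(charges, coords, bag_sizes):
    """Build a flat Bag of Bonds vector (illustrative sketch only).

    charges   -- list of nuclear charges Z_i
    coords    -- list of (x, y, z) positions in one consistent length unit
    bag_sizes -- dict mapping a sorted charge pair (Za, Zb) to the padded
                 bag length (assumed to be chosen from the largest molecule)
    """
    bags = {key: [] for key in bag_sizes}

    # Off-diagonal Coulomb matrix entries: Z_i * Z_j / |R_i - R_j|
    for i, j in combinations(range(len(charges)), 2):
        key = tuple(sorted((charges[i], charges[j])))
        if key not in bags:
            raise ValueError(f"no bag defined for charge pair {key}")
        distance = math.dist(coords[i], coords[j])
        bags[key].append(charges[i] * charges[j] / distance)

    vector = []
    for key in sorted(bag_sizes):
        entries = sorted(bags[key], reverse=True)  # assumed ordering in a bag
        padding = bag_sizes[key] - len(entries)
        if padding < 0:
            raise ValueError(f"bag {key} is too small for this molecule")
        vector.extend(entries + [0.0] * padding)  # zero-pad to fixed length
    return vector


# Toy geometry (hypothetical, not a real equilibrium structure):
# one atom with Z = 8 at the origin and two atoms with Z = 1 at unit distance.
toy_charges = [8, 1, 1]
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_bag_sizes = {(1, 1): 2, (1, 8): 3}

vec = bag_of_bonds(toy_charges, toy_coords, toy_bag_sizes)
print(vec)
```

Because every molecule is mapped to a vector of the same length with the same bag layout, such vectors can be compared entry by entry across molecules of different sizes, which is the purpose of the zero padding mentioned in the caption.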
C–O AND C–N PAIRWISE POTENTIALS

Figure 2 shows the polynomial pairwise potentials corresponding to Eq. (1) in the paper, obtained for C–O and C–N interactions.

[...] performance of both models on the 134k dataset from Ref. [2] in Figure 3. For the largest training dataset of N = 40000 molecules, the BoB model yields an out-of-sample accuracy of 2 kcal/mol, compared to 4 kcal/mol for the Coulomb matrix model.
[Figure 2 plot: polynomial potentials of degree 6, 10 and 18 and a Lennard-Jones potential versus distance [Å]; markers labeled CO2, CN1, CN2 and CN3.]
FIG. 2. Polynomial potentials for C–O (top) and C–N (bottom) interactions. The normalized gray histogram refers to the
distribution of C–C distances within the GDB-7 dataset and is associated with the right-hand axis. The red dots represent the
energies of the C–C single, double and triple bonds as given by fits to experimental bond energies. The polynomial two-body
potentials (as trained in cross validation) are shown in blue.
[Figure 3 plot: MAE [kcal/mol] versus training set size N for the CM and BoB models (one panel each); curves for U0, U, H and G.]
FIG. 3. Comparison of the performance of the BoB and Coulomb matrix (CM) models on the 134k dataset of equilibrium
molecular geometries from Ref. [2].