SUPPORTING INFORMATION

Computation of Octanol-Water Partition Coefficients by Guiding an Additive Model
with Knowledge

Tiejun Cheng, Yuan Zhao, Xun Li, Fu Lin, Yong Xu, Xinglong Zhang, Yan Li and Renxiao
Wang*

State Key Laboratory of Bioorganic Chemistry, Shanghai Institute of Organic Chemistry,
Chinese Academy of Sciences, Shanghai, P. R. China

Luhua Lai

State Key Laboratory of Structural Chemistry of Stable and Unstable Species, College of
Chemistry, Peking University, Beijing, P. R. China

Atom/group types and correction factors defined in XLOGP3

Atom/group Types. A total of 83 basic atom types are implemented in XLOGP3 to
classify carbon, nitrogen, oxygen, sulfur, phosphorus, and halogen atoms (Table S1). The
classification of a given atom is made by considering (i) its element type, (ii) its hybridization
state, (iii) its accessibility to solvent, characterized by the number of attached hydrogen atoms
on this atom, (iv) the nature of its direct neighboring atoms, (v) whether it is connected to a
conjugated system with π electrons, and (vi) whether it is in a ring. Note that in both
XLOGP2 and XLOGP3, atom types are defined to be “united” atoms, which include
hydrogen atoms implicitly. No additional atom types are thus necessary for classifying
hydrogen atoms.
This classification scheme is augmented by four additional terminal groups, i.e., types
84-87 in Table S1. These groups are combinations of rather unusual atom types, and it is
technically more efficient to treat them as integrated pieces than to break them into
individual atoms.
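
The naming pattern of the sp3 carbon symbols in Table S1 (types 1-18) can be illustrated with a short Python sketch. This is not the XLOGP3 implementation: the function name, its boolean parameters, and the order in which the suffixes are tested are assumptions made only to show how criteria (iii) to (v) map onto a symbol.

```python
def sp3_carbon_symbol(n_h, hetero_neighbor, near_pi, lipophilic=False):
    """Build a Table S1 style symbol for an sp3 carbon atom (types 1-18).

    n_h             -- number of attached hydrogen atoms (0-3)
    hetero_neighbor -- True if at least one direct neighbor is a hetero-atom (X)
    near_pi         -- True if connected to a conjugated moiety
    lipophilic      -- True if the atom satisfies footnote e of Table S1
    All parameter names are hypothetical; they are not taken from XLOGP3.
    """
    parts = ["C", "3"]
    # Hydrogen count suffix as it appears in the table: "3h", "2h", "h", or none.
    if n_h >= 2:
        parts.append(f"{n_h}h")
    elif n_h == 1:
        parts.append("h")
    # Table S1 lists "lipo" variants only for CH3 and CH2 carbons.
    if lipophilic and n_h >= 2 and not hetero_neighbor and not near_pi:
        parts.append("lipo")
    else:
        if hetero_neighbor:
            parts.append("X")
        if near_pi:
            parts.append("pi")
    return ".".join(parts)


print(sp3_carbon_symbol(3, False, False, lipophilic=True))
print(sp3_carbon_symbol(2, True, True))
print(sp3_carbon_symbol(0, False, False))
```

Under these assumptions the three calls yield the symbols of types 1, 7, and 18 in Table S1.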

Table S1. Atom/Group Types and Correction Factors Defined in XLOGP3


ID  Symbol  Relevant Compd.a  Total Occur.b  Contrib.c  Description d
sp3 carbon
1 C.3.3h.lipo e 642 719 0.7896 lipophilic C*H3R
2 C.3.3h.X.pi 1027 1111 -0.0753 C*H3X connected to a conjugated moiety
3 C.3.3h.X 2090 3200 0.0402 C*H3X
4 C.3.3h.pi 1030 1320 0.5018 C*H3R connected to a conjugated moiety
5 C.3.3h 2168 4447 0.5240 C*H3R
6 C.3.2h.lipo e 559 1224 0.5201 lipophilic C*H2R2
7 C.3.2h.X.pi 1898 2293 -0.2441 C*H2R2-nXn (n>0) connected to a conjugated moiety
8 C.3.2h.X 2592 4993 -0.0821 C*H2R2-nXn (n>0)
9 C.3.2h.pi 1005 1139 0.2718 C*H2R2 connected to a conjugated moiety
10 C.3.2h 1898 3994 0.3436 C*H2R2
11 C.3.h.X.pi 1105 1603 -0.3711 C*HR3-nXn (n>0) connected to a conjugated moiety
12 C.3.h.X 1162 2077 -0.1426 C*HR3-nXn (n>0)
13 C.3.h.pi 249 291 0.0841 C*HR3 connected to a conjugated moiety
14 C.3.h 584 988 0.1485 C*HR3
15 C.3.X.pi 596 634 -0.5475 C*R4-nXn (n>0) connected to a conjugated moiety
16 C.3.X 386 472 -0.4447 C*R4-nXn (n>0)
17 C.3.pi 182 185 0.0885 C*R4 connected to a conjugated moiety
18 C.3 179 206 0.0596 C*R4
aromatic carbon
19 C.ar.h.X 1555 2320 -0.1039 R≈C*(-H) ≈X or X≈C*(-H) ≈X
20 C.ar.h 5962 26613 0.3157 R≈C*(-H) ≈R
21 C.ar.ar 465 764 0.3158 A≈C*(≈A) ≈A
22 C.ar.(-X).X 1321 1980 -0.1003 R≈C*(-X) ≈X or X≈C*(-X) ≈X
23 C.ar.(-X) 5009 10899 -0.0112 R≈C*(-X) ≈R
24 C.ar.X 726 888 -0.1874 R≈C*(-R) ≈X or X≈C*(-R) ≈X
25 C.ar 2702 3786 0.1911 R≈C*(-R) ≈R
sp2 carbon
26 C.2.2h 133 150 0.5977 H-C*(=A)-H
27 C.2.h.(=C).X 587 830 -0.0967 H-C*(=C)-X
28 C.2.h.(=C).ring 224 355 0.4004 H-C*(=C)-R in a ring
29 C.2.h.(=C) 339 409 0.3214 H-C*(=C)-R
30 C.2.h.(=X) 282 284 -0.8756 H-C*(=X)-A
31 C.2.(=C).X 560 826 -0.2069 R-C*(=C)-X or X-C*(=C)-X
32 C.2.(=C).ring 224 246 -0.2084 R-C*(=C)-R in a ring
33 C.2.(=C) 30 33 0.4840 R-C*(=C)-R
34 C.2.(=X).X 4446 6552 -0.8076 R-C*(=X)-X or X-C*(=X)-X
35 C.2.(=X).ring 413 539 -0.5304 R-C*(=X)-R in a ring
36 C.2.(=X) 384 405 -0.6093 R-C*(=X)-R
sp carbon
37 C.1.== 30 31 -0.5879 A=C*=A
38 C.1 40 88 0.1945 A-C*≡C*-A
sp3 nitrogen
39 N.am.2h 750 806 -0.6414 A-N*H2 in an amide group
40 N.3.2h.pi 645 833 -0.3637 A-N*H2 connected to a conjugated moiety
41 N.3.2h 437 444 -0.7445 A-N*H2
42 N.am.h 2266 2873 -0.3333 A2-N*H in an amide group
43 N.3.h.pi 425 474 0.2172 A2-N*H connected to a conjugated moiety
44 N.3.h 228 235 -0.2610 A2-N*H
45 N.am 1106 1237 -0.1551 A3N* in an amide group
46 N.3.pi 435 539 0.3776 A3N* connected to a conjugated moiety
47 N.3 402 420 0.1799 A3N*
aromatic nitrogen
48 N.ar.X2 106 139 -0.2167 X≈N*≈X in a 6-member ring
49 N.ar.X 551 774 -0.2974 R≈N*≈X in a 6-member ring
50 N.ar 1476 2205 0.0888 R≈N*≈R in a 6-member ring
51 N.ar.h.X 68 69 0.3675 A≈N*(-H) ≈X in a 5-member ring
52 N.ar.h 253 256 0.2364 R≈N*(-H) ≈R in a 5-member ring
53 N.ar.X2.(2) 94 94 1.1022 X≈N*≈X in a 5-member ring
54 N.ar.X.(2) 228 229 0.4854 R≈N*≈X in a 5-member ring
55 N.ar.(2) 392 395 0.3181 R≈N*≈R in a 5-member ring
sp2 nitrogen
56 N.2.h 49 54 0.6927 H-N*=A
57 N.2.(=C).ring 446 492 0.7974 A-N*(=C) in a ring
58 N.2.(=C) 329 335 0.9794 A-N*(=C)
59 N.2.(=X) 268 404 0.2698 A-N*(=X)
sp3 oxygen
60 O.3.h.pi 1258 1501 -0.0381 A-O*H connected to a conjugated moiety
61 O.3.h 986 1531 -0.4802 A-O*H
62 O.ar 290 296 0.5238 Oxygen atom in an aromatic ring
63 O.3.pi 2535 3566 0.2701 A-O*-A connected to a conjugated moiety
64 O.3 596 847 0.0059 A-O*-A
sp2 oxygen
65 O.2.(=C) 4555 6809 0.7148 -C=O*
66 O.2.(=X) 1656 3415 -0.5411 -X=O*
sp3 sulfur
67 S.3.h 14 14 0.4927 A-S*H
68 S.ar 187 188 1.1715 Sulfur atom in an aromatic ring
69 S.3.X 106 119 0.4125 R-S*-X or X-S*-X
70 S.3 469 523 0.8300 A-S*-A
sp2 sulfur
71 S.2.(=C) 138 142 1.3544 -C=S*
72 S.2.(=X) 64 66 1.2218 -X=S*
Sulfoxide
73 S.o 39 39 0.0525 A-S*(=O)-A
Sulfone
74 S.o2 546 620 0.5729 A-S*(=O)2-A
Phosphorus
75 P.3 188 190 -0.6694 A3-P*(=A)
Fluorine
76 F.pi 266 321 0.4401 A-F* connected to a conjugated moiety
77 F 421 1101 0.5360 A-F*
Chlorine
78 Cl.pi 1127 2096 0.9610 A-Cl* connected to a conjugated moiety
79 Cl 262 595 0.8036 A-Cl*
Bromine
80 Br.pi 254 308 1.0295 A-Br* connected to a conjugated moiety
81 Br 54 74 0.9664 A-Br*
Iodine
82 I.pi 101 128 0.7801 A-I* connected to a conjugated moiety
83 I 14 17 0.9071 A-I*
Terminal groups
84 -C#N 333 788 0.0337 cyano group
85 -N=N=N 23 48 0.5339 diazo group
86 -NO2 821 904 1.2442 nitro group
87 >[N+]-O- 56 71 -1.2147 nitro oxide group
Correction factors
88 AA 214 215 -2.4431 Amino acid in zwitterionic form
89 HB 581 609 0.6123 Internal hydrogen bond
a Total number of compounds in the training set containing this atom type.
b Total occurrence of this atom type in the training set.
c Regression coefficient of this atom type.
d The following symbols are used here: (-) single bond, (=) double bond, (#) triple bond, (≈) aromatic bond,
(R) group linked through a carbon atom, (X) group linked through a hetero-atom, (A) group linked
through any atom. The asterisk indicates the relevant atom.
e This atom should be separated from any unsaturated carbon atom or hetero-atom by at least three single
bonds.
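
As an illustration of how the regression coefficients in Table S1 can be combined, the following Python sketch sums contribution times occurrence over the atom types found in a molecule. It assumes a purely additive form with no intercept and no reference compound, which may differ from the full XLOGP3 procedure. The function name `additive_logp` is hypothetical, and the atom-type counts for n-hexane were assigned by hand under the assumption that all six carbon atoms qualify as "lipo" types according to footnote e.

```python
# Regression coefficients copied from Table S1 (a small subset only).
CONTRIB = {
    "C.3.3h.lipo": 0.7896,   # type 1, lipophilic CH3
    "C.3.2h.lipo": 0.5201,   # type 6, lipophilic CH2
    "C.3.3h": 0.5240,        # type 5
    "C.3.2h": 0.3436,        # type 10
    "O.3.h": -0.4802,        # type 61
    "AA": -2.4431,           # type 88, amino acid correction
    "HB": 0.6123,            # type 89, internal hydrogen bond correction
}


def additive_logp(counts):
    """Sum coefficient * occurrence over all atom types and corrections.

    Assumption: no intercept and no reference-compound term.
    """
    return sum(CONTRIB[symbol] * n for symbol, n in counts.items())


# Hand-assigned (assumed) typing of n-hexane: two CH3 and four CH2 carbons.
hexane = {"C.3.3h.lipo": 2, "C.3.2h.lipo": 4}
print(round(additive_logp(hexane), 4))
```

The correction factors (types 88 and 89) enter the sum in the same way as atom types, with the count being the number of times the correction applies.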

Correction Factors. Two correction factors are defined in XLOGP3. The first
correction factor accounts for internal hydrogen bonding, which makes a given molecule less
hydrophilic than its chemical structure alone would suggest. In our study, an internal
hydrogen bond is considered if it meets the following requirements:
(1) The donor must be an sp3-hybridized oxygen atom or a nitrogen atom with at least one
hydrogen atom, while the acceptor must be an sp2-hybridized oxygen atom or an sp3 oxygen
atom in a hydroxyl group.
(2) The donor atom and the acceptor atom are separated by four consecutive covalent
bonds. In other words, the formation of such a hydrogen bond should result in a
six-membered ring in the given chemical structure (Figure S1).
(3) This hydrogen bond must be immobilized on either rings or conjugated unsaturated
systems so that it has a rigid structure (Figure S1).
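
The three requirements above can be sketched as a purely topological search. The Python code below is a minimal illustration, not the XLOGP3 implementation: the `Atom` class, the `rigid` flag, and the reading of requirement (3) as "all three atoms between donor and acceptor lie in a ring or conjugated system" are assumptions. Molecules are given as hand-built heavy-atom graphs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    element: str
    hyb: str             # "sp3", "sp2", or "ar"
    n_h: int = 0         # attached hydrogen atoms
    rigid: bool = False  # assumed flag: atom is in a ring or conjugated system


def is_donor(a):
    # Requirement (1): sp3 O or N, carrying at least one hydrogen.
    return a.n_h >= 1 and ((a.element == "O" and a.hyb == "sp3") or a.element == "N")


def is_acceptor(a):
    # Requirement (1): sp2 O, or sp3 O in a hydroxyl group.
    return a.element == "O" and (a.hyb == "sp2" or (a.hyb == "sp3" and a.n_h >= 1))


def simple_paths(adj, start, n_bonds):
    """Yield every simple path that starts at `start` and spans n_bonds bonds."""
    stack = [(start,)]
    while stack:
        path = stack.pop()
        if len(path) == n_bonds + 1:
            yield path
            continue
        for nb in adj[path[-1]]:
            if nb not in path:
                stack.append(path + (nb,))


def internal_hbonds(atoms, adj):
    """Return the set of (donor, acceptor) index pairs passing all three tests."""
    found = set()
    for i, a in atoms.items():
        if not is_donor(a):
            continue
        for path in simple_paths(adj, i, 4):          # requirement (2)
            j = path[-1]
            inner_rigid = all(atoms[k].rigid for k in path[1:-1])  # requirement (3)
            if is_acceptor(atoms[j]) and inner_rigid:
                found.add((i, j))
    return found


# Salicylic acid, heavy atoms only: ring C1-C6, carboxyl C7/O8/O9, phenol O10.
sal_atoms = {i: Atom("C", "ar", rigid=True) for i in range(1, 7)}
sal_atoms.update({7: Atom("C", "sp2", rigid=True), 8: Atom("O", "sp2"),
                  9: Atom("O", "sp3", n_h=1), 10: Atom("O", "sp3", n_h=1)})
sal_adj = {1: [2, 6, 7], 2: [1, 3, 10], 3: [2, 4], 4: [3, 5], 5: [4, 6],
           6: [5, 1], 7: [1, 8, 9], 8: [7], 9: [7], 10: [2]}

# 4-Hydroxybutan-2-one: flexible chain, so no linking atom is flagged rigid.
flex_atoms = {1: Atom("O", "sp3", n_h=1), 2: Atom("C", "sp3"), 3: Atom("C", "sp3"),
              4: Atom("C", "sp2"), 5: Atom("O", "sp2"), 6: Atom("C", "sp3")}
flex_adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5, 6], 5: [4], 6: [4]}
```

In this sketch the phenol oxygen and the carbonyl oxygen of salicylic acid form a qualifying pair, whereas the flexible hydroxy ketone yields none. How many of the qualifying pairs in one molecule are actually counted toward the HB correction is not specified here and is left open.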
The last requirement needs further explanation. If a given molecule is conformationally
flexible, some internal hydrogen bonds may form and dissociate dynamically.
Extensive conformational sampling would be necessary to detect such hydrogen bonds,
which is impractical for a fast algorithm like XLOGP3 that relies solely on
topological structures as inputs. In addition, such hydrogen bonds may also be energetically
less stable and consequently contribute less to logP. As a simplification, such hydrogen bonds
are neglected in XLOGP3.

Figure S1. Examples of internal hydrogen bonds considered by XLOGP3

The second correction factor is applied to organic compounds containing an amino acid
moiety. Such a compound exists primarily in zwitterionic form rather than neutral form at
neutral pH. Consequently, the octanol-water partition coefficient measured in such a scenario
is logD rather than logP by definition, and the former is significantly lower than the latter
(by more than two units according to our results). A correction factor is thus necessary to
compensate for this. In XLOGP3, a given molecule is identified as an amino acid if a
carboxyl group and an sp3-hybridized aliphatic nitrogen atom exist simultaneously. The
nitrogen atom, however, must not be connected directly to any electron-withdrawing group or
conjugated moiety, so that it remains a strong Lewis base.
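
The detection rule described above can be sketched in Python as follows. This is an illustrative reading, not the XLOGP3 code: the `Atom` class, the `ewg` flag, and the simplification that any sp2 or aromatic neighbor counts as a conjugated moiety are assumptions, and molecules are supplied as hand-built heavy-atom graphs in their neutral form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    element: str
    hyb: str            # "sp3", "sp2", or "ar"
    n_h: int = 0        # attached hydrogen atoms
    ewg: bool = False   # assumed flag: atom belongs to an electron-withdrawing group


def has_carboxyl(atoms, adj):
    """True if some sp2 carbon carries both a carbonyl O and a hydroxyl O."""
    for i, a in atoms.items():
        if a.element != "C" or a.hyb != "sp2":
            continue
        nbrs = [atoms[j] for j in adj[i]]
        carbonyl = any(n.element == "O" and n.hyb == "sp2" for n in nbrs)
        hydroxyl = any(n.element == "O" and n.hyb == "sp3" and n.n_h == 1 for n in nbrs)
        if carbonyl and hydroxyl:
            return True
    return False


def has_basic_amine(atoms, adj):
    """True if some sp3 N has only sp3, non-electron-withdrawing neighbors."""
    for i, a in atoms.items():
        if a.element != "N" or a.hyb != "sp3":
            continue
        if all(atoms[j].hyb == "sp3" and not atoms[j].ewg for j in adj[i]):
            return True
    return False


def is_amino_acid(atoms, adj):
    # Both moieties must be present in the same molecule.
    return has_carboxyl(atoms, adj) and has_basic_amine(atoms, adj)


# Glycine: N1-C2-C3(=O4)-O5H
gly_atoms = {1: Atom("N", "sp3", n_h=2), 2: Atom("C", "sp3", n_h=2),
             3: Atom("C", "sp2"), 4: Atom("O", "sp2"), 5: Atom("O", "sp3", n_h=1)}
gly_adj = {1: [2], 2: [1, 3], 3: [2, 4, 5], 4: [3], 5: [3]}

# N-Acetylglycine: the nitrogen is bonded to an acetyl carbonyl carbon (C6).
acgly_atoms = dict(gly_atoms)
acgly_atoms.update({1: Atom("N", "sp3", n_h=1), 6: Atom("C", "sp2"),
                    7: Atom("O", "sp2"), 8: Atom("C", "sp3", n_h=3)})
acgly_adj = {1: [2, 6], 2: [1, 3], 3: [2, 4, 5], 4: [3], 5: [3],
             6: [1, 7, 8], 7: [6], 8: [6]}

print(is_amino_acid(gly_atoms, gly_adj), is_amino_acid(acgly_atoms, acgly_adj))
```

Glycine passes both tests, while N-acetylglycine fails the nitrogen test because its nitrogen is attached to a carbonyl carbon. In Table S1 such a nitrogen would fall under the amide types (N.am) rather than the sp3 amine types.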
Use of this correction factor is of course a very crude treatment of ionizable compounds,
and it is only applicable to amino acids. In the future, we may consider a more robust method
for predicting pKa values to compute logP and logD values for a wider range of organic
compounds.
