GRAPH THEORETICAL APPROACHES TO CHEMICAL REACTIVITY

Understanding Chemical Reactivity


Volume 9

Series Editor

Paul G. Mezey, University of Saskatchewan, Saskatoon, Canada

Editorial Advisory Board

R. Stephen Berry, University of Chicago, IL, USA


John I. Brauman, Stanford University, CA, USA
A. Welford Castleman, Jr., Pennsylvania State University, PA, USA
Enrico Clementi, IBM Corporation, Kingston, NY, USA
Stephen R. Langhoff, NASA Ames Research Center, Moffet Field, CA, USA
K. Morokuma, Institute for Molecular Science, Okazaki, Japan
Peter J. Rossky, University of Texas at Austin, TX, USA
Zdenek Slanina, Czechoslovak Academy of Sciences, Prague, Czechoslovakia
Donald G. Truhlar, University of Minnesota, Minneapolis, MN, USA
Ivar Ugi, Technische Universität, München, Germany

The titles published in this series are listed at the end of this volume.
Graph Theoretical Approaches
to
Chemical Reactivity
edited by

Danail Bonchev
and
Ovanes Mekenyan
Higher Institute of Chemical Technology,
Burgas, Bulgaria

SPRINGER SCIENCE+BUSINESS MEDIA, B.V.


Library of Congress Cataloging-in-Publication Data

Graph theoretical approaches to chemical reactivity / edited by Danail
Bonchev and Ovanes Mekenyan.
p. cm. -- (Understanding chemical reactivity ; v. 9)
Includes bibliographical references and index.
ISBN 978-94-010-4526-1 ISBN 978-94-011-1202-4 (eBook)
DOI 10.1007/978-94-011-1202-4
1. Reactivity (Chemistry) 2. Graph theory. I. Bonchev, Danail.
II. Mekenyan, Ovanes. III. Series.
QD505.5.G73 1994
541.3'94--dc20 94-14280

ISBN 978-94-010-4526-1

Printed on acid-free paper

All Rights Reserved


© 1994 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1994
Softcover reprint of the hardcover 1st edition 1994
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
TABLE OF CONTENTS

1. INTRODUCTION TO GRAPH THEORY
Haruo Hosoya
1. Chemical Graph Theory
2. Representation and Characterization of a Graph
3. Realization of a Graph
4. Operations on Graphs
5. References

2. THE INTERPLAY BETWEEN GRAPH THEORY AND MOLECULAR
ORBITAL THEORY
Nenad Trinajstić, Zlatko Mihalić, and Ante Graovac
1. Introduction
2. Fundamentals of Graph Theory
3. Isomorphism of Graph Spectral Theory and Hückel Molecular
Orbital Theory
4. Hückel Spectrum
5. Topological Effect on Molecular Orbitals
6. The HOMO-LUMO Separation
7. Topological Charge Stabilization
8. Localization Energy
9. Concluding Remarks
10. References

3. TOPOLOGICAL CONTROL OF MOLECULAR ORBITAL THEORY:
A COMPARISON OF μ2-SCALED HÜCKEL THEORY AND RESTRICTED
HARTREE-FOCK THEORY FOR BORANES AND CARBORANES
Roger Rousseau and Stephen Lee
1. Introduction
2. Calculational Method
3. The Method of Moments
4. Elemental Boron
5. BaHt Clusters
6. The α-Parameter of B10H10^2-
7. Reaction Pathways
8. Conclusion
9. References

4. POLYHEDRAL DYNAMICS
Robert B. King
1. Introduction
2. The Topology of Polyhedra
3. Polyhedral Isomerizations
4. Microscopic Models: Diamond-Square-Diamond Processes and Gale
Diagrams
5. Macroscopic Models: Topological Representations
6. Literature References

5. REACTION GRAPHS
Alexandru T. Balaban
1. Introduction
2. Reaction Graphs of Rearrangements Via Carbocations
3. Automerization of Bullvalene, Other Valence Isomers of Annulenes,
and Azabullvalene
4. Rotation in Molecular Propellers
5. Reaction Graphs for Rearrangements of Metallic Complexes
6. Xenon Hexafluoride
7. Heptaphosphide Trianion
8. Kinetic Graphs, Synthon Graphs, and Graph Transforms
9. Conclusions
10. References

6. DISCRETE REPRESENTATIONS OF THREE-DIMENSIONAL
MOLECULAR BODIES AND THEIR SHAPE CHANGES IN
CHEMICAL REACTIONS
Paul G. Mezey
1. Introduction and Review of Basic Topological Concepts of Molecular
Shape Representation
2. Molecular Shape Representation by Nuclear Potential Contours
(NUPCO's)
3. Topological Patterns of NUPCO Sequences
4. Shape Changes of NUPCO Sequences Along Reaction Paths and in
Conformational Domains
5. Shape Changes of NUPCO's in Conformational Changes and in
Molecular Deformations; NUPCO Shape Invariance Domains of the
Configuration Space
6. Local Shape Invariance of NUPCO's and the Transfer of Functional
Groups in Chemical Reactions
7. Summary
8. References

7. THE INVARIANCE OF MOLECULAR TOPOLOGY IN CHEMICAL
REACTIONS
Eugeny V. Babaev
1. Introduction
2. From a Lewis Diagram to the Pseudo-Graph and Graphoid
3. From Graph (Graphoid) to Surface
4. What Is the Topological Homeomorphism from the Chemical Point
of View?
5. The Invariance of the Euler Characteristic in Chemical Reactions
6. The Main Theorem
7. Conclusion
8. References and Notes

8. TOPOLOGICAL INDICES AND CHEMICAL REACTIVITY
Ovanes Mekenyan and Subhash C. Basak
1. Introduction
2. Basic Principles Underlying the Topological Nature of Chemical
Reactivity
3. Molecular Topology and Topological Invariants
4. Applications of Topological Indices to Chemical Reactivity
5. Conclusions
6. References

9. GRAPH-THEORETICAL MODELS OF COMPLEX REACTION
MECHANISMS AND THEIR ELEMENTARY STEPS
Oleg N. Temkin, Andrey V. Zeigarnik, and Danail Bonchev
1. Introduction
2. Graph-Theoretical Approach to Studies in the Elementary Steps
of Complex Reactions
3. Classification and Coding of Linear Reaction Mechanisms By Using
Kinetic Graphs
4. Application of Bipartite Graphs and Stoichiometric Matrices to the
Description of Linear and Nonlinear Reaction Mechanisms
5. Topological Aspects of Complex Reaction Mechanisms
6. References

INDEX


PREFACE

The progress in computer technology during the last 10-15 years has enabled the
performance of ever more precise quantum mechanical calculations related to structure
and interactions of chemical compounds. However, the qualitative models relating
electronic structure to molecular geometry have not progressed at the same pace. There
is a continuing need in chemistry for simple concepts and qualitatively clear pictures that
are also quantitatively comparable to ab initio quantum chemical calculations.
Topological methods and, more specifically, graph theory as a fixed-point topology,
provide in principle a chance to fill this gap.
With its more than 100 years of applications to chemistry, graph theory has proven to
be of vital importance as the most natural language of chemistry. The explosive
development of chemical graph theory during the last 20 years has increasingly
overlapped with quantum chemistry. Besides contributing to the solution of various
problems in theoretical chemistry, this development indicates that topology is an
underlying principle that explains the success of quantum mechanics and goes beyond it,
thus promising to bear more fruit in the future.
As a part of the series "Understanding Chemical Reactivity", this volume is designed
to introduce the reader to the graph-theoretical and, more generally, topological
elucidation of chemical reactivity. The nine chapters of the volume are written by 15
authors from seven countries who have contributed largely to the development of this
area of science. This emphasizes the importance and complexity of chemical reactivity
studies whose elucidation requires the broad cooperation of scientists from all over the
world, as well as from various branches of chemistry. The chapters are well illustrated
and provide an extensive reference to the problems discussed, in line with the scope of
not teaching but intriguing and guiding the reader.
The introductory chapter on graph theory by H. Hosoya is not just a collection of
terms, definitions, and formulae. The basic notions and concepts of graph theory are
specifically conveyed in a way that benefits from the numerous personal contributions of
the author in this area. Emphasis is put on the matrix and polynomial representations,
symmetry, isomorphism, and operations on graphs. Indeed, a single chapter could not
cover all aspects of the very rich graph-theoretical formalism, and the reader may find
additional information on the subject in the introductory sections of the other chapters.
It is traditional to connect chemical reactivity to graph theory via Hückel molecular
orbital (HMO) theory, which provided the first reactivity indices (atomic charges, bond
orders, free valences, localization energy, superdelocalizability indices, frontier orbital
indices). In Chapter 2, Trinajstić, Mihalić, and Graovac go beyond reviewing the
isomorphism between graph-spectral theory and HMO theory, and beyond the
discussion of the structure of the Hückel eigenvalue spectrum. They present the modern
view of the interplay between graph theory and molecular orbital theory by reviewing
the achievements of recent years. Several major topics are included. The TEMO principle
(developed by the late Oskar Polansky, one of the pioneers of chemical graph theory)
allows, among other things, reactivity predictions for a special class of topological isomers
(topomers). Gimarc's rule of topological charge stabilization is a reliable guide in
predicting the relative stabilities of various heteroatomic isomers. The graph-theoretical
assessments of the HOMO-LUMO separation and absolute hardness of alternant
hydrocarbons pave the way for future achievements in this area.
In Chapter 3, Rousseau and Lee present a form of Hückel theory, termed second
moment scaling, which had been developed earlier by Burdett and Lee and has proven
successful in both rationalizing and optimizing the structure of molecules and solids.
Rousseau and Lee go beyond the previous work reviewed in this area and apply the
method to obtain the actual shape of the electronic energy surface as a function of
geometry for two important classes of compounds (boranes and carboranes). The
minimum energy geometries, electron density contour maps, and reaction paths thus
calculated are shown to be in reasonable accord with the ab initio method. The
topological method used may be regarded as a third-generation Hückel method,
applicable to covalent and metallic (but not ionic) compounds that are formed of main
group atoms and transition metals.
Chapter 4 by R. B. King summarizes topological and graph-theoretical aspects of
isomerization reactions of polyhedral molecules (both coordination and cluster
polyhedra). The microscopic approach is discussed, in which the details of polyhedral
topology are used to help elucidate which types of single isomerization steps are possible.
The reader may gain experience in using specific techniques and processes, such as the
Schlegel diagrams, the Gale diagrams, and the dsd-processes, as well as learn about
exciting developments in this area in which the author is one of the major contributors.
The earlier macroscopic approach, which uses the so-called topological representations
(reaction graphs) to show the relationships between different permutational isomers, is
also reviewed.
The macroscopic approach just mentioned is further detailed by Balaban in Chapter 5,
a chapter devoted to reaction graphs in both organic and inorganic chemistry. The author
was the first chemist to apply graph theory to isomerization processes (interconversions
of carbenium ions), doing so as early as 1966. The reader will find in this chapter the
complete and intriguing story of the rearrangements via carbocations with a particular
emphasis on those leading to diamond hydrocarbons and their derivatives. Reaction
graphs dealing with inorganic compounds describe different classes of rearrangements of
complexes (mainly metallic ones) with various geometries. The chapter is rich in
illustrative examples demonstrating that a chemist applying this graph-theoretical
technique may gain a closer insight into rearrangement mechanisms and be enabled to
indicate likely intermediates.
Chapter 6 by Mezey initially offers a summary of the previously developed topological
methodology for treating three-dimensional molecular shapes and their changes. The
mathematical formalism developed is formulated in terms of contour surfaces of
electronic charge densities or, alternatively, of molecular electrostatic potential contours.
Chemical reactivity is thus regarded as the strongest change in molecular shape within
a series of similarly treated but less pronounced changes like conformational and
vibrational-rotational ones, as well as electronic excitations. The second part of the
chapter presents a newly developed method for representing molecular shapes by the
topological patterns of contour surfaces of three-dimensional nuclear potentials.
Computationally simple, the new technique is extended for the modeling of shape
changes (reaction paths) in chemical processes.
A quite different topological approach to chemical reactions is advanced in Chapter 7
by Babaev. Proceeding from the classical picture of molecules with localized bonds,
described by multigraphs with loops and then by graphoids, Babaev introduces the
concept of two-dimensional manifolds of surfaces. This novel concept characterizes
chemical species in a highly generalized manner by several topological invariants. The
well-known empirical types of chemical similarity (e.g., isoelectronic, isostructural,
homological) are thus shown to result from topological homeomorphisms; they all
conserve the Euler characteristic of the respective surfaces. Moreover, the author proves
the invariance of this topological characteristic in any chemical reaction involving
molecules with localized bonds, a result that might be termed a principle of conservation
of molecular topology in chemical reactions. The reader may thus gain an exciting and
unusually general view on reactivity in chemistry as a whole.
Chapter 8 by Mekenyan and Basak begins by reviewing some of the basic principles
underlying the topological nature of chemical reactivity. Topological indices, one of the
powerful tools of graph theory, are then introduced on this basis. The most common
indices are classified and formulated in several large groups which also distinguish
between global molecular, fragment, and atomic indices. Some of the first electronic
indices of reactivity, derived within the HMO theory, are also mentioned, owing to the
topological origin of the Hückel matrix. Examples are presented of successful modeling
of various reactivity effects in different branches of chemistry including environmental
chemistry, toxicology, and drug-receptor interaction, along with some topology-based
reactivity rules and relationships.
Chapter 9 by Temkin, Zeigarnik, and Bonchev centers on progress in the mechanistic
and kinetic studies of complex reactions as a source of information on the reactivity of
intermediates and their elementary steps. The graph-theoretical concept of the topological
identifier, which produces two general principles (simple bond-change topology and
bond-change compensation topology), is first developed for identifying, classifying, and
enumerating elementary reactions. Then, the formalism of kinetic graphs and bipartite
graphs is applied to the classification, coding, and enumeration of linear and nonlinear
reaction mechanisms. The authors also introduce and discuss the concept of mechanistic
topological structure (the reactant interrelations, the number and kind of reaction routes,
and their mutual connectedness), an aspect of complex chemical reactions that was
largely neglected in the past. A topological characterization of the four major classes of
complex reactions (noncatalytic, noncatalytic conjugated, chain, and catalytic reactions)
is presented on this basis.
In conclusion, we would like to thank our authors and express the hope that the
material presented will find resonance with our readers and prompt their own
contributions to the field. If this proves to be the case, then the aim of this volume will
be fulfilled.

Danail Bonchev and


Ovanes Mekenyan
INTRODUCTION TO GRAPH THEORY

Haruo HOSOYA
Ochanomizu University
Department of Information Sciences
Bunkyo-ku, Tokyo 112, Japan
1. Chemical graph theory

1.1. PURPOSE AND PREMISES

This book invites you into the world of applications of graph theory to
chemistry, in particular to the problem of how the topology of a
molecule determines its reactivity in a specific reaction and how
graph theory helps you understand these relationships. This introduction to
graph theory was written so that a chemist, or chemist-to-be, can feel at
ease when thinking about applying graph theory to his or her own
problem.
Many books and monographs have already been published on the
application of graph theory to chemistry from various standpoints
(Balaban, 1976; Graovac et al., 1977; Trinajstić, 1983, 1992;
Balasubramanian, 1985; Tang et al., 1986; Rouvray, 1990; Bonchev and
Rouvray, 1991). Readers can consult them for those interesting
problems which could not be introduced in this short chapter.
As a particular way of guiding the reader to graph theory through the
gate of this chapter, you are assumed to have at least a minimal knowledge of the
Hückel molecular orbital (HMO) theory as applied to conjugated hydrocarbon
molecules (Hückel, 1931). That is, you are supposed to know that the
solutions, $\{\varepsilon_n = \alpha + x_n\beta\}$, of
$$\Delta(\varepsilon) = \det\{(\alpha-\varepsilon)E + \beta A\} = 0 \tag{1-1}$$
represent the energies of the orbitals that accommodate the electrons in
the molecule. Here E and A are, respectively, the unit and adjacency
matrices, which will be explained later, and $\alpha$ and $\beta$ are the parameters
called the Coulomb and resonance integrals, respectively. Further, you are
supposed to know that the set of eigenvectors obtained automatically
in the process of diagonalizing the secular matrix represents the
coefficients ($C_{nr}$) of the molecular orbitals (MO's) expressed as
linear combinations of atomic orbitals (LCAO).
In the ground state the lower half (in energy) of the MO's are doubly
occupied by electrons. By summing, over the occupied MO's, the products of
the coefficients at the two ends of an edge rs of the
corresponding chemical graph, one obtains the bond order ($p_{rs}$), which
is a measure of the contribution of the bond rs to the stabilization of the
molecule caused by the delocalization of π-electrons.
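The HMO premises above can be illustrated with a minimal sketch for the butadiene carbon skeleton (a path of four vertices). It assumes the standard closed-form eigenvalues and eigenvectors of a path graph rather than a numerical diagonalization, and the function names (`path_adjacency`, `path_hmo`, `bond_order`) are hypothetical, chosen only for this illustration.

```python
import math

def path_adjacency(n):
    """Adjacency matrix of the path graph with n vertices (linear polyene skeleton)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

def path_hmo(n):
    """Closed-form eigenvalues x_k and normalized eigenvectors of the path graph.

    Assumption: x_k = 2 cos(k*pi/(n+1)) and c_kr = sqrt(2/(n+1)) sin(k*r*pi/(n+1)),
    the standard result for a linear chain. Because beta is negative, the largest
    x_k corresponds to the lowest orbital energy alpha + x_k * beta.
    """
    xs, cs = [], []
    for k in range(1, n + 1):
        xs.append(2.0 * math.cos(k * math.pi / (n + 1)))
        cs.append([math.sqrt(2.0 / (n + 1)) * math.sin(k * r * math.pi / (n + 1))
                   for r in range(1, n + 1)])
    return xs, cs

def bond_order(cs, n_occupied, r, s):
    """Bond order p_rs: twice the sum over occupied MOs of c_r * c_s (0-based r, s)."""
    return 2.0 * sum(cs[k][r] * cs[k][s] for k in range(n_occupied))

n = 4  # butadiene: four conjugated carbon atoms
A = path_adjacency(n)
xs, cs = path_hmo(n)

# Each closed-form vector really is an eigenvector of A: A c = x c.
for x, c in zip(xs, cs):
    Ac = [sum(A[i][j] * c[j] for j in range(n)) for i in range(n)]
    assert all(abs(Ac[i] - x * c[i]) < 1e-12 for i in range(n))

p12 = bond_order(cs, n // 2, 0, 1)  # terminal bond
p23 = bond_order(cs, n // 2, 1, 2)  # central bond
print([round(x, 3) for x in xs])    # x_n in units of beta, measured from alpha
print(round(p12, 3), round(p23, 3))
```

The terminal bond order comes out larger than the central one, which is the usual HMO picture of butadiene.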
Although the HMO theory was proposed to study the π-electronic systems of
unsaturated hydrocarbon molecules, its formal application to other classes
of molecules, such as saturated hydrocarbons, has been shown to give
useful information for understanding the topological dependency of
various (not only electronic but also thermodynamic) properties of
molecules.

D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 1-36.
© 1994 Kluwer Academic Publishers.

1.2. GRAPH AND GRAPH THEORY
A graph is a mathematical object abstracted from a set of relationships
among various things or concepts. Graph theory is said to have been
initiated by the famous Swiss mathematician Euler in the eighteenth century,
when he solved the "Eulerian circuit" problem for the seven
bridges of Königsberg (Harary, 1969). Ever since, this application-oriented
branch of mathematics has grown into what it is today, not only through
the flashes of insight of ingenious mathematicians but also through the awkward but
painstaking endeavors of mathematical chemists and theoretical physicists.
Thus nowadays it is not merely a branch of mathematics, but rather an
important methodology, or way of thinking, for solving problems in
every area of science and culture.
Chemists have long been using the structural formula to represent the
topological structure of chemical substances, and also various types of flow
diagrams to discuss reaction networks (Sinanoğlu, 1975; Sinanoğlu and
Lee, 1978, 1979). Further, in all branches of science, a variety of diagrams
have been used for expressing the relations among concepts and ideas. The
essence of these inventions can be abstracted and formed into a
mathematical object called a graph, and many practical tactics and
proof techniques have been found, developed, and used independently but
in common in quite different areas of science. That is why
knowledge of graph theory is becoming essential for understanding a
wide variety of phenomena in all branches of chemistry. Illustrative
examples of graphs used in chemistry are shown in Fig. 1.
In graph theory a graph (G) is a set of vertices (points or atoms, V)
and edges (lines or bonds, E). In mathematical language this statement is
written as G=(V,E). Let the numbers of vertices and edges
be denoted, respectively, by |V| and |E|. Although there are a number of
variations in the definition of an edge, an edge should always be terminated
by two vertices $v_i$ and $v_j$, symbolically denoted as $e_{ij}=\{v_i,v_j\}$. If the two
terminals of an edge are identical, namely, if $e_{ii}=\{v_i,v_i\}$, that edge is
called a loop.
In some cases edges may be directed or weighted. More than one edge,
i.e., multiple edges, may also be drawn between a pair of vertices if
necessary, like the double and triple bonds in the structural formulas which
have long been used by chemists. If, however, one only wishes to express
the carbon atom skeleton of a molecule, the graphical difference
between cyclohexane and benzene disappears. Where confusion might occur,
one should clearly state what kinds of concepts are meant by
the vertices and by the edges of the graphs to be discussed. Before discussing the
variations in the definition of edges, let us take a close look at several
fundamental ways of representing and characterizing a graph.
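As a small illustration of the definition G=(V,E), the following sketch stores the edges as a multiset of unordered vertex pairs, so that loops and multiple edges are allowed. The helper names and the choice of one Kekulé structure for benzene are assumptions made only for this example.

```python
from collections import Counter

def make_graph(vertices, edges):
    """Return G = (V, E); E is a multiset of unordered vertex pairs."""
    V = set(vertices)
    E = Counter(frozenset(e) for e in edges)
    assert all(e <= V for e in E), "every edge must end on vertices of V"
    return V, E

def skeleton(E):
    """Forget multiplicities: keep one copy of every distinct edge."""
    return set(E)

def loops(E):
    """Edges whose two terminals are identical, e_ii = {v_i, v_i}."""
    return [e for e in E if len(e) == 1]

ring = [(i, (i + 1) % 6) for i in range(6)]

# Cyclohexane carbon skeleton: six single bonds.
V_chx, E_chx = make_graph(range(6), ring)

# Benzene drawn as one Kekule structure: every second ring bond is doubled.
V_bz, E_bz = make_graph(range(6), ring + [ring[0], ring[2], ring[4]])

print(len(V_bz), sum(E_bz.values()))        # |V| and |E| counted with multiplicity
print(skeleton(E_chx) == skeleton(E_bz))    # the carbon skeletons coincide
```

Once multiplicities are dropped, the two graphs are identical, which is the point made in the text about cyclohexane and benzene.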
[Figure 1: drawings with the panel labels "Structural Formula", "Carbon Network", "(Molecular) Graph", and "Reaction Flow Diagram".]

Fig. 1. Various examples of graphs used in chemistry.

2. Representation and characterization of a graph


2.1. MATRIX AND POLYNOMIAL EXPRESSIONS
2.1.1. Adjacency matrix
A graph G is a mathematical object and can be represented either by a
geometrical object or by an algebraic one, i.e., a matrix. Define an N×N adjacency
matrix, A, for a graph of order N=|V|, with elements such that
$A_{ij}=1$ for an adjacent pair of $v_i$ and $v_j$, and $A_{ij}=0$ otherwise. In the simplest
and most common case, only the adjacency relation among the
vertices is of concern for discussing and discriminating the topological
structure of graphs, and the adjacency matrix A becomes symmetric, i.e.,
$A_{ij}=A_{ji}$ for all pairs of vertices $v_i$ and $v_j$. The corresponding graph is
non-directed.
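The definition of the adjacency matrix can be sketched in a few lines. The function names are hypothetical, vertices are assumed to be numbered 0 to N-1, and each edge is counted once per occurrence in the edge list.

```python
def adjacency_matrix(n, edges, directed=False):
    """A[i][j] counts the edges from vertex i to vertex j (vertices are 0..n-1)."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] += 1
        if not directed and i != j:
            A[j][i] += 1  # a non-directed edge is seen from both ends
    return A

def is_symmetric(A):
    """True when A_ij == A_ji for all pairs, i.e. the graph is non-directed."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

triangle = [(0, 1), (1, 2), (2, 0)]
A_undirected = adjacency_matrix(3, triangle)
A_directed = adjacency_matrix(3, triangle, directed=True)

print(A_undirected, is_symmetric(A_undirected))
print(A_directed, is_symmetric(A_directed))
```

Reading the same edge list as a set of arrows gives a non-symmetric matrix, which is the situation the text turns to next.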
On the other hand, if for some reason $A_{ij}$ and $A_{ji}$ are intentionally
deemed to be different, the edge with $A_{ij}=1$ is replaced by a directed edge,
or an arrow, directed from $v_i$ to $v_j$ in the graphical representation. The
graph thus constructed is called a directed graph, or simply a digraph. One
can draw multiple (directed or non-directed) edges if necessary (see Fig. 2
for various examples). In this case one may assign either the multiplicity
number of edges (n) or its square root ($\sqrt{n}$) to the matrix element in the
adjacency matrix. The assigned weights may be real or complex. Figure 2d
gives an example of an edge-weighted directed graph (Hosoya and
Balasubramanian, 1989).

[Figure 2: drawings of five three-vertex graphs (a)-(e) with their adjacency matrices. The characteristic polynomials given are (a) $x^3-3x-2$, (b) $x^3-1$, (c) $x^3-x^2+x-2$, (d) $x^3-3x$, and (e) $x^3-6x-4$.]

Fig. 2. Various kinds of graphs, G, their adjacency matrices, A, and
the corresponding characteristic polynomials, $P_G(x)$ (Eq. (2-1)):
(a) non-directed, (b) directed, (c) directed with a loop, (d) edge-weighted
directed, and (e) non-directed with a multiple edge.
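One can take more liberties with the matrix and determinantal representations of a
graph, as proposed by Spialter, who put the atomic symbols and the
bond multiplicities, respectively, into the diagonal and off-diagonal
elements of a determinant, and then expanded it into a polynomial
in the atomic symbols for coding the topological structure of a molecule
(Spialter, 1964), as

$$\begin{vmatrix} \mathrm{H} & 1 & 0 & 0 & 0 \\ 1 & \mathrm{C} & 1 & 0 & 2 \\ 0 & 1 & \mathrm{O} & 1 & 0 \\ 0 & 0 & 1 & \mathrm{H} & 0 \\ 0 & 2 & 0 & 0 & \mathrm{O} \end{vmatrix} = \mathrm{CH^2O^2} - \mathrm{CHO} - \mathrm{HO^2} - 5\mathrm{H^2O} + \mathrm{O} + 4\mathrm{H}$$

and

$$\begin{vmatrix} \mathrm{H} & 1 \\ 1 & \mathrm{Cl} \end{vmatrix} = \mathrm{HCl} - 1.$$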

2.1.2. Distance matrix


If G is given, then A is uniquely determined, and vice versa. The distance
matrix D can be defined for G with elements $D_{ij}=d(i,j)$, the distance, or the
least number of steps, from $v_i$ to $v_j$. If a graph G is given, the matrix D
is determined uniquely, while by wiping out all the elements of D
except those equal to unity one recovers A. Thus one can assert that the two
matrices, A and D, of G are mathematically equivalent. For the path graph
with four vertices, for example,

$$G:\ \circ\!-\!\circ\!-\!\circ\!-\!\circ, \qquad A = \begin{pmatrix} 0&1&0&0 \\ 1&0&1&0 \\ 0&1&0&1 \\ 0&0&1&0 \end{pmatrix} \ \leftrightarrow\ D = \begin{pmatrix} 0&1&2&3 \\ 1&0&1&2 \\ 2&1&0&1 \\ 3&2&1&0 \end{pmatrix}.$$
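The equivalence of A and D can be checked with a short sketch. It assumes a connected, non-directed, unweighted graph and uses breadth-first search, which is one common way to obtain shortest-path lengths and not necessarily the procedure the author had in mind; the function names are hypothetical.

```python
from collections import deque

def distance_matrix(A):
    """All-pairs shortest-path lengths of a connected, unweighted graph by BFS."""
    n = len(A)
    D = []
    for source in range(n):
        dist = [None] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if A[u][v] and dist[v] is None:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        D.append(dist)
    return D

def adjacency_from_distance(D):
    """Keep only the entries equal to one; everything else becomes zero."""
    return [[1 if d == 1 else 0 for d in row] for row in D]

# Path graph with four vertices.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
D = distance_matrix(A)
print(D)
print(adjacency_from_distance(D) == A)
```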
2.1.3. Characteristic polynomial
Although an adjacency matrix unequivocally represents the topological
structure of a graph, the number of bits of information increases quadratically with
N. On the other hand, the number of chemical substances
stored in the accessible CAS (Chemical Abstracts Service)
database has recently exceeded ten million. One therefore needs some
characterization of a graph, that is, a mathematical abstraction from the
complicated representation of a given object. A naive way of
characterizing a given matrix is to take its determinant.
Define the characteristic polynomial $P_G(x)$ of G as
$$P_G(x) = (-1)^N \det(A - xE), \tag{2-1}$$
where E is the N×N unit matrix and x a scalar (Collatz and Sinogowitz,
1957). Although we know that $P_G(x)$ does not have a one-to-one
correspondence with G (Harary et al., 1971), it has been shown that it can
be used for a rough characterization of a graph.

Substituting $\varepsilon=\alpha+\beta x$ into Eq. (1-1) gives
$\Delta(\varepsilon)=\beta^N\det(A-xE)$, so the HMO energies obtained from the
secular equation $\Delta(\varepsilon)=0$ are nothing but the spectrum (the zeros) of $P_G(x)$ for
the graph corresponding to the carbon atom skeleton of a molecule
(Cvetković et al., 1980). From this coincidence one can expect that some
information from the HMO scheme might be helpful for analyzing and
understanding certain graph-theoretical aspects of the relevant
graph, and vice versa. This is the reason why in this chapter we have
entered the world of graph theory through the HMO gate.
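Equation (2-1) can be evaluated exactly for small graphs. The sketch below uses the Faddeev-LeVerrier recursion with rational arithmetic; this is a choice made for the illustration (the chapter does not prescribe an algorithm), and the function names are hypothetical. The test matrices are the non-directed triangle, the directed three-cycle, and the triangle with one double edge, for which Fig. 2 quotes $x^3-3x-2$, $x^3-1$, and $x^3-6x-4$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def characteristic_polynomial(A):
    """Coefficients [c_N, ..., c_0] of det(xE - A) = (-1)^N det(A - xE).

    Faddeev-LeVerrier recursion: M_k = A M_{k-1} + c_{N-k+1} E and
    c_{N-k} = -trace(A M_k) / k, starting from M_0 = 0 and c_N = 1.
    """
    n = len(A)
    A = [[Fraction(a) for a in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        coeffs.append(-sum(AMk[i][i] for i in range(n)) / k)
    return coeffs

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
directed_cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
double_edge = [[0, 1, 1], [1, 0, 2], [1, 2, 0]]

for name, A in [("triangle", triangle),
                ("directed cycle", directed_cycle),
                ("double edge", double_edge)]:
    print(name, [int(c) for c in characteristic_polynomial(A)])
```

The coefficient lists are ordered from the highest power of x downwards, so `[1, 0, -3, -2]` stands for $x^3-3x-2$.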
2.1.4. Distance polynomial
By using the distance matrix one can define the distance polynomial
(Hosoya et al., 1973; Graham and Lovász, 1978) as
$$S_G(x) = (-1)^N \det(D - xE). \tag{2-2}$$
A number of interesting relations are known between the coefficients
of the distance polynomial and the topological structure of a graph (Graham
et al., 1977; Hosoya, 1988). Although the number of digits of those
coefficients increases rapidly with N, it has been shown that the distance polynomial,
too, cannot uniquely determine the structure of a graph.

2.2. TOPOLOGICAL INDEX


The term topological index was proposed by the present author (Hosoya,
1971) for characterizing the topological nature of a graph. It is an integer
easily obtained from a graph by a specified recipe. Since then
more than one hundred different topological indices have been proposed for
chemical graphs (Rouvray, 1991). Among them let us introduce here only
the Wiener number (W) (Wiener, 1947) and the Z index (Hosoya, 1971), as
both are integers and have a clear topological interpretation.
Define the non-adjacent number p(G,k) for a given graph as the number of
ways of choosing k mutually disjoint edges. Then p(G,0) is defined to be
unity in all cases, p(G,1) is the number of edges, and the last entry
p(G,m) (m=max(k)) is, when m=N/2, the perfect matching number, or the number of
Kekulé structures. Define the Z-counting polynomial $Q_G(x)$ for G with the
set of p(G,k)'s in the following manner,
$$Q_G(x) = \sum_{k=0}^{m} p(G,k)\,x^k. \tag{2-3}$$
The Z index is finally defined as the total sum of the p(G,k) numbers for G,
as
$$Z_G = Q_G(1) = \sum_{k=0}^{m} p(G,k). \tag{2-4}$$

The idea, group-theoretical foundation, and enumeration technique of the
counting polynomial were first introduced into discrete
mathematics by Pólya, when he solved the problem of enumerating the isomers of
the hydrocarbons and related substances (Pólya, 1936). This
monumental work opened common ground for cooperative study
by chemists and mathematicians.
The Wiener number is half the sum of the off-diagonal elements of the
distance matrix D of G (Hosoya, 1971), or
$$W = \sum_{i<j} d(i,j), \tag{2-5}$$
where d(i,j) is the shortest distance between the two vertices $v_i$ and $v_j$.
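Equation (2-5) translates directly into code. The sketch below assumes a connected, non-directed graph given as an edge list and obtains the distances by breadth-first search; `wiener_number` is a hypothetical name.

```python
from collections import deque

def wiener_number(n, edges):
    """Half the sum of all shortest-path lengths d(i, j) between vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair (i, j) was counted twice

path4 = [(0, 1), (1, 2), (2, 3)]
hexagon = [(i, (i + 1) % 6) for i in range(6)]
print(wiener_number(4, path4), wiener_number(6, hexagon))
```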

Besides these topological indices, the connectivity index proposed by
Randić is widely used (Randić, 1975). For other indices consult the
standard textbooks and review articles (Trinajstić, 1983, 1992; Balaban et
al., 1983).

By using the set of p(G,k) numbers the following counting polynomial has
been defined,
$$\alpha_G(x) = \sum_{k=0}^{m} (-1)^k p(G,k)\,x^{N-2k}, \tag{2-6}$$
which is essentially the same as the Z-counting polynomial. Although it has
been called either the acyclic polynomial (Gutman et al., 1977) or the reference
polynomial (Aihara, 1976), it is now widely called the matching
polynomial (Farrell, 1979). Interesting mathematical properties of the
matching polynomial, including its relation with the characteristic
polynomial and with the stability of the molecular graph, have been discovered and
discussed for various series of graphs (Trinajstić, 1992).

2.3. VARIOUS SERIES OF GRAPHS

Now we are ready for realizing and comparing the graph-theoretical aspects
of various series of graphs.

2.3.1. Path graph and tree graph


A path graph $S_N$ is a graph composed of N successively joined vertices, the
lower members of which are given in Table 1 together with their p(G,k)
numbers, Z index, and Wiener number. The Z indices of the series $S_N$
form the Fibonacci numbers, $\{F_N\}$, defined by
$$F_N = F_{N-1} + F_{N-2}\quad (N \ge 2), \qquad F_0 = F_1 = 1. \tag{2-7}$$
A tree graph T is a graph without any cycle, which is composed of a group
of path graphs. The characteristic polynomial of a tree graph can be
expressed in terms of the non-adjacent numbers as
$$P_G(x) = \sum_{k=0}^{[N/2]} (-1)^k p(G,k)\,x^{N-2k}. \tag{2-8}$$

In particular, for the path graph $S_N$ with N vertices, the characteristic
polynomial can be expressed in closed form (Hosoya, 1971) as
$$P_{S_N}(x) = \sum_{k=0}^{[N/2]} (-1)^k \binom{N-k}{k} x^{N-2k}. \tag{2-9}$$

By using this relation one can straightforwardly obtain the characteristic
polynomials of the lower members of the path graphs given in Table 1.
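The closed form (2-9) can be cross-checked against an independent route. The sketch below compares it with the three-term recursion $P_{S_N}(x) = x\,P_{S_{N-1}}(x) - P_{S_{N-2}}(x)$, a standard consequence of expanding the tridiagonal determinant that is not stated in the text and is brought in here as an assumption; the function names are hypothetical.

```python
from math import comb

def path_charpoly_closed_form(n):
    """Coefficients of P_{S_N}(x), highest power first, from Eq. (2-9)."""
    coeffs = [0] * (n + 1)
    for k in range(n // 2 + 1):
        coeffs[2 * k] = (-1) ** k * comb(n - k, k)  # coefficient of x^(N-2k)
    return coeffs

def path_charpoly_recursive(n):
    """Same polynomial from P_N = x P_{N-1} - P_{N-2}, with P_0 = 1 and P_1 = x."""
    prev, cur = [1], [1, 0]
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = cur + [0]      # x * P_{N-1}
        padded = [0, 0] + prev   # P_{N-2}, aligned to the same length
        prev, cur = cur, [a - b for a, b in zip(shifted, padded)]
    return cur

for n in range(8):
    assert path_charpoly_closed_form(n) == path_charpoly_recursive(n)

print(path_charpoly_closed_form(4))  # coefficients of x^4, x^3, ..., x^0
# The absolute values of the coefficients are the p(G,k); their sum is the Z index.
print(sum(abs(c) for c in path_charpoly_closed_form(7)))
```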

2.3.2. Cycle graphs

A cycle graph $C_N$ is a graph composed of N circularly joined vertices, the
lower members of which are given in Table 2 together with their p(G,k)
numbers, Z index, and Wiener number (Hosoya, 1971). The Z indices of
the series $C_N$ form the Lucas numbers, $\{L_N\}$, defined by
$$L_N = L_{N-1} + L_{N-2}\quad (N \ge 2), \qquad L_0 = 2,\ L_1 = 1. \tag{2-10}$$
Although the series $L_N$ is defined from N=0, the series of cycle graphs begins
from N=1.
Table 1. Lower members of path graphs and their characteristic
quantities.

                            p(G,k)
 N   S_N              k=0   1    2    3    Z_G    W
 0   φ a)              1                    1
 1   o                 1                    1     0
 2   o-o               1    1               2     1
 3   o-o-o             1    2               3     4
 4   o-o-o-o           1    3    1          5    10
 5   o-o-o-o-o         1    4    3          8    20
 6   o-o-o-o-o-o       1    5    6    1    13    35
 7   o-o-o-o-o-o-o     1    6   10    4    21    56

a) Vacant graph.

Table 2. Lower members of cycle graphs and their characteristic
quantities.

                  p(G,k)
  N          k=0   1    2    3      Z_G     W

  1           1                       1      0
  2           1    2                  3      1
  3           1    3                  4      3
  4           1    4    2             7      8
  5           1    5    5            11     15
  6           1    6    9    2       18     27

The relations between F_N and L_N,

    L_N = F_N + F_{N-2}                                       (2-11)

and
    F_N = (3L_N + L_{N-1})/5                                  (2-12)

are straightforwardly derived from their definitions. In particular, Eq. (2-11)
is a direct consequence of the inclusion-exclusion principle, which is
explained in Section 4.3.1, as illustrated in Fig. 3.
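
The two relations can be checked numerically from the definitions (2-7) and (2-10). This is a minimal sketch with hypothetical function names; the only assumption is the indexing F_0 = F_1 = 1 and L_0 = 2, L_1 = 1 used in this chapter.

```python
def fib(n):
    """Fibonacci numbers with F_0 = F_1 = 1, as in Eq. (2-7)."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def lucas(n):
    """Lucas numbers with L_0 = 2, L_1 = 1, as in Eq. (2-10)."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for n in range(2, 8):
    # Eq. (2-11) and Eq. (2-12), the latter multiplied through by 5
    print(n, lucas(n) == fib(n) + fib(n - 2), 5 * fib(n) == 3 * lucas(n) + lucas(n - 1))
```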

Fig. 3 Relation between the cycle graph Cn and path graphs Sn.

The characteristic polynomial of a cycle graph C_N can be expressed in
closed form as (Hosoya, 1972)

    P_{C_N}(x) = \sum_{k=0}^{[N/2]} (-1)^k \frac{N}{N-k} \binom{N-k}{k} x^{N-2k} - 2    (2-13)

In general the characteristic polynomial of a non-tree graph can be
expressed by using the p(G,k) numbers of the original graph and of the set of
subgraphs derived by deleting the component cycles.
Sachs has developed an elegant theory for obtaining the coefficients of the
characteristic polynomial of a given graph by defining the so-called Sachs
graph (Sachs, 1964). Since this method has been described in many
textbooks (e.g., Trinajstic, 1992), it is not repeated here. Although the
contributions from every component, including the rings, to the values of the
coefficients of the characteristic polynomial can be clarified by this
formulation, it is not practical to execute for larger graphs because of the
lack of any recursive relation.

2.3.3. Regular graph and complete graph


The number of vertices neighboring a certain vertex is called its degree.
If typical organic compounds are considered, the degrees of the vertices
in the corresponding graphs do not exceed four. A graph in
which every vertex has the same degree or valency d is called a regular
graph of degree d. A cycle graph is a regular graph of degree two. A
regular graph with N vertices and d=N-1 is called a complete graph, K_N.
Lower members of K_N are given in Table 3 together with their matching
and characteristic polynomials.

Table 3. Lower members of complete graphs and their matching
and characteristic polynomials.

  N   G      \alpha_G(x)                   P_G(x)

  0   \phi
  1   K_1    x                             x
  2   K_2    x^2 - 1                       x^2 - 1
  3   K_3    x^3 - 3x                      x^3 - 3x - 2
  4   K_4    x^4 - 6x^2 + 3                x^4 - 6x^2 - 8x - 3
  5   K_5    x^5 - 10x^3 + 15x             x^5 - 10x^3 - 20x^2 - 15x - 4
  6   K_6    x^6 - 15x^4 + 45x^2 - 15      x^6 - 15x^4 - 40x^3 - 45x^2 - 24x - 5

3. Realization of a graph
3.1. SYMMETRY OF A GRAPH

3.1.1 Topological and geometrical symmetries of a graph


Since a graph is a mathematical object representing the neighboring relation
among a set of concepts irrespective of their physical reality, the
component edges do not necessarily have any metric meaning as they stand.
However, one often needs to discuss the symmetrical structures of a given
graph by assuming an appropriate geometry constructed within the frame of
the adjacency relation in the set of the component vertices. In the following
two examples it will be shown that realization of the lower topological
symmetry can sometimes yield better results without recourse to the use of
the highest geometrical symmetry (Hosoya, unpublished).

First consider the complete graph, K_4, for which there is no a priori way
of drawing. As in Fig. 4, one can draw several graphs whose
geometrical structures belong to different point groups, such as T_d (with
24 symmetry elements), D_4h (16), D_3h (12), etc. Among them the T_d and
D_3h structures are three-dimensionally, or geometrically, possible, while the
D_4h structure is possible only in the topological sense. However, D_4h has the
highest rotational symmetry, C_4, which is not contained in the other
structures, not even in T_d, which has the largest number of symmetry elements
for this network. Thus by using the topological symmetry of D_4h one can
efficiently factor the secular determinant, or characteristic polynomial, of
fourth order into four first-order polynomials. Although the details cannot
be explained here, just by modifying the standard recipe (Heilbronner,
1953, Hosoya et al., 1987), one can easily obtain the following result,

    P_G(x) = x^4 - 6x^2 - 8x - 3
           = \prod_{k=1}^{4} (x - 2\cos\frac{k\pi}{2} - \cos k\pi)
           = (x+1)^3 (x-3)                                    (3-1)

Of course, by using the T_d symmetry the same result should be obtained as
above, but only after a lengthy and cumbersome calculation involving complex
numbers.

Fig. 4 Various topological symmetries of the complete graph K_4.
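
The factorization in Eq. (3-1) is easy to confirm numerically. The sketch below (hypothetical function names, plain floating-point arithmetic) reads the four roots x_k = 2cos(k\pi/2) + cos(k\pi) off the product form and multiplies the linear factors back together.

```python
import math

def k4_roots_from_eq_3_1():
    """Roots x_k = 2*cos(k*pi/2) + cos(k*pi), k = 1..4, taken from Eq. (3-1)."""
    return [2 * math.cos(k * math.pi / 2) + math.cos(k * math.pi) for k in range(1, 5)]

def poly_from_roots(roots):
    """Expand prod (x - r) into coefficients, highest power first."""
    coeffs = [1.0]
    for r in roots:
        # multiply the current polynomial by (x - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs

roots = k4_roots_from_eq_3_1()
print([round(r) for r in roots])                    # three roots -1 and one root 3
print([round(c) for c in poly_from_roots(roots)])   # coefficients of x^4 - 6x^2 - 8x - 3
```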
Similarly, as shown in Fig. 5, the skeleton of a regular dodecahedron (I_h
symmetry with 120 symmetry elements) can also be drawn in several ways,
such as the symmetrical structures of D_10h (40), D_5h (20), D_3h (12),
C_2h (4), etc. By using the topological symmetry of D_10h one can quite
easily factor the characteristic polynomial of order 20 into ten
quadratic polynomials as

    P_G(x) = \prod_{k=1}^{10} \{ x^2 - 2(2\cos^2 k\theta + \cos k\theta - 1) x
                                 + (8\cos^3 k\theta - 4\cos k\theta - 1) \}

    with \theta = \pi/5  (k = 1, 2, ..., 10),                 (3-2)

yielding

    P_G(x) = x^4 (x-1)^5 (x+2)^4 (x-3) (x^2-5)^3
           = x^{20} - 30x^{18} + 375x^{16} - 24x^{15} - 2540x^{14} + 480x^{13} + 10095x^{12}
             - 3760x^{11} - 23502x^{10} + 14400x^9 + 28905x^8 - 27000x^7 - 11400x^6
             + 20000x^5 - 6000x^4
This formula contains useful information on the topological structure of the
dodecahedron. For example, the highest power N=20
represents the number of vertices, the absolute value of the coefficient of
the term x^{N-2} is equal to the number of edges, the absolute value of the
coefficient of the term x^{N-5} is twice the number of pentagons, the power
of the last term counts the number of NBMO's, i.e., roots x=0, and its
coefficient is the product of all the roots except for the NBMO's, etc.

Fig. 5 Various topological symmetries of the dodecahedron graph.
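
The expanded form of P_G(x) for the dodecahedron can be reproduced with exact integer arithmetic. The sketch below is only an illustration: it builds the dodecahedron as the generalized Petersen graph P(10,2), an identification stated in Section 3.1.3, and uses the Faddeev-LeVerrier recursion, which is a choice made here and not the symmetry factorization used in the text. All function names are hypothetical.

```python
def generalized_petersen(m, n):
    """Adjacency matrix of P(m, n): outer cycle 0..m-1, inner vertices m..2m-1."""
    size = 2 * m
    adj = [[0] * size for _ in range(size)]
    for i in range(m):
        for u, v in ((i, (i + 1) % m),             # outer cycle
                     (i, m + i),                    # spoke
                     (m + i, m + (i + n) % m)):     # inner edge to the n-th terminal
            adj[u][v] = adj[v][u] = 1
    return adj

def char_poly(adj):
    """Integer coefficients of det(xI - A), highest power first (Faddeev-LeVerrier)."""
    size = len(adj)
    coeffs = [1]
    mk = [[int(i == j) for j in range(size)] for i in range(size)]   # M_1 = I
    for k in range(1, size + 1):
        am = [[sum(adj[i][t] * mk[t][j] for t in range(size)) for j in range(size)]
              for i in range(size)]                                   # A * M_k
        c = -sum(am[i][i] for i in range(size)) // k                  # exact for integer A
        coeffs.append(c)
        mk = [[am[i][j] + (c if i == j else 0) for j in range(size)]
              for i in range(size)]                                   # M_{k+1}
    return coeffs

dodecahedron = generalized_petersen(10, 2)
print(char_poly(dodecahedron))
```

The printed list starts with 1, 0, -30, 0, 375, -24, so that the edge count (30) and twice the pentagon count (24) can be read off directly, as described above.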
It is interesting to compare P_G(x) with Q_G(x),

    Q_G(x) = 1 + 30x + 375x^2 + 2540x^3 + 10155x^4 + 24474x^5 + 34805x^6
             + 27300x^7 + 10260x^8 + 1400x^9 + 36x^{10},

especially for the coefficients of the first four terms.

The coefficient, p(G,m), of the last term of Q_G(x) or \alpha_G(x) is equal to the
number of the Kekule structures, or the perfect matching number, of the
dodecahedron. In general the characteristic polynomial of a highly
symmetrical graph can be factored to a high degree. On the other hand, the
corresponding matching polynomial is usually not factorable. However, the
coefficient of the last term of the matching polynomial of a highly
symmetrical graph is found to be highly factorable (Hosoya, 1986). For
example, the Kekule number of the truncated icosahedron, or the perfect
matching number of the pattern of the soccer ball, is equal to 12500 = 2^2 5^5,
and that of the truncated dodecahedron is 2048 = 2^{11}.

3.1.2. How to find the symmetry of a graph?


In the above section we could factor the characteristic polynomial
because the symmetry of the graph was already known. Both in graph
theory and in chemistry it is a difficult and challenging problem to find the
symmetry of a given graph G or its adjacency matrix A (Randic, 1977,
Randic and Wilkins, 1980). Since both graphs shown in the above
section are regular polyhedra, all the component edges as well as the
vertices should be equivalent within the respective graphs, irrespective of
the way of vertex-numbering or projection. In graph-theoretical
terminology, both the vertex and edge topicities of the graph of
a regular polyhedron are unity. However, this is a necessary
but not a sufficient condition for a graph to be a regular polyhedron.
The electronic distribution in a molecule should obey the symmetry of the
geometrical structure of the molecule. However, if a tight-binding
approximation, such as the Huckel MO method, is chosen, the obtained
results should reflect the topological symmetry of the corresponding
network. Thus even if the graph concerned has nothing to do with either the
\pi-electronic structure or the geometry of a molecule, the properties of the
wavefunctions, or the eigenvectors, obtained by formal application of the
HMO faithfully reflect the topological symmetry of the graph. Then if one
calculates the charge densities of all the vertices (atoms) and bond orders
of all the edges (bonds) for a graph of high symmetry, the values of
these quantities should be highly degenerate and distributed very evenly all
over the graph.

Note, however, that for an alternant hydrocarbon the pairing theorem
ensures a uniform charge density distribution (Coulson and Rushbrooke,
1940). Namely, for a bipartite graph, i.e., a graph without an odd-membered
cycle, the charge density is shown to be uniform as long as proper care
is taken in distributing electrons over the degenerate highest occupied
molecular orbitals (HOMO's). For example, if the recipe of the HMO is
formally applied to the K_4 graph given in Fig. 4, one can obtain the four
"wavefunctions" as in Fig. 6, where the coefficients on all the circles,
filled (-) and vacant (+), have the same absolute value of 1/2. If you have
a bit of experience in HMO calculations, these MO's can be drawn even
without recourse to a table-top calculator. Further, as seen in Fig. 6, the
orbital energies of these MO's can also be obtained by pencil-and-paper
calculation to be x=3 (or \epsilon = \alpha + 3\beta) and triply degenerate
x=-1 (\epsilon = \alpha - \beta).

From these MO's the bond orders of the component bonds can also quite
easily be calculated, just by noticing the fact that each of the triply
degenerate orbitals is to be occupied by 2/3 of a \pi-electron to attain the
uniform (or neutral) charge distribution. For example, the bond order of
bond 12 can be calculated as
    P_{12} = 2 \times (1/2) \times (1/2) + (2/3) \times (1/2) \times (1/2) \times (1 - 1 - 1)
           = 1/2 - 1/6 = 1/3,
and the same value is obtained for all the component bonds, meaning
that the edge topicity of this graph is unity. Finally one can realize the high
symmetry, K_4, of this graph.
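
The pencil-and-paper calculation can be repeated for every vertex and edge with exact fractions. In the sketch below the three degenerate orbitals are written with one possible choice of sign patterns (an assumption, since the drawings of Fig. 6 are not reproduced), and each is given the occupation 2/3 as described above.

```python
from fractions import Fraction
from itertools import combinations

half = Fraction(1, 2)
# Rows: MO coefficients on vertices 1..4 (assumed sign patterns, all of magnitude 1/2)
mos = [
    [half,  half,  half,  half],   # x = 3
    [half,  half, -half, -half],   # x = -1
    [half, -half,  half, -half],   # x = -1
    [half, -half, -half,  half],   # x = -1
]
occupation = [Fraction(2), Fraction(2, 3), Fraction(2, 3), Fraction(2, 3)]

def bond_order(r, s):
    """P_rs = sum over MOs of (occupation) * c_r * c_s."""
    return sum(n * c[r] * c[s] for n, c in zip(occupation, mos))

def charge_density(r):
    """q_r = sum over MOs of (occupation) * c_r^2."""
    return sum(n * c[r] ** 2 for n, c in zip(occupation, mos))

print([bond_order(r, s) for r, s in combinations(range(4), 2)])   # six equal bond orders
print([charge_density(r) for r in range(4)])                      # four equal charge densities
```

All six bond orders come out as 1/3 and all four charge densities as 1, in line with vertex and edge topicities of unity.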

In the above case the graph was so small that one can recognize its high
symmetry even without recourse to the above-mentioned
procedure. However, in the case where a rather complicated structure such as the
dodecahedron is drawn in so deformed a way that no proper symmetry element can
easily be recognized, one can use a computer to get the eigenvalues (orbital
energies), eigenvectors (wavefunctions), and related quantities such as the
charge density and bond order. The method outlined here is too brute-force to be
introduced in a usual textbook of graph theory. However, the attentive
reader will realize how well this method works for a complicated
problem.

    MO     1    2    3    4
    x      3   -1   -1   -1

Fig. 6 MO wavefunctions of the K_4 graph. Open circle: 1/2; filled circle: -1/2.

3.1.3. Highly symmetrical graphs


Besides the regular and semi-regular polyhedra, a number of interesting
graphs of high symmetry are known, some of which play a very
important role in the discussion of reaction processes (Balaban, 1966,
Balaban and Kerek, 1974). The most famous example is the Petersen graph
(See Fig. 7), composed of two entangled pentagons (Petersen, 1898,
Balaban, 1966, Dunitz and Prelog, 1968, Randic, 1977). The largest
topological symmetry of this graph is D_5h, with 12 pentagons, 10
hexagons, etc., but the edge topicity is as high as three.

Mathematically and chemically interesting networks are found to be
generated by defining a generalized Petersen graph, P(m,n), with m>2n, so
that the original Petersen graph is denoted by P(5,2) and becomes just a
member of a large and fascinating family. Draw a cycle C_m, and from each vertex
extend a branch of unit length toward the interior of the cycle. Then
choose a terminal, and draw an edge toward the clockwise n-th terminal.
Continue this process until one gets a cubic graph, i.e., a graph whose
vertex degrees are all three. The caution necessary for the cases
with m=2n and 2n+2 is illustrated in Fig. 7, where P(6,1), P(6,2), P(8,3)
and P(12,5) are also shown. The cube and dodecahedron graphs (Fig. 5)
are, respectively, P(4,1) and P(10,2).

Fig. 7 Examples of generalized Petersen graphs: P(5,2) (the Petersen graph),
P(5,1), P(6,1), P(6,2), P(8,3) and P(12,5). In P(8,3) and P(12,5) all the
edges are equivalent.
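
The construction just described translates directly into a few lines of code. The following is a minimal sketch with hypothetical names; vertices on the cycle are labelled ("u", i) and the inner terminals ("v", i). It follows the recipe literally, so for m = 2n the inner edges coincide in pairs and the result is not cubic, which is one reason such cases need separate treatment.

```python
def generalized_petersen_edges(m, n):
    """Edge set of P(m, n) built by the cycle / branch / n-th terminal recipe."""
    edges = set()
    for i in range(m):
        edges.add(frozenset({("u", i), ("u", (i + 1) % m)}))   # cycle C_m
        edges.add(frozenset({("u", i), ("v", i)}))             # branch of unit length
        edges.add(frozenset({("v", i), ("v", (i + n) % m)}))   # edge to the n-th terminal
    return edges

def degrees(edges):
    """Map each vertex to its degree."""
    deg = {}
    for e in edges:
        for v in e:
            deg[v] = deg.get(v, 0) + 1
    return deg

def is_cubic(edges):
    return all(d == 3 for d in degrees(edges).values())

for m, n in [(5, 2), (4, 1), (10, 2), (8, 3), (12, 5), (6, 3)]:
    e = generalized_petersen_edges(m, n)
    print((m, n), len(e), is_cubic(e))
```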

The two graphs, P(8,3) and P(12,5), have very interesting features in
common. Namely, they can be derived from the honeycomb lattice and their
edge topicity is unity (Hosoya, 1993). Although at present no actual
reaction network is known to be associated with these highly symmetrical
graphs, their topological structures are worthy of further study.
3.2. ISOMORPHISM OF A GRAPH
3.2.1 Comparison of two graphs
Since a graph is a topological object which merely expresses the adjacency
relation for a given set of vertices or concepts, many different ways of
representing the same set of adjacency relations may be possible, as in
Figs. 4 and 5. If a one-to-one correspondence between the adjacency
relations in two given graphs G_1 and G_2 is found, they are called
isomorphic to each other. Examples of isomorphic graphs are given in
Figs. 4 and 5. As stated before, a graph G with N vertices can be expressed
by an adjacency matrix A in N! different ways of numbering the
vertices. However, the mathematical properties of the matrix are independent
of the numbering of vertices. In other words, isomorphic graphs have
exactly the same topological quantities.
Two graphs are called homeomorphic with each other if they can be
reduced to the same graph by short-cutting without changing their
topological structure. A pair of path graphs, S_j and S_k, with different
numbers of vertices are not isomorphic, but homeomorphic with each other.
By successive deletion of an inner vertex followed by short-cutting, both
graphs can be reduced to S_2 (See Table 1). Similarly any two cycle
graphs, C_j and C_k, can be reduced to C_3 (Table 2) and are homeomorphic
with each other. The concept of homeomorphism might be important for the
analysis of a reaction network.

Given a pair of graphs with the same number of vertices and edges, one
often needs to judge if they are isomorphic or not. For a pair of relatively
large graphs no efficient algorithm other than exhaustive search is known
for judging if they are isomorphic. Although the characteristic polynomial
cannot always distinguish the topological structure of a graph, it can
be used for rough discrimination. In fact, there are many examples
where two or more different graphs have the same characteristic polynomial
(See Table 4). One may say that for larger graphs the redundancy of the
characteristic polynomial, r(P_G(x)), is not negligible. This is also the case
with almost all of the topological indices, such as the Z index, and with
the Z-counting polynomial, Q_G(x). Actually, for tree graphs we have r(Z) >
r(Q_G(x)) = r(P_G(x)) (Mizutani et al., 1971).

Table 4 Various pairs of graphs with similar topological quantities
(drawings of the pairs of graphs omitted)

  No.    Z_G    Q_G    P_G    W     S_G

  1       O      O      X     X      X
  2       O      X      X     X      X
  3       X      X      O     X      X
  4       O      O      O     X      X
  5       O      O      O            X

O means that the pair of graphs have the same value or formula, while
X means different values or formulas.

The equality in the above relation comes from the fact that the set of the
coefficients of Q_G(x) is identical to that of P_G(x) for tree graphs, whereas
the inequality between Z and Q_G(x) arises from contracting the
topological information of the latter quantity into the former. In the
extensive tabulation of non-tree graphs the following inequality is
observed, r(Q_G(x)) > r(P_G(x)) (Kawasaki et al., 1971).

If we are given a topological quantity, A, that characterizes the
topological structure of a graph with a practically negligible value of r(A),
it can be used for discriminating or coding a large set of graphs. Even if
neither of two topological quantities, A and B, by itself has
high graph-discriminating power, their combined use might give a
promising effect such as r(AB) \ll r(A) r(B). This is a standard technique in
the quantitative structure-activity relationship (QSAR) or quantitative
structure-property relationship (QSPR) study (Cramer, 1980, Gao and
Hosoya, 1988).

3.2.2. Isospectral graphs


A pair of distinct graphs are called isospectral, or cospectral, when their
characteristic polynomials coincide with each other (Harary et al., 1971,
Gutman and Trinajstic, 1973, Herndon, 1974, Trinajstic, 1992). Several
examples are shown in Table 4. The electronic spectrum of a
molecule, however, is determined not only by its topological structure but
also by its geometrical structure, especially as regards the spectral
intensity. Practically in no case, therefore, would a pair of isospectral
graphs give the same optical spectrum on irradiation. Nevertheless, this
fact does not mean that the concept of isospectrality has no impact on the
new field of mathematical chemistry.
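
A small isospectral pair can be verified by direct computation. The pair used in the sketch below, the star K_{1,4} and the square C_4 plus an isolated vertex, is a standard textbook example chosen here for illustration; it is not one of the pairs of Table 4. The characteristic polynomial is obtained by the Faddeev-LeVerrier recursion (an implementation choice), and all names are hypothetical.

```python
def adjacency(size, edges):
    """Adjacency matrix of a simple graph on vertices 0..size-1."""
    adj = [[0] * size for _ in range(size)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return adj

def char_poly(adj):
    """Integer coefficients of det(xI - A), highest power first (Faddeev-LeVerrier)."""
    size = len(adj)
    coeffs = [1]
    mk = [[int(i == j) for j in range(size)] for i in range(size)]
    for k in range(1, size + 1):
        am = [[sum(adj[i][t] * mk[t][j] for t in range(size)) for j in range(size)]
              for i in range(size)]
        c = -sum(am[i][i] for i in range(size)) // k
        coeffs.append(c)
        mk = [[am[i][j] + (c if i == j else 0) for j in range(size)] for i in range(size)]
    return coeffs

star = adjacency(5, [(0, 1), (0, 2), (0, 3), (0, 4)])                # K_{1,4}
square_plus_point = adjacency(5, [(0, 1), (1, 2), (2, 3), (3, 0)])   # C_4 and one isolated vertex

print(char_poly(star), char_poly(square_plus_point))                 # both x^5 - 4x^3
print(sorted(map(sum, star)), sorted(map(sum, square_plus_point)))   # different degree sequences
```

The two graphs share the polynomial x^5 - 4x^3 although their degree sequences differ, so they cannot be isomorphic.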

A number of interesting analyses have been introduced to show
algorithms for designing many pairs or groups of isospectral graphs by
using pivot atoms called isospectral points (Herndon, 1974). For
example, one can understand the isospectral property of the third pair of
graphs in Table 4 by decomposing the characteristic polynomial into the
sum of products of those of the component subgraphs, as in Fig. 8, according to
the standard procedure (Heilbronner, 1953). On the other hand, the
isospectrality of the fourth pair of graphs in Table 4 is accidental.

Fig. 8 Diagram showing the isospectral property of a pair of
isospectral graphs. The second terms on the right-hand
sides indicate the existence of the isospectral points
marked with x.

However, it is interesting to observe in Table 4 that these two graphs have
not only the same characteristic polynomial but also the same Z-counting
polynomial, and accordingly, the same Z index. One can find other pairs of
graphs with the same property (See the fifth pair of graphs in Table 4).

Since the coefficients of the distance polynomial are much larger than those
of the corresponding characteristic polynomial, one can expect it to have a
more efficient discriminating property than the characteristic polynomial. As
seen in Table 4, all the isospectral and/or iso-Z pairs of graphs can be
discriminated by the distance polynomial, S_G(x). However, as will be
shown below, the distance polynomial does not uniquely determine the
topological structure of a graph either.

3.2.3. Topological twin graphs


Very recently an interesting pair of graphs was discovered that have
only eight vertices but have the same P_G(x), Q_G(x), and S_G(x), as shown in
Table 5 (Hosoya et al., 1993). The only difference between them is the
connectivity index, which is derived from the different edge distributions.
Let us call such a pair of graphs, highly similar in the topological sense,
a pair of topological twin graphs. The number of topological twin graphs is
expected to increase with the size of the graph, but larger ones have not
yet been obtained.

Table 5 Topological quantities of the pair of topological twin graphs.

Z-counting polynomial (Matching polynomial)
    Q_G(x) = 1 + 15x + 61x^2 + 65x^3 + 9x^4
Characteristic polynomial
    P_G(x) = x^8 - 15x^6 - 12x^5 + 41x^4 + 28x^3 - 35x^2 - 12x + 9
Distance polynomial
    S_G(x) = x^8 - 67x^6 - 340x^5 - 643x^4 - 436x^3 + 21x^2 + 76x - 7
Topological indices (Z and Wiener)
    Z_G = 151    W = 41

3.2.4. Isomer graphs


If two chemical substances happen to have the same numbers and types of
component atoms, they are called isomers of each other. A set of
graphs derived from isomers are called isomer graphs. In Table 6 the
numbers of structural isomers of alkanes (saturated hydrocarbon molecules
without any ring) and of acyclic saturated alcohols are given. In 1936 the
Hungarian mathematician George Pólya established a new methodology for
generating the structural isomers of hydrocarbons and the related series of
compounds (Pólya, 1936). His theory is based on the argument of the
permutation group, and the cycle index is proposed to facilitate the
manipulation of the various versions of counting polynomials. Through his
monumental work both graph theory and combinatorial theory rapidly
grew into what they are today. Unfortunately, however, a majority of
chemists did not realize the importance of this epoch-making paper until
recently.

Table 6 Numbers of isomers of alkanes and saturated alcohols.

   n   C_nH_{2n+2}   C_nH_{2n+1}OH       n   C_nH_{2n+2}   C_nH_{2n+1}OH

   1         1             1            11        159           1238
   2         1             1            12        355           3057
   3         1             2            13        802           7639
   4         2             4            14       1858          19241
   5         3             8            15       4347          48865
   6         5            17            16      10359         124906
   7         9            39            17      24894         321198
   8        18            89            18      60523         830219
   9        35           211            19     148284        2156010
  10        75           507            20     366319        5622109
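
The alcohol column of Table 6 can be regenerated from Pólya's counting scheme. The sketch below assumes the standard generating-function form A(x) = 1 + x [A(x)^3 + 3 A(x) A(x^2) + 2 A(x^3)] / 6, which comes from the cycle index of the symmetric group on the three substituents of the carbon atom bearing the OH group; this equation is not quoted in the text above, and the function name is hypothetical.

```python
def alcohol_counts(n_max):
    """a[n] = number of structural isomers of C_nH_{2n+1}OH (a[0] = 1 stands for H)."""
    a = [1] + [0] * n_max
    for n in range(n_max):
        # coefficient of x^n in A(x)^3
        cube = sum(a[i] * a[j] * a[n - i - j]
                   for i in range(n + 1) for j in range(n + 1 - i))
        # coefficient of x^n in A(x) * A(x^2)
        mixed = sum(a[n - 2 * j] * a[j] for j in range(n // 2 + 1))
        # coefficient of x^n in A(x^3)
        third = a[n // 3] if n % 3 == 0 else 0
        a[n + 1] = (cube + 3 * mixed + 2 * third) // 6
    return a

counts = alcohol_counts(20)
print(counts[1:11])
print(counts[20])
```

Counting the alkanes themselves requires a further step (removing the dependence on the choice of root), which is not attempted in this sketch.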

Ever since, the enumeration of the number of possible isomers of certain
series of chemical substances has been one of the central problems in
mathematical chemistry (Read, 1976, Balasubramanian, 1990, Balaban,
1991). The recursive relations of the counting polynomials for the
isomers of a variety of acyclic hydrocarbons and related substances have
been obtained. However, for cyclic compounds including the aromatic
hydrocarbons, or hexagonal animals, the so-called combinatorial explosion
overwhelms the efficiency of Pólya's method.
The condensed polycyclic aromatic hydrocarbons are classified into
catacondensed and pericondensed, or in the nomenclature of Balaban,
catahex and perihex, depending on the non-existence or existence of an
inner vertex (carbon atom) which is shared by three benzene rings (Balaban
and Harary, 1968, Balaban, 1969). By using a few steps of recursive
formulas the isomer number of catahexes with a given number of hexagons
can be derived. On the other hand, counting the number of possible
perihexes is a tough project, and no
general solution has ever been obtained. Although some special series of
perihexes have been attacked, the isomer numbers known for lower
members of perihexes are those obtained with the aid of computer searching
(Knop et al., 1981, 1985, 1990).

Every isomer has its own geometrical structure and properties.
The chemical properties, such as reactivities, are largely determined by the
electronic property of a single molecule, whereas almost all the
thermodynamic properties, e.g., boiling point and density of liquid, are the
outcome of interaction among a vast number of molecules of the same species,
as large as 10^20. However, it has empirically been known that there exist
beautiful correlations between the topological structure and various
properties of chemical substances, arousing the curiosity of scientists and
mathematicians. Besides fruitful results in mathematical chemistry, a new
field of QSAR or QSPR study has been cultivated to serve drug and reaction
design (Kier and Hall, 1976, 1986).

In this section only Table 7 is given for the readers, which compares
various topological quantities with the boiling point and density of liquid of
the heptane isomers, which were taken from the extensive tabulation of
thermodynamic properties of hydrocarbons (Rossini et al., 1943).

Table 7 Topological properties and chemical properties of heptane isomers.

  Isomer (G)                Z_G     W     bp a)    p b)    d c)

  n-heptane                  21     56     98.4     4     0.684
  3-ethylpentane             20     48     93.4     6     0.698
  3-methylhexane             19     50     91.9     5     0.687
  2-methylhexane             18     52     90.0     4     0.679
  2,3-dimethylpentane        17     46     89.7     6     0.695
  3,3-dimethylpentane        16     44     86.0     6     0.693
  2,4-dimethylpentane        15     48     80.5     4     0.673
  2,2-dimethylpentane        14     46     79.2     4     0.674
  2,2,3-trimethylbutane      13     42     80.9     6     0.690

a) Boiling point (°C). b) Wiener's polarity number: the number of pairs of
vertices separated by three edges. c) Density of liquid (g/ml) at 20°C.
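
The three topological columns of Table 7 follow directly from the carbon skeletons. The sketch below (hypothetical function names, with carbon atoms numbered from 0 in an arbitrary order) counts Z by brute-force enumeration of sets of mutually disjoint edges and obtains W and p from breadth-first-search distances.

```python
from collections import deque
from itertools import combinations

def z_index(edges):
    """Z index: number of ways of choosing mutually disjoint edges (any number, incl. none)."""
    count = 0
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            ends = [v for e in subset for v in e]
            if len(ends) == len(set(ends)):      # no vertex used twice
                count += 1
    return count

def pair_distances(n, edges):
    """Topological distances of all vertex pairs, by breadth-first search."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    out = []
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        out += [d for t, d in dist.items() if t > s]
    return out

def indices(n, edges):
    d = pair_distances(n, edges)
    return z_index(edges), sum(d), d.count(3)    # Z, W, p

heptane = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
dimethylpentane_22 = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (1, 6)]
trimethylbutane_223 = [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5), (2, 6)]
for name, g in [("n-heptane", heptane), ("2,2-dimethylpentane", dimethylpentane_22),
                ("2,2,3-trimethylbutane", trimethylbutane_223)]:
    print(name, indices(7, g))
```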

3.3. PLANARITY OF A GRAPH

3.3.1. Polyhedral graphs and a Schlegel diagram

A polyhedron is composed of four or more polygons in such a way that
every vertex is surrounded by at least three polygons. The smallest
polyhedron is the tetrahedron, whose vertex relations are depicted by the
K_4 graph. The numbers of lower members of possible polyhedra are
tabulated (Engel, 1982). Graphical representations of smaller polyhedral
graphs are tabulated (Britton and Dunitz, 1973, Federico, 1975).
Topologically a polyhedron is regular if both the numbers of edges of all
the faces and the degrees of all the vertices are, respectively, the same.
Geometrically a regular polyhedron should have identical (or
congruent) regular polygons as faces. There exist only five regular
polyhedra, which are called the Platonic polyhedra. Among them the
dodecahedron has the largest number of vertices, 20, and its topology is
drawn in Fig. 5. The dodecahedron can be denoted as 5^3, as three
pentagons meet at every vertex. The icosahedron, 3^5, is the dual of 5^3, as
they can be transformed into each other by exchanging their face relation
with the vertex relation.

Fig. 9 Relation among K_6, K_{3,3} and K_3. (Panel labels: non-planar;
on torus; planar. Arrow labels: - 3 edges; - 1 edge.)

3.3.2 Planar graphs and the theorem of Kuratowski

As seen in Figs. 4 and 5 some polycyclic graphs, such as K_4 and 5^3, can be
drawn with no edge crossing. A polyhedral graph drawn in this manner is
called a Schlegel diagram. A graph is called planar if it can be
embedded in a plane or on the surface of a sphere so that no two edges
intersect. One cannot draw a Schlegel diagram of K_5 or K_6 (See Table 3);
these two graphs are non-planar. Whether a given graph is planar or
non-planar can be judged by
the famous theorem of Kuratowski (Kuratowski, 1930). Namely, a graph is
planar if and only if it does not contain a subgraph which is homeomorphic
to K_5 or K_{3,3}.

The graph K_{3,3} is called a complete bipartite graph. A bipartite graph is a
graph G whose vertex set V can be partitioned into two sets V_1 (starred)
and V_2 (unstarred) such that every edge in G connects V_1 and V_2. That is,
no two vertices in V_1 (or V_2) are connected. If every vertex in V_1 (and V_2)
is connected to all the vertices in V_2 (and V_1), the graph G is called a
complete bipartite graph and denoted by K_{m,n}, where m and n are the
numbers of vertices in V_1 and V_2.

Let us try to embed the K_6 graph in a plane by deleting as small a number of
edges as possible. As seen in Fig. 9, by deleting three edges K_6 is reduced
to K_{3,3}, which can be embedded on the surface of a torus but not in a
plane. Then the graph derived by deletion of an edge from K_{3,3} becomes
planar, which is homeomorphic to K_3.

4. Operation on graphs

4.1. OPERATION OF TWO GRAPHS

4.1.1. Graph operation

If various relations among the existing graphs are clarified through the
concept of operation on graphs, one can either develop a global discussion or
get a useful interpretation of disorderly piled-up information. Further,
consider a case where one has to calculate the topological quantities
for infinitely large networks. Unless practically useful techniques, such as
recursive formulas, for calculating those quantities are known, one would
surely be lost in the jungle of the so-called combinatorial explosion.

One can construct larger graphs by the use of various operations on the
component small graphs, among which four fundamental operations, union,
join, product, and composition, will be explained here. Given two graphs,
G_1 and G_2, to be operated on by the operation F, the third graph G_3 is
generated as
    G_3 = F(G_1, G_2)                                         (4-1)
Except for composition, an operation on graphs is usually commutative,
which can be expressed as
    F(G_1, G_2) = F(G_2, G_1)                                 (4-2)

The simplest operation is the union, which does not change the numbers of
edges and vertices, so that
    v_3 = v_1 + v_2 and e_3 = e_1 + e_2,
where v_n and e_n, respectively, mean the numbers of vertices and edges of
graph G_n. This means that the operation union, U, simply takes two
disjoint graphs, G_1 and G_2, as a disconnected graph G_3, or one can
express this as
    U(G_1, G_2) = G_1 \cup G_2                                (4-3)

In order for the discussion to be concrete, let us take G_1 and G_2 as the
smaller members of path graphs, S_2 and S_3, respectively (See Fig. 10).

Fig. 10 Various operations on graphs. (Panels: G_1 = S_2 = K_2 with vertices
a, b; G_2 = S_3 with vertices 1, 2, 3; G_1 \cup G_2; G_1 + G_2; G_1 x G_2 with
vertices a1, a2, a3, b1, b2, b3; G_1[G_2]; G_2[G_1] with vertices 1a, 1b,
2a, 2b, 3a, 3b.)

4.1.2. Join and product


The operation join, J, differs from the union only in that every vertex of
G_1 is connected to every vertex of G_2. By the use of mathematical
notation J is defined as
    J(G_1, G_2) = G_1 + G_2 = G_1 \oplus G_2 = G_1 \cup G_2 \cup \{(x_j, y_k)\}    (4-4)
        x_j \in G_1, 1 \le j \le v_1
        y_k \in G_2, 1 \le k \le v_2.

Either notation, + or \oplus, is used. However, due caution is necessary lest
the former should be taken to mean the operation of union. A complete
bipartite graph K_{m,n} is a join of two completely unconnected, i.e.,
edgeless, graphs consisting, respectively, of m and n vertices.
The product, P, is defined in terms of the cartesian product of the
vertex sets of the two graphs, G_1 and G_2, as
    P(G_1, G_2) = G_1 \times G_2 = G_1 \otimes G_2
                = G_{11} \cup ... G_{1k} ... \cup G_{1 v_2} \cup G_{21} \cup ... G_{2j} ... \cup G_{2 v_1}
where G_{1k} and G_{2j}, respectively, represent the replicas of G_1 and G_2.

4.1.3. Composition
This operation is a little complicated. The operation G_1[G_2] is defined as
follows. The two vertices x_j y_k and x_m y_n are connected if and only if
either the edge (x_j, x_m) exists in G_1, or x_j = x_m and the edge (y_k, y_n)
exists in G_2. Then one can see that G_2[G_1] yields a different result from
G_1[G_2]. See Fig. 10 for the different effects of G_1[G_2] and G_2[G_1].

Table 8 Summary of four operations on graphs.

  Operation      Graph           Number of      Number of edges
                                 vertices

                 G_1             v_1            e_1
                 G_2             v_2            e_2
  Union          G_1 \cup G_2    v_1 + v_2      e_1 + e_2
  Join           G_1 + G_2       v_1 + v_2      e_1 + e_2 + v_1 v_2
  Product        G_1 x G_2       v_1 v_2        v_1 e_2 + v_2 e_1
  Composition    G_1[G_2]        v_1 v_2        v_1 e_2 + v_2^2 e_1

By consulting this figure and Table 8, which summarizes the numbers of
vertices and edges derived from the four operations, the readers can realize
the meaning and difference of these operations.
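
The vertex and edge counts can be confirmed for the example G_1 = S_2, G_2 = S_3 of Fig. 10. The following is a minimal sketch: a graph is represented as a pair (vertex list, set of two-element frozensets), all function names are hypothetical, and the product is implemented through the usual adjacency rule for the cartesian product, which is assumed to be equivalent to the replica description given above.

```python
from itertools import combinations, product

def tag(graph, label):
    """Relabel vertices as (label, v) so that two graphs are disjoint."""
    verts, edges = graph
    return ([(label, v) for v in verts],
            {frozenset((label, v) for v in e) for e in edges})

def union(g1, g2):
    (v1, e1), (v2, e2) = tag(g1, 1), tag(g2, 2)
    return v1 + v2, e1 | e2

def join(g1, g2):
    (v1, e1), (v2, e2) = tag(g1, 1), tag(g2, 2)
    return v1 + v2, e1 | e2 | {frozenset((x, y)) for x in v1 for y in v2}

def cartesian_product(g1, g2):
    (v1, e1), (v2, e2) = g1, g2
    verts = list(product(v1, v2))
    edges = {frozenset((a, b)) for a, b in combinations(verts, 2)
             if (a[0] == b[0] and frozenset((a[1], b[1])) in e2)
             or (a[1] == b[1] and frozenset((a[0], b[0])) in e1)}
    return verts, edges

def composition(g1, g2):
    """G1[G2]: connect if the G1 parts are adjacent, or equal with adjacent G2 parts."""
    (v1, e1), (v2, e2) = g1, g2
    verts = list(product(v1, v2))
    edges = {frozenset((a, b)) for a, b in combinations(verts, 2)
             if frozenset((a[0], b[0])) in e1
             or (a[0] == b[0] and frozenset((a[1], b[1])) in e2)}
    return verts, edges

def size(graph):
    return len(graph[0]), len(graph[1])

s2 = (["a", "b"], {frozenset(("a", "b"))})
s3 = ([1, 2, 3], {frozenset((1, 2)), frozenset((2, 3))})
for name, g in [("union", union(s2, s3)), ("join", join(s2, s3)),
                ("product", cartesian_product(s2, s3)),
                ("G1[G2]", composition(s2, s3)), ("G2[G1]", composition(s3, s2))]:
    print(name, size(g))
```

The last two lines of output differ in the number of edges, which illustrates that composition is not commutative.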
4.2. POLYMERIZATION OF GRAPHS
Any graph can grow with itself as a repetitive unit to form a polymer
network. The path graphs and cycle graphs shown in Tables 1 and 2 are the
simplest polymer networks. If the mode of growing is regular as in these
two cases, one can quite easily formulate how the topological quantity
converges or diverges toward infinitely large networks. For the above two
series of graphs general expressions of various topological indices can be
obtained as in Table 9. General expressions and recursion formulas of their
characteristic polynomials are also known as in Eqs. (2-9) and (2-13).
From these analytical expressions one can realize how these quantities
increase with the size of the network, e.g., exponentially or explosively in
combinatorial sense. This information is important for discussing the
physical meaning and limitation of QSAR or QSPR studies and for the
normalization of these quantities.

Table 9 General expressions of various topological indices for
infinitely large path graphs and cycle graphs

  Characteristic    Path graph                                  Cycle graph
  quantity          S_N                                         C_N

  P_G(x) a)         Eq. (2-9)                                   Eq. (2-13)
  Z_G b)            (\alpha^{N+1} - \beta^{N+1})/\sqrt{5}       \alpha^N + \beta^N
  W_G               N(N^2 - 1)/6                                N^3/8 for even N
                                                                (N^3 - N)/8 for odd N
  \chi c)           N/2 + (\sqrt{2} - 3/2)                      N/2

a) Recursion relation for path graphs: P_N(x) = x P_{N-1}(x) - P_{N-2}(x).
   For cycle graphs the recursion is a little more complicated.
b) \alpha = (1+\sqrt{5})/2 and \beta = (1-\sqrt{5})/2.
c) Randic's connectivity index.
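
The closed expressions for Z and W in Table 9 can be spot-checked for small N. The sketch below uses hypothetical function names; the Wiener numbers are summed directly from the distances |i - j| on a path and min(|i - j|, N - |i - j|) on a cycle, and the Z indices are compared with the Fibonacci and Lucas recurrences (2-7) and (2-10).

```python
import math

ALPHA = (1 + math.sqrt(5)) / 2
BETA = (1 - math.sqrt(5)) / 2

def z_path_closed(n):
    return round((ALPHA ** (n + 1) - BETA ** (n + 1)) / math.sqrt(5))

def z_cycle_closed(n):
    return round(ALPHA ** n + BETA ** n)

def linear_recurrence(first, second, n):
    """n-th term of x_N = x_{N-1} + x_{N-2} with x_0 = first, x_1 = second."""
    a, b = first, second
    for _ in range(n):
        a, b = b, a + b
    return a

def wiener_path(n):
    return sum(j - i for i in range(n) for j in range(i + 1, n))

def wiener_cycle(n):
    return sum(min(j - i, n - (j - i)) for i in range(n) for j in range(i + 1, n))

def wiener_cycle_closed(n):
    return n ** 3 // 8 if n % 2 == 0 else (n ** 3 - n) // 8

for n in range(1, 9):
    print(n, z_path_closed(n), z_cycle_closed(n), wiener_path(n), wiener_cycle(n))
```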

The electronic properties of an infinitely large polymer network are
known to be independent of the boundary condition. This means that one
can use a cyclic boundary condition for calculating the wavefunctions of
an infinitely large polymer network. Based on group theory, a cyclic
boundary condition directly gives us a standard technique for obtaining the
characteristic polynomial of a periodic polymer network (Heilbronner,
1953, Hosoya et al., 1987).

On the other hand, as seen in Table 9, the modes of divergence of several
topological quantities are different between the path and cycle graphs.
Further, in general group theory cannot be applied to the enumeration
of combinatorial quantities. Then one has to seek out useful recurrence
relations for those topological quantities.

4.3. RECURRENCE RELATION


4.3.1. Inclusion-exclusion principle
The most important concept used in practice in combinatorial and
graph-theoretical enumeration problems is the inclusion-exclusion principle
(Riordan, 1958, Liu, 1968), which will be explained by taking the
enumeration of the non-adjacent number, p(G,k), as an example (Hosoya,
1971, 1972).
The number p(G,k) is the number of ways of choosing k disjoint edges from a
given graph G. Define two subgraphs $G-l$ and $G \ominus l$ by choosing an
arbitrary edge l from G. Deleting l while leaving its two terminal
vertices gives $G-l$, whereas deleting l together with all the edges
adjacent to l gives $G \ominus l$ (see Fig. 11).

Fig. 11 Inclusion-exclusion principle. Panels: G; $G-l$ (l-exclusive); $G \ominus l$ (l-inclusive).

There are two and only two possibilities in choosing k disjoint edges:
either they contain l or they do not. In other words, the sum of the
l-inclusive and l-exclusive counts is equal to p(G,k). These counts are
nothing but the terms $p(G \ominus l,k-1)$ and $p(G-l,k)$, respectively. Then
we have the recurrence relation,
28 H.HOSOYA

$p(G,k) = p(G-l,k) + p(G \ominus l,k-1)$    (4-5)

This relation automatically gives the recurrence relations for the two
counting polynomials, i.e., the Z-counting and matching polynomials, as

$Q_G(x) = Q_{G-l}(x) + x\,Q_{G \ominus l}(x)$    (4-6)

$\alpha_G(x) = \alpha_{G-l}(x) - \alpha_{G \ominus l}(x)$    (4-7)

Note that x in the second term on the right-hand side of Eq. (4-6) marks the
selection of edge l as one of the k disjoint edges chosen from G. Figure 3
gives another example of the application of these recurrence relations.
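The recurrence of Eq. (4-5) translates directly into a short recursive routine. The following Python sketch returns the list of non-adjacent numbers p(G,0), p(G,1), ... for a graph given as an edge list; the function name and the edge-list representation are illustrative choices, not taken from the original text.

```python
def nonadjacent_numbers(edges):
    """Return [p(G,0), p(G,1), ...] via p(G,k) = p(G-l,k) + p(G(-)l,k-1).

    `edges` is a list of 2-tuples of vertex labels. Isolated vertices do not
    affect the counts, so only the edge list is tracked.
    """
    if not edges:
        return [1]  # only the empty selection, p(G,0) = 1
    (u, v), rest = edges[0], edges[1:]
    # l-exclusive part: delete the edge l only (G - l)
    exclusive = nonadjacent_numbers(rest)
    # l-inclusive part: delete l and every edge adjacent to it (G (-) l)
    inclusive = nonadjacent_numbers(
        [e for e in rest if u not in e and v not in e]
    )
    result = [0] * max(len(exclusive), len(inclusive) + 1)
    for k, count in enumerate(exclusive):
        result[k] += count
    for k, count in enumerate(inclusive):
        result[k + 1] += count  # choosing l shifts k by one
    return result


# Carbon skeleton of butadiene (path with 4 vertices) and of benzene (hexagon).
butadiene = [(0, 1), (1, 2), (2, 3)]
benzene = [(i, (i + 1) % 6) for i in range(6)]
```

The coefficients returned are those of the Z-counting polynomial of Eq. (4-6), so their sum is the Z index. The recursion is exponential in the worst case and is intended for small graphs only.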
4.3.2. Operator technique and transfer matrix
Besides the above-mentioned recurrence relation, a few other relations are
known for the p(G,k) numbers and the Z-counting polynomial. By
repeated use of these recurrence relations one can obtain these quantities
for relatively large networks with periodic structure. Whether useful
recurrence relations exist for a given topological quantity depends on
how the quantity is defined. For the characteristic polynomial, recurrence
relations such as the one given in Table 9 are known. However, for
polycyclic graphs, even those with periodic structure, practical
application of the recurrence relation is usually a formidable task,
even with a computer.
In order to overcome this difficulty for polycyclic network
systems, several mathematical techniques have recently been introduced,
such as the operator technique (Hosoya and Ohkami, 1983) and the transfer
matrix method (Randic et al., 1989).
Consider, for example, a series of polyacene graphs, or linearly growing
hexagonal animals, for which the recurrence relations of the Z-counting
(matching) and characteristic polynomials are to be sought. In both
cases we are forced to solve a simultaneous and entangled set of recurrence
relations for the three series of subgraphs, $I_n$, $L_n$, and $N_n$, derived from
the polyacene graphs.

Namely, we have to solve

$I_n = I_{n-1} - 2x L_{n-1} + (x^2-1) N_{n-1}$
$L_n = x I_n - x N_{n-1} + L_{n-1}$    (4-8)
$N_n = x L_n + x L_{n-1} - (x^2-1) N_{n-1}$

where $I_n$, $L_n$, and $N_n$, respectively, stand for the matching polynomials of
the three series of graphs.

Then let us define a step-up operator $\hat{O}$ for promoting the n-th member of
(any kind of) counting polynomial $F_n$ to the (n+1)-th member as

$\hat{O} F_n = F_{n+1}$    (F = I, L, and N)    (4-9)

If one assumes that Eq. (4-9) applies in common to $I_n$, $L_n$, and $N_n$, the
set of simultaneous recurrence formulas for a family of
regularly growing graphs can be transformed into a set of simultaneous
linear equations involving the operator $\hat{O}$ as a variable. Then the necessary
condition for a non-trivial solution is that the coefficient
determinant of Eq. (4-8) is zero. Namely,

$\begin{vmatrix} \hat{O}-1 & 2x & -(x^2-1) \\ -x\hat{O} & \hat{O}-1 & x \\ 0 & -x(\hat{O}+1) & \hat{O}+(x^2-1) \end{vmatrix} = 0$    (4-10)

and we have
$\hat{O}^3 - (x^4-5x^2+3)\hat{O}^2 + (x^4-3x^2+3)\hat{O} - 1 = 0$,
which gives the corresponding recurrence formula for the matching
polynomials of the three series of graphs as
$F_n = (x^4-5x^2+3)F_{n-1} - (x^4-3x^2+3)F_{n-2} + F_{n-3}$    (4-11)
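The step from the coefficient determinant to the cubic equation in the step-up operator can be checked numerically. The sketch below assumes that the three recurrences read I_n = I_{n-1} - 2x L_{n-1} + (x^2-1) N_{n-1}, L_n = x I_n - x N_{n-1} + L_{n-1} and N_n = x L_n + x L_{n-1} - (x^2-1) N_{n-1}, writes their coefficient matrix with the operator treated as an ordinary number `O`, and compares the determinant with the cubic on a grid of integer points. The function names are illustrative only.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def coefficient_determinant(x, O):
    """Determinant of the linear system obtained with O F_{n-1} = F_n.

    Columns correspond to I_{n-1}, L_{n-1}, N_{n-1}; rows to the three
    recurrences as assumed in the lead-in.
    """
    return det3([
        [O - 1, 2 * x, -(x * x - 1)],
        [-x * O, O - 1, x],
        [0, -x * (O + 1), O + (x * x - 1)],
    ])

def cubic(x, O):
    """O^3 - (x^4 - 5x^2 + 3) O^2 + (x^4 - 3x^2 + 3) O - 1."""
    return (O ** 3
            - (x ** 4 - 5 * x ** 2 + 3) * O ** 2
            + (x ** 4 - 3 * x ** 2 + 3) * O
            - 1)

def expansions_agree(limit=4):
    """Compare both expressions on the integer grid [-limit, limit]^2."""
    points = range(-limit, limit + 1)
    return all(coefficient_determinant(x, O) == cubic(x, O)
               for x in points for O in points)
```

Both expressions are polynomials of degree at most 4 in x and 3 in O, so agreement on a 9 by 9 integer grid is sufficient to establish that they are identical under the stated assumptions.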

Next consider another type of 1-dimensional hexagonal animal, which
grows in such a manner that the mode of addition of each hexagonal cell is
randomly chosen from the linear (I), helical (H), or zigzag (F) modes
(Randic et al., 1989).

[Figure: the three growth modes, linear (I), helical (H), and zigzag (F).]

The characteristic and matching polynomials of this aperiodic polymer
network can be easily obtained by step-by-step multiplication of the
three kinds of transfer matrices, taken in the same order as that in which
the hexagonal cells are added through the above three modes.

4.5. FACTORIZATION AND KEKULE STRUCTURE

Many, though not all, of the graphs with an even number of vertices (N=2m) can
be spanned by m edges without leaving isolated vertices. The number of
ways of making this type of selection is called the perfect matching number,
and can be expressed in terms of the non-adjacent number as p(G,m). In
graph theory a graph with $p(G,m) \neq 0$ is said to be 1-factorable. A 1-factor
is a set of isolated edges, as the degrees of all its component vertices are
unity, while any set of disjoint cycles is a 2-factor, as all the vertex
degrees are two.

A perfect matching pattern for a molecular graph corresponding to the
carbon atom skeleton of a conjugated unsaturated hydrocarbon, such as butadiene
or benzene, is called a Kekule structure, or a Kekule pattern. The number
of Kekule patterns is simply called the Kekule number, K(G).
Unsaturated hydrocarbon molecules are classified as Kekulean or
non-Kekulean according to whether K(G) is non-zero or zero,
respectively. The K(G) number of an acyclic unsaturated hydrocarbon is
either one (as for butadiene) or zero (see below). On the other hand, the large
family of benzenoid hydrocarbons, or aromatic hydrocarbons, shows a
variety of K(G) numbers, beginning with benzene with K(G)=2.

[Structural formulas: butadiene, H2C=CH-CH=CH2, with K(G)=1; benzene, with K(G)=2.]

It is possible for a tree graph with even N to have zero K(G), as
exemplified below.

Any non-Kekulean graph with zero K(G) is unstable, as inferred from
quantum mechanical arguments. This criterion can be applied to a wide
variety of cyclic hydrocarbon molecules, including aromatic hydrocarbons.
For example, phenanthrene with K(G)=5 is expected to be more stable than
its isomer anthracene with K(G)=4.
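The Kekule numbers quoted above can be reproduced by counting perfect matchings directly. The Python sketch below uses a simple recursive count (always matching the lowest-numbered uncovered vertex) and builds anthracene and phenanthrene by fusing two extra hexagons onto a central one. The vertex numbering and the helper `fused_three_rings` are assumptions made for illustration, not constructions from the original text.

```python
def kekule_number(n_vertices, edges):
    """Number of perfect matchings K(G) of a graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n_vertices)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def count(free):
        if not free:
            return 1  # every vertex is covered
        v = min(free)  # this vertex must be matched to some free neighbour
        rest = free - {v}
        return sum(count(rest - {w}) for w in adj[v] if w in rest)

    return count(frozenset(range(n_vertices)))

def fused_three_rings(second_shared_edge):
    """Central hexagon 0..5 with one ring fused on edge (0,1) and another
    fused on `second_shared_edge`; returns (number of vertices, edge list)."""
    edges = [(i, (i + 1) % 6) for i in range(6)]
    next_label = 6
    for p, q in [(0, 1), second_shared_edge]:
        # four new atoms close a six-membered ring on the shared edge p-q
        chain = [q, next_label, next_label + 1,
                 next_label + 2, next_label + 3, p]
        edges += list(zip(chain, chain[1:]))
        next_label += 4
    return 14, edges

butadiene = (4, [(0, 1), (1, 2), (2, 3)])
benzene = (6, [(i, (i + 1) % 6) for i in range(6)])
anthracene = fused_three_rings((3, 4))    # shared edges opposite: linear
phenanthrene = fused_three_rings((2, 3))  # shared edges one bond apart: angular
star_tree = (4, [(0, 1), (0, 2), (0, 3)])  # tree with even N and K(G) = 0
```

Calling `kekule_number(*anthracene)` and `kekule_number(*phenanthrene)` then gives the two values compared in the text. The brute-force count is exponential and is meant only for small molecular graphs.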

A number of interesting problems concerning the Kekule number itself, and a
variety of interesting methods for enumerating the K(G) numbers, have been
discussed, both from the standpoint of the chemical stability of a
molecule and from that of the mathematical structure of polycyclic graphs,
especially polyhex graphs, on which a number of interesting reviews have
been published (Trinajstic, 1983, Cyvin and Gutman, 1988, 1990).
Counting K(G) is far from trivial. This is particularly the
case with carbon cages, such as buckminsterfullerene C60, for which K(G)
was calculated to be as large as 12500 = $2^2 \cdot 5^5$ (Hosoya, 1986).

Acknowledgment

The author of this chapter is very grateful to Prof. Milan Randic for his
numerous useful suggestions for improving the manuscript.

References

Aihara, J. (1976) "A New Definition of Dewar-type Resonance Energy", J.
Am. Chem. Soc. 98, 2750-2758.
Balaban, A.T. (1966) "Graphs of Multiple 1,2-Shifts in Carbonium Ions
and Related Systems", Rev. Roum. Chim. 11, 1205-1227.
Balaban, A.T. (1969) "Chemical Graphs. VII. Proposed Nomenclature of
Branched Cata-Condensed Benzenoid Polycyclic Hydrocarbons",
Tetrahedron 25, 2949-2956.
Balaban, A.T., Ed. (1976) Chemical Applications of Graph Theory,
Academic Press, London.
Balaban, A.T. (1991) "Enumeration of Isomers", in D. Bonchev and D.H.
Rouvray (eds.), Chemical Graph Theory, Abacus Press/Gordon & Breach,
New York, pp. 177-234.
Balaban, A.T. and Harary, F. (1968) "Chemical Graphs. V. Enumeration
and Proposed Nomenclature of Benzenoid Cata-Condensed Polycyclic
Aromatic Hydrocarbons", Tetrahedron 24, 2505-2516.
Balaban, A.T. and Kerek, F. (1974) "Chemical Graphs. XX. Graphs of
Parallel and/or Subsequent Substitution Reactions", Rev. Roum. Chim.
19, 631-647.
Balaban, A.T., Motoc, I., Bonchev, D., and Mekenyan, O. (1983)
"Topological Indices for Structure-Activity Correlations", Top. Curr.
Chem. 114, 21-55.
Balasubramanian, K. (1985) "Application of Combinatorics and Graph
Theory to Spectroscopy and Quantum Chemistry", Chem. Rev. 85, 599-618.
Balasubramanian, K. (1990) "Recent Chemical Applications of
Computational Combinatorics and Graph Theory", in D.H. Rouvray (ed.),
Computational Chemical Graph Theory, Nova Sci. Publ., Commack, New York,
pp. 67-104.
Bonchev, D. and Rouvray, D., Eds. (1991) Chemical Graph Theory:
Introduction and Fundamentals, Abacus Press/Gordon & Breach, New
York.
Britton, D. and Dunitz, J.D. (1973) "A Complete Catalogue of Polyhedra
with Eight or Fewer Vertices", Acta Cryst. A29, 362-371.
Collatz, V.L. and Sinogowitz, U. (1957) "Spektren endlicher Grafen",
Abh. Math. Semin. Univ. Hamburg 21, 63-77.
Coulson, C.A. and Rushbrooke, G.S. (1940) "The Method of Molecular
Orbitals", Proc. Cambridge Philos. Soc. 36, 193-200.

Cramer, R.D., III (1980) "BC(DEF) Parameters. I. The Intrinsic
Dimensionality of Intermolecular Interactions in the Liquid State", J. Am.
Chem. Soc. 102, 1837-1848.
Cvetkovic, D.M., Doob, M., and Sachs, H. (1980) Spectra of Graphs:
Theory and Application, Academic Press, Berlin.
Cyvin, S.J. and Gutman, I. (1988) Kekule Structures in Benzenoid
Hydrocarbons, Lecture Notes in Chemistry, 46, Springer, Berlin.
Cyvin, S.J. and Gutman, I. (eds.) (1990) Advances in the Theory of
Benzenoid Hydrocarbons, Topics in Current Chemistry, 153, Springer,
Berlin.
Dunitz, J.D. and Prelog, V. (1968) "Ligand Reorganization in the Trigonal
Bipyramid", Angew. Chem. Int. Ed. Engl. 7, 725-726.
Engel, P. (1982) "On the Enumeration of Polyhedra", Discrete Math. 41,
215-218.
Farrell, E.J. (1979) "An Introduction to Matching Polynomial", J. Combin.
Theory B27, 75-86.
Federico, P.J. (1975) "Polyhedra with 4 to 8 Faces", Geom. Dedicata 3,
469-481.
Gao, Y.-D. and Hosoya, H. (1988) "Topological Index and Thermodynamic
Properties. IV. Size Dependency of the Structure-Activity
Correlation of Alkanes", Bull. Chem. Soc. Jpn. 61, 3093-3102.
Graham, R.L., Hoffman, A.J., and Hosoya, H. (1977) "On the Distance
Matrix of a Directed Graph", J. Graph Theory 1, 85-88.
Graham, R.L. and Lovasz, L. (1978) "Distance Matrix Polynomials of
Trees", Adv. Math. 29, 60-88.
Graovac, A., Gutman, I., and Trinajstic, N. (1977) Topological Approach
to the Chemistry of Conjugated Molecules, Springer-Verlag, Berlin.
Gutman, I., Milun, M. and Trinajstic, N. (1977) "Non-Parametric
Resonance Energies of Arbitrary Conjugated Systems", J. Am. Chem. Soc.
99, 1692-1704.
Gutman, I. and Trinajstic, N. (1973) "Graph Theory and Molecular
Orbitals", Topics Curr. Chem. 42, 49-93.
Harary, F. (1969) Graph Theory, Addison-Wesley, Reading, MA.
Harary, F., King, C., Mowshowitz, A., and Read, R.C. (1971) "Cospectral
Graphs and Digraphs", Bull. London Math. Soc. 3, 321-328.
Heilbronner, E. (1953) "Das Komposition-Prinzip: Eine anschauliche
Methode zur elektronen-theoretischen Behandlung nicht oder niedrig
symmetrischen Molekuln im Rahmen der MO-Theorie", Helv. Chim. Acta
36, 170-188.
Herndon, W.C. (1974) "Isospectral Molecules", Tetrahedron Lett. 671-674.
Hosoya, H. (1971) "Topological Index. A Newly Proposed Quantity
Characterizing the Topological Nature of Structural Isomers of Saturated
Hydrocarbons", Bull. Chem. Soc. Jpn. 44, 2332-2339.
Hosoya, H. (1972) "Graphical Enumeration of the Coefficients of the
Secular Polynomials of the Hückel Molecular Orbitals", Theor. Chim. Acta
25, 215-222.
Hosoya, H. (1986) "Matching and Symmetry of Graphs", Comp. & Math.
Appls. 12B, 271-290.
Hosoya, H. (1988) "On Some Counting Polynomials in Chemistry",
Discrete Appl. Math. 19, 239-257.
Hosoya, H., Aida, M., Kumagai, R., and Watanabe, K. (1987) "Analysis
of the π-Electronic Structure of Infinitely Large Networks. I. Some
Remarks on the Characteristic Polynomial and Density of States of Large
Polycyclic Aromatic Hydrocarbons", J. Comput. Chem. 8, 358-366.
Hosoya, H. and Balasubramanian, K. (1989) "Computational Algorithms
for Matching Polynomials of Graphs from the Characteristic Polynomials
of Edge-Weighted Graphs", J. Comput. Chem. 10, 698-710.
Hosoya, H., Murakami, M., and Gotoh, M. (1973) "Distance Polynomial
and Characterization of a Graph", Natural Sci. Rept. Ochanomizu Univ.
24, 27-34.
Hosoya, H., Nagashima, U., and Hyugaji, S. (1993) "Topological Twin
Graphs. Smallest Pair of Isospectral Polyhedral Graphs with Eight
Vertices", J. Chem. Inf. Comput. Sci., in press.
Hosoya, H. and Ohkami, N. (1983) "Operator Technique for Obtaining the
Recursion Formulas of Characteristic and Matching Polynomials as Applied
to Polyhex Graphs", J. Comput. Chem. 4, 585-593.
Hosoya, H., Tsukano, Y., Ohuchi, M., and Nakada, K. (1993)
"2-Dimensional Torus Benzenoids Whose Electronic State Rapidly Converges
to Graphite", in M. Doyama, J. Kihara, M. Tanaka, and R. Yamamoto
(eds.), Computer Aided Innovation of New Materials II, Elsevier Sci.
Publ., Amsterdam, pp. 155-158.
Hückel, E. (1931) "Quantentheoretische Beiträge zum Benzolproblem. I",
Z. Phys. 70, 204-272.
Kawasaki, K., Mizutani, K. and Hosoya, H. (1971) "Tables of
Non-Adjacent Numbers, Characteristic Polynomials and Topological Indices.
II. Mono- and Bicyclic Graphs", Natl. Sci. Rept. Ochanomizu Univ. 22,
181-214.
Kier, L.B. and Hall, L.H. (1976) Molecular Connectivity and Drug
Research, Academic Press, New York.
Kier, L.B. and Hall, L.H. (1986) Molecular Connectivity in
Structure-Activity Analysis, John Wiley, New York.
Knop, J.V., Muller, W.R., Jericevic, Z., and Trinajstic, N. (1981)
"Computer Enumeration and Generation of Trees and Rooted Trees", J.
Chem. Inf. Comput. Sci. 21, 91-99.
Knop, J.V., Muller, W.R., Szymanski, K., and Trinajstic, N. (1985)
Computer Generation of Certain Classes of Molecules, SKTH, Zagreb.
Knop, J.V., Muller, W.R., Szymanski, K., Nikolic, S., and Trinajstic, N.
(1990) "Computer-Oriented Molecular Codes", in D.H. Rouvray (ed.),
Computational Chemical Graph Theory, Nova Sci. Publ., New York.
Kuratowski, K. (1930) "Sur le problème des courbes gauches en
topologie", Fund. Math. 15, 271-283.
Liu, C.H. (1968) Introduction to Combinatorial Mathematics,
McGraw-Hill, New York.
Mizutani, K., Kawasaki, K., and Hosoya, H. (1971) "Tables of
Non-Adjacent Numbers, Characteristic Polynomials and Topological Indices.
I. Tree Graphs", Natl. Sci. Rept. Ochanomizu Univ. 22, 39-58.
Petersen, J. (1898) "Sur le théorème de Tait", Interméd. Math. 5, 225-227.
Pólya, G. (1936) "Kombinatorische Anzahlbestimmungen für Gruppen,
Grafen und chemische Verbindungen", Acta Math. 68, 145-253.
Randic, M. (1975) "On Characterization of Molecular Branching", J. Am.
Chem. Soc. 97, 6609-6615.
Randic, M. (1977) "A Systematic Study of Symmetry Properties of
Graphs. I. Petersen Graph", Croat. Chem. Acta 49, 643-655.
Randic, M., Hosoya, H. and Polansky, O.E. (1989) "On the Construction
of the Matching Polynomial for Unbranched Catacondensed Benzenoids",
J. Comput. Chem. 10, 683-697.
Randic, M. and Wilkins, C.L. (1980) "A Procedure for Characterization of
the Rings of a Molecule", J. Chem. Inf. Comput. Sci. 20, 36-46.
Read, R.C. (1976) in Chemical Application of Graph Theory, Academic
Press, London.
Riordan, J. (1958) An Introduction to Combinatorial Analysis, John Wiley,
New York.

Rossini, F.D. et al. (1943-) Selected Values of Properties of
Hydrocarbons and Related Compounds, American Petroleum Institute Research
Project 44, Texas A&M Research Foundation, Texas.
Rouvray, D. (1990) Computational Chemical Graph Theory, Nova Sci.
Publ., New York.
Rouvray, D. (1991) "The Origins of Chemical Graph Theory", in D.
Bonchev and D.H. Rouvray (eds.), Chemical Graph Theory, Abacus
Press/Gordon & Breach Sci. Publ., New York.
Sachs, H. (1964) "Beziehungen zwischen den in einem Graphen
enthaltenen Kreisen und seinem charakteristischen Polynom", Publ. Math.
(Debrecen) 11, 119-134.
Sinanoglu, O. (1975) "Theory of Chemical Reaction Networks. All
Possible Mechanisms/Synthetic Pathways with Given Number of Reaction
Steps or Species", J. Am. Chem. Soc. 97, 2309-2320.
Sinanoglu, O. and Lee, L.-S. (1978) "Finding All Possible a priori
Mechanisms for a Given Type of Overall Reaction", Theor. Chim. Acta
(Berl.) 48, 287-199.
Sinanoglu, O. and Lee, L.-S. (1979) "Finding the Possible Mechanisms
for a Given Type of Overall Reaction", Theor. Chim. Acta (Berl.) 51, 1-9.
Spialter, L. (1964) "The Atom Connectivity Matrix Characteristic
Polynomial (ACMCP) and Its Physicogeometric (Topological)
Significance", J. Chem. Doc. 4, 269-274.
Tang, A.-C., Kian, Y.-S., Yan, G.-S., and Tai, S.-S. (1986)
Graph-Theoretical Molecular Orbitals, Science Press, Beijing.
Trinajstic, N. (1983) Chemical Graph Theory, 1st Ed., CRC Press, Boca
Raton, FL.
Trinajstic, N. (1992) Chemical Graph Theory, 2nd Ed., CRC Press, Boca
Raton, FL.
Wiener, H. (1947) "Structural Determination of Paraffin Boiling Points",
J. Am. Chem. Soc. 69, 17-20.
THE INTERPLAY BETWEEN GRAPH THEORY AND MOLECULAR
ORBITAL THEORY*

NENAD TRINAJSTIC and ZLATKO MIHALIC
The Rugjer Boskovic Institute
P.O.B. 1016, HR-41001 Zagreb, The Republic of Croatia

ANTE GRAOVAC**
LaBRI, The University of Bordeaux I
FR-33405 Talence Cedex, France

"The value of mathematics to chemistry
requires no acknowledgement."
Jerome Karle (1987)

1. Introduction

Molecular orbital (MO) theory, at various levels of approximation, is nowadays a
standard tool of chemists.¹,² Similarly, chemical graph theory is also becoming a
powerful device in the hands of chemists.³⁻⁵ A graph-theoretical analysis of the MO
theory at the level of the Hückel approximation was carried out by a number of authors,
e.g.⁶⁻¹⁰ A pioneering work on the relationship between graph theory and the MO theory
at the PPP level was recently accomplished by Balasubramanian.¹¹ The complexity of
this analysis is much higher than that in the case of graph-theoretical analysis of Hückel
MO theory. However, this result is very valuable, because it shows that the
graph-theoretical analysis of the MO theory at the higher levels of approximation is also

* Dedicated to the memory of those brave Croatian men, women and children who died defending the
freedom and democracy in the Republic of Croatia against the Serbian and Montenegrin fascists.
** Permanent address: The Rugjer Boskovic Institute, P.O.B. 1016, HR-41001 Zagreb, The Republic of
Croatia.

D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 37-72.
© 1994 Kluwer Academic Publishers.
38 N. TRINAJSTIC ET AL.

possible. Nevertheless, we will consider in the present article only the interplay between
the MO theory at the Hückel level and graph theory. In this way the analysis will be
simple, clear and easily understood by the chemical community at large. Besides, the
HMO theory, in spite of all its shortcomings,¹ is still being used by many a chemist,
e.g.,¹²⁻²¹ as a convenient device for qualitative rationalization of a variety of chemical
phenomena.
This article is structured as follows. The next section contains the exposition of the
fundamentals of graph theory needed later in the text. In the third section the equivalence
of graph spectral theory and Hückel molecular orbital theory is presented. In the fourth
section a brief discussion of the characteristics of the Hückel spectrum is given. The fifth
section contains the presentation of the topological effect on molecular orbitals (the
TEMO concept), whilst in the sixth section the graph-theoretical formulae for the
HOMO-LUMO separation and absolute hardness are considered. The concept of
topological charge stabilization is detailed in the seventh section. The eighth section
contains the graph-theoretical analysis of the localization energy. The article ends with
concluding remarks.

2. Fundamentals of Graph Theory

In this section we give only those graph-theoretical concepts and definitions that
will be utilized in the present article. In doing this we will follow Frank Harary's classic
text "Graph Theory"²² and our own books³,⁵ on chemical graph theory.
A simple graph G is defined as an ordered pair (V(G), E(G)), where V=V(G) is a
nonempty set of elements called vertices (or points) of G and E=E(G) is a set of
unordered pairs of distinct elements of V called edges (or lines). Whenever we mention
the term graph in the text, we will always refer to only a simple graph.
A graph G can be visualized by means of a diagram when the vertices are drawn as small
circles or dots and the edges as lines or curves joining the appropriate circles. Since a
diagram of a graph fully describes the graph, it is customary to refer to the diagram of
the graph as the graph itself. As an example of a simple graph we give in Figure 1 a
diagram of a labelled simple graph G. A graph G is labelled if a certain numbering of
vertices in G is introduced.

Figure 1 A diagram of a labelled simple graph G.

Many results which can be proved for simple graphs may be extended without difficulty
to more general graphs in which two vertices may have more than one edge connecting
them (multiple edges) or edges may join vertices to themselves (loops). An example of a
general labelled graph is given in Figure 2.

Figure 2 A diagram of a labelled general graph G.

A subgraph G' of a graph G is simply a graph all of whose vertices and edges are contained
in G. Subgraphs of G can be generated from G by deleting any number of vertices and/or
edges. By definition a graph is its own subgraph. In Figure 3 we give several subgraphs
of G.

Figure 3 Several subgraphs of a graph G. G2' and G6' are isomorphic subgraphs.

Note that G1' is an acyclic spanning subgraph of G. Any subgraph of a graph G which
contains all its vertices is a spanning subgraph of G. Subgraph G1' can be denoted as G-e
as it is obtained by deletion of the edge e from G. Frequently occurring subgraphs are G3'
and G5', which are denoted as G-r and G-r-s, respectively. Subgraph G-r is obtained by
deletion of the vertex r and its incident edges from G. Subgraph G-r-s is generated by
removal of vertices r and s and their incident edges from G.
Two vertices r and s of a graph G are adjacent (first neighbours) if there is an edge
joining them. Vertex s and edge e in G of Figure 3 are incident as the edge e terminates
at s.
A graph G is planar if it can be drawn in the plane in such a way that no two edges
intersect. Otherwise a graph is nonplanar. Examples of a planar graph and a nonplanar
graph are given in Figure 4.

Figure 4 Examples of a planar graph (G1) and a nonplanar graph (G2).

Graph G1 in Figure 4 is also a bipartite graph. A bipartite graph G is a graph whose
vertex-set V can be partitioned into two nonempty subsets V1 and V2 such that every
edge in G joins V1 and V2. Therefore, the first neighbours of vertices in V1 are contained
in V2 and vice versa. Graph G2 in Figure 4 is a nonbipartite graph. In this case it is not
possible to split the vertex-set V into two subsets V1 and V2 under the condition required
by bipartite graphs. A simple way to detect whether a given graph is bipartite or not is to
inspect its cycles. If all of its cycles are even, the graph is bipartite; otherwise, it is
nonbipartite. Graph G2 in Figure 4 has two five-membered cycles and therefore is
nonbipartite.
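In practice, instead of enumerating all cycles, bipartiteness is usually tested by trying to two-colour the vertices with a breadth-first search, which succeeds exactly when no odd cycle is present. The sketch below is one such test under the assumption that the graph is given as a vertex count and an edge list; the function name `bipartition` is illustrative and not taken from the text.

```python
from collections import deque

def bipartition(n, edges):
    """Return (V1, V2) if the graph on vertices 0..n-1 is bipartite, else None.

    Each connected component is two-coloured by breadth-first search; an edge
    joining two vertices of equal colour signals an odd cycle.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    colour = [None] * n
    for start in range(n):
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if colour[w] is None:
                    colour[w] = 1 - colour[u]
                    queue.append(w)
                elif colour[w] == colour[u]:
                    return None  # odd cycle found
    v1 = [v for v in range(n) if colour[v] == 0]
    v2 = [v for v in range(n) if colour[v] == 1]
    return v1, v2

hexagon = [(i, (i + 1) % 6) for i in range(6)]   # all cycles even
pentagon = [(i, (i + 1) % 5) for i in range(5)]  # contains an odd cycle
```

For the hexagon the routine returns the alternating vertex sets, while for the pentagon it returns `None`.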
Chemical systems may be depicted by chemical graphs using a simple conversion rule:

site ↔ vertex
connection ↔ edge

A special class of chemical graphs are molecular or constitutional graphs in which
vertices correspond to individual atoms and edges to chemical bonds between them. In
order to simplify the handling of molecular graphs, hydrogen-suppressed or skeleton
graphs are often used where molecular skeletons without hydrogen atoms and their
bonds are depicted. The hydrogen-suppressed planar graphs depicting the connectivity of
conjugated centers in a molecule are called Hückel graphs, because they were first used,
albeit unknowingly, by Hückel. In Figure 5 we give a molecular graph G1 and a
hydrogen-suppressed graph G2 corresponding to cubane.

Figure 5 A molecular graph (G1) and a hydrogen-suppressed graph (G2)
representing cubane.

A graph G is 1-factorable if there exists at least one 1-factor of G. A 1-factor of a graph
G is a spanning subgraph of G whose components are only K2. The 1-factors of a Hückel
graph G correspond to the Kekulé structures²³ of its related conjugated system. A graph
G is complete if each vertex of G is connected with all other vertices in G. Then, all
vertices in G necessarily have the same degree. The degree of a vertex is equal to the
number of its incident edges. A graph in which every vertex has the same degree is
called a regular graph. The edge with its two terminal vertices, K2, is a complete graph
of degree one.
The vertex adjacencies in a graph G can be encoded by the (vertex-) adjacency matrix.
The adjacency matrix A = A(G) of a labelled graph G with N vertices is the square NxN
symmetric matrix defined as

$(A)_{rs} = \begin{cases} 1 & \text{if vertices } r \text{ and } s \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases}$    (1)

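As an example, we give below the adjacency matrix A corresponding to the labelled graph
G given in Figure 1.

$A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}$    (2)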

Typically, the adjacency matrix is a sparse matrix: amongst its $N^2$ matrix elements there
are $2|E|$ nonzero entries, where $|E|$ denotes the number of edges in G.
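As an illustration, the adjacency matrix of the labelled graph of Figure 1 can be assembled from an edge list. In the sketch below the eight edges are read off from the matrix in Eq. (2), with the entry linking vertices 7 and 8 taken to be symmetric; the function name and the 1-based labelling convention are assumptions made for this example.

```python
def adjacency_matrix(n, edges):
    """Adjacency matrix (list of rows) for vertices labelled 1..n."""
    a = [[0] * n for _ in range(n)]
    for r, s in edges:
        a[r - 1][s - 1] = 1
        a[s - 1][r - 1] = 1  # undirected graph: the matrix is symmetric
    return a

# Edges of the labelled graph of Figure 1, as read from Eq. (2):
# a six-membered ring 2-3-4-5-6-7 with pendant vertices 1 (on 2) and 8 (on 7).
FIGURE1_EDGES = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (2, 7), (7, 8)]

A = adjacency_matrix(8, FIGURE1_EDGES)
nonzero_entries = sum(sum(row) for row in A)  # equals 2|E|
degrees = [sum(row) for row in A]             # row sums give vertex degrees
```

Here 16 of the 64 matrix elements are nonzero, in line with the 2|E| count, and the row sums reproduce the vertex degrees.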
The characteristic (secular, spectral) polynomial P(G;x) of a graph G is the characteristic
polynomial of its adjacency matrix, A = A(G):

$P(G;x) = \det | x I - A |$    (3)

where I is the NxN unit matrix. The characteristic polynomial can be written as

$P(G;x) = \sum_{n=0}^{N} a_n x^{N-n}$    (4)

The dependence of the coefficients $a_n$ on the structure of G is well understood.³,⁵,²⁴



A graph eigenvalue $x_i$ (i = 1, ..., N) is a zero (root) of its characteristic polynomial. The
collection of all graph eigenvalues {$x_1$, ..., $x_N$} forms the spectrum of the graph.²⁴ The
eigenvalues are real and the interval they span is bounded. According to the Frobenius
theorem,²⁵ the limits of the graph spectrum are determined by the maximum degree
(valency) of vertices in a graph.
There are many methods available for the construction of the characteristic polynomial,
e.g.²⁶ We usually use the Le Verrier-Faddeev-Frame method.²⁷,²⁸
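One common textbook form of the Le Verrier-Faddeev-Frame recursion can be sketched in a few lines of Python. Starting from a_0 = 1 and M_0 = 0, it sets M_k = A M_{k-1} + a_{k-1} I and a_k = -tr(A M_k)/k, which yields the coefficients a_n of det(xI - A) = sum of a_n x^(N-n). This is an illustrative implementation under those assumptions, not necessarily the variant used by the authors, and the function names are hypothetical.

```python
def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def characteristic_polynomial(a):
    """Coefficients [a_0, ..., a_N] of det(xI - A) = sum_n a_n x^(N-n).

    `a` must be a square integer matrix; exact integer arithmetic is used.
    """
    n = len(a)
    coeffs = [1]                       # a_0 = 1
    m = [[0] * n for _ in range(n)]    # M_0 = 0
    for k in range(1, n + 1):
        am = matmul(a, m)
        # M_k = A M_{k-1} + a_{k-1} I
        m = [[am[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        # a_k = -tr(A M_k) / k  (the division is exact for integer matrices)
        trace = sum(a[i][j] * m[j][i] for i in range(n) for j in range(n))
        assert trace % k == 0
        coeffs.append((-trace) // k)
    return coeffs

# Hückel graphs of the allyl skeleton (path on 3 vertices) and of benzene.
allyl = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
benzene = [[1 if abs(r - s) in (1, 5) else 0 for s in range(6)]
           for r in range(6)]
```

For the allyl skeleton the result corresponds to x^3 - 2x, and for benzene to x^6 - 6x^4 + 9x^2 - 4.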

3. Isomorphism of Graph Spectral Theory and Hückel Molecular Orbital Theory

The simplest form of the MO theory is the Hückel molecular orbital (HMO) theory.²⁹
The theoretical framework of the HMO model has been very often presented in the
literature.⁸,³⁰,³¹ Therefore we shall repeat it here only briefly. In Hückel theory, only the
π-electrons are considered explicitly. This is the result of the Hückel assumption that σ
and π electrons in a conjugated system are separated, i.e., that the related wavefunctions
are mutually orthogonal:

$\langle \Psi_\sigma | \Psi_\pi \rangle = 0$    (5)

The individual π MOs of a conjugated molecule are eigenfunctions ($\psi_i$) of the
corresponding Hückel Hamiltonian,

$\hat{H}(\text{Hückel})\,\psi_i = E_i \psi_i$    (i = 1, ..., N)    (6)

$\hat{H}(\text{Hückel})$ is the effective one-electron Hamiltonian. It is defined only to the extent that
we give its matrix elements (see eqs. (14) - (15)). The quantity $E_i$ is the energy
eigenvalue associated with $\psi_i$. The π MOs are expressed in the usual linear combination
of atomic orbitals (LCAO) form,

$\psi_i = \sum_r c_{ir} \phi_r$    (7)

where $c_{ir}$ is the linear expansion coefficient and $\phi_r$ is a 2pz orbital on atom r. The
summation is over all conjugated centers in a molecule. The functions $\psi_i$ are
orthonormal, that is,

$\langle \psi_i | \psi_j \rangle = \delta_{ij}$    (8)

If the occupation number of $\psi_i$ is denoted by $n_i$, the total π-electron energy $E_\pi$ of such an
electronic configuration is given by,

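$E_\pi = \sum_i n_i E_i = \sum_i n_i \frac{\sum_{rs} c_{ir}^{*} H_{rs} c_{is}}{\sum_{rs} c_{ir}^{*} S_{rs} c_{is}}$    (9)

where

$H_{rs} = \langle \phi_r | \hat{H}(\text{Hückel}) | \phi_s \rangle$    (10)

$S_{rs} = \langle \phi_r | \phi_s \rangle$    (11)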

The coefficients $c_{ir}$ are obtained from the requirement that $E_\pi$ should be a minimum,
which by means of the standard variational procedure leads to a set of secular equations,

$\sum_{s=1}^{N} c_{is} (H_{rs} - E_i S_{rs}) = 0$    (i, r = 1, ..., N)    (12)

where N denotes the number of conjugated centers in a molecule.


If this set of secular equations is to have nontrivial solutions, the corresponding secular
determinant must vanish,

$\det | H_{rs} - E_i S_{rs} | = 0$    (i, r = 1, ..., N)    (13)

The zeros of (13) determine the π MO energies: $E_1$, $E_2$, ..., $E_i$, ..., $E_N$. For a given $E_i$, the
set of eqs. (12) determines the coefficients $c_{is}$ of the i-th MO.
The secular determinant (13) can be simplified by using the set of approximations
originally introduced by Bloch³² and utilized by Hückel.²⁹ Since the Hückel
Hamiltonian is not known explicitly, its matrix elements, eqs. (10) and (11), can be
related to empirical quantities. The diagonal elements ($H_{rr}$) are assumed to be constant
for all identical orbitals; they are called Coulomb integrals and are given an empirical
value of α,

$H_{rr} = \langle \phi_r | \hat{H}(\text{Hückel}) | \phi_r \rangle = \alpha$    (14)

The off-diagonal elements ($H_{rs}$) are assumed to be zero unless orbitals $\phi_r$ and $\phi_s$ are
located on bonded atoms. For bonded atoms $H_{rs}$ are assumed to be the same for all
similar bonds. They are called resonance integrals and are given an empirical value of β,

$H_{rs} = \langle \phi_r | \hat{H}(\text{Hückel}) | \phi_s \rangle = \begin{cases} \beta & \text{if atoms } r \text{ and } s \text{ are bonded} \\ 0 & \text{otherwise} \end{cases}$    (15)

Furthermore, zero overlap is assumed between neighbouring atoms, that is,

$S_{rs} = \langle \phi_r | \phi_s \rangle = \delta_{rs}$    (16)

This is the so-called zero-overlap approximation, which, although it looks
oversimplified, is justified empirically through the applications of the Hückel theory over
the past 60 years.
Eq. (13) can also be given in the matrix form,

$\det | H - E_i S | = 0$    (i = 1, ..., N)    (17)

where H is the Hamiltonian matrix and S is the overlap matrix. As a result of the
Bloch-Hückel approximations, the matrices H and S have the following composition,³³

$H = \alpha I + \beta A$    (18)
$S = I$    (19)

where A is the adjacency matrix of the Hückel graph. The matrix [$H - E_i S$] is called the
Hückel matrix. Substitution of H and S by (18) and (19) into (17), and division of each
row of the determinant by β gives,

$\det \left| \frac{E_i - \alpha}{\beta} I - A \right| = 0$    (i = 1, ..., N)    (20)

If the normalized form of Hückel theory is used, i.e., if β is taken as the energy unit and
α as the zero-energy reference point (β = 1 and α = 0), then eq. (20) becomes the Hückel
determinant,

$\det | E_i I - A | = 0$    (i = 1, ..., N)    (21)

The comparison between the Hückel determinant and the secular determinant (3):

$\det | x_i I - A | = 0$    (i = 1, ..., N)    (22)

reveals that $E_i$, representing the energies of individual Hückel MOs, are identical to the
elements of the spectrum of the adjacency matrix of a Hückel graph,

$E_i = x_i$    (i = 1, ..., N)    (23)

Since matrices H and A commute (this can be easily proved),³³,³⁴

$[H, A] = 0$    (24)

they possess the same set of eigenvectors. Therefore, the eigenvectors of the adjacency
matrix are identical to the Hückel MOs. On account of this the HMOs are sometimes
referred to as the topological molecular orbitals.³⁴
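The relation between H and A can be illustrated numerically for the Hückel graph of benzene. The sketch below assumes the Bloch-Hückel form H = αI + βA with purely illustrative parameter values, checks that H and A commute, and uses the known analytic eigenvectors of a cycle graph (components cos(2πkr/N), eigenvalue 2cos(2πk/N)) to show that each is an eigenvector of both matrices with E = α + βx. All names and the numerical values of `ALPHA` and `BETA` are placeholders, not data from the text.

```python
from math import cos, pi

ALPHA = -6.0  # illustrative Coulomb integral (placeholder value)
BETA = -2.5   # illustrative resonance integral (placeholder value)

def cycle_adjacency(n):
    """Adjacency matrix of the cycle graph on n vertices."""
    a = [[0] * n for _ in range(n)]
    for r in range(n):
        a[r][(r + 1) % n] = a[(r + 1) % n][r] = 1
    return a

def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def huckel_matrix(a, alpha=ALPHA, beta=BETA):
    """H = alpha*I + beta*A for a Hückel graph with adjacency matrix a."""
    n = len(a)
    return [[alpha * (r == s) + beta * a[r][s] for s in range(n)]
            for r in range(n)]

def commutator(p, q):
    pq, qp = matmul(p, q), matmul(q, p)
    n = len(p)
    return [[pq[r][s] - qp[r][s] for s in range(n)] for r in range(n)]

def ring_orbitals(n):
    """Pairs (x_k, vector) with x_k = 2cos(2πk/n) for the cycle graph."""
    return [(2 * cos(2 * pi * k / n),
             [cos(2 * pi * k * r / n) for r in range(n)])
            for k in range(n)]

A = cycle_adjacency(6)
H = huckel_matrix(A)
energies = sorted(ALPHA + BETA * x for x, _ in ring_orbitals(6))
```

With a negative β the lowest energy belongs to the largest graph eigenvalue, so the ordering of `energies` is the reverse of the ordering of the eigenvalues x.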
Eq. (18) also reveals that the Hiickel Hamiltonian is the linear function of the adjacency
matrix ,7

H =H(A) (25)

This is due to the particular nature of the Hi.ickel Hamiltonian, with the short-range
forces being dominant in the effective potential.3 4
The analysis in this section results in two important conclusions: (i) The spacing and
general pattern of Hi.ickel eigenvalues are specified by the skeletal atom-atom
connectivity in the conjugated molecule and (ii) The skeletal atom-atom connectivity in
the conjugated molecule, rather than its geometry, determines the form of Hi.ickel
molecular orbitals. Therefore, what chemists customarily call as the Hi.ickel MO theory
is essentially what graph-theoreticians refer to as graph-spectral theory. In fact Hi.icke!
theory and graph-spectral theory are isomorphic theories for the specified class of graphs
(planar graphs with the maximum valency 3).3-9 ,24.35
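
The correspondence between Hückel energies and the graph spectrum can be checked
numerically. The following is a minimal sketch, not taken from the original text: it
diagonalizes the adjacency matrix of the benzene graph with a textbook cyclic Jacobi
rotation scheme (chosen here only to avoid external libraries) and converts the graph
eigenvalues x_i into energies E_i = α + x_i β. The function names and the illustrative
values α = 0, β = -1 are assumptions.

```python
import math

def jacobi_eigenvalues(matrix, tol=1e-18, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                sign = 1.0 if diff >= 0 else -1.0
                # Rotation angle that annihilates a[p][q], with |theta| <= pi/4.
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def cycle_adjacency(n):
    """Adjacency matrix of a ring of n conjugated centers."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        j = (i + 1) % n
        adj[i][j] = adj[j][i] = 1
    return adj

alpha, beta = 0.0, -1.0                       # illustrative units only (assumption)
x = jacobi_eigenvalues(cycle_adjacency(6))    # graph spectrum of benzene
energies = [alpha + xi * beta for xi in x]    # Hückel orbital energies
print([round(xi, 3) for xi in x])
```

Only the connectivity of the ring enters the calculation; no bond lengths or angles are
needed, in line with conclusion (i) above.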

4. Hückel Spectrum

The Hückel spectrum (the collection of Hückel eigenvalues) is given by an ordered
sequence of eigenvalues of the Hückel Hamiltonian matrix,

$$ x_1 \ge \ldots \ge x_n \ge x_{n+1} \ge \ldots \ge x_N \tag{26} $$

The extrema of the spectrum are defined, as already mentioned, by the Frobenius theorem.
Since the maximum degree in Hückel graphs is equal to 3, the interval in which the
Hückel spectrum lies is given by

$$ -3 \le x_i \le +3 \quad (i = 1, \ldots, N) \tag{27} $$

Hückel graphs with spectra containing only integers are very rare. Actually, there are
only five conjugated systems with integer Hückel spectra.36
A more common case is the occurrence of nonisomorphic conjugated molecules with
identical Hückel spectra.37,38 They are named isospectral molecules7 in chemical graph
theory, although the term cospectral is suggested by Harary22 as more appropriate.
The Hückel spectrum consists of three subsets corresponding to bonding (x_i>0, i.e.,
E_i<0), nonbonding (x_i=0, i.e., E_i=0) and antibonding (x_i<0, i.e., E_i>0) energy levels. The
cardinalities of these subsets are denoted by N+, N0 and N-, respectively, and they are
related to the number of conjugated centers (N) as:

$$ N_+ + N_0 + N_- = N \tag{28} $$

These quantities are important for the chemical behaviour of conjugated molecules. The
presence of nonbonding energy levels (and nonbonding MOs) indicates that such a
molecule should have an open-shell ground state (within the HMO model) and be very
reactive.39 The experimental fact40 is that structures possessing nonbonding MOs are
rarely encountered in the chemistry of conjugated molecules.
For our purpose, related to the scope of this book, the especially important eigenvalues are
x_n and x_{n+1} (where n=N/2 if N is even and n=(N+1)/2 if N is odd), which
correspond to the frontier orbitals, HOMO (highest occupied molecular orbital) and
LUMO (lowest unoccupied molecular orbital), respectively. They are chemically the
most important orbitals,41-46 since they are directly involved in chemical reactions. The
following rule appears to be generally valid: the smaller the HOMO-LUMO separation,
the more reactive the molecule is expected to be. Additionally, many molecular
properties, such as the UV-vis spectral characteristics, polarographic half-wave oxidation
and reduction potentials, ionization potentials, electron affinities, the charge-transfer
energy in molecular complexes, etc., are largely dependent on the frontier orbitals and
their energy separation.
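
The bookkeeping described in this section can be illustrated on linear polyenes. The
sketch below is our own illustration and rests on one assumption not quoted in the text:
the closed-form spectrum of a chain of N centers, x_j = 2 cos(jπ/(N+1)), which is a
standard result for path graphs. It counts bonding, nonbonding and antibonding levels
and evaluates the frontier separation x_n - x_{n+1} with n chosen as stated above.

```python
import math

def polyene_spectrum(n_centers):
    """Graph eigenvalues of a linear chain (path graph), in descending order."""
    return [2.0 * math.cos(j * math.pi / (n_centers + 1))
            for j in range(1, n_centers + 1)]

def level_counts(spectrum, eps=1e-9):
    """Return (N+, N0, N-): numbers of bonding, nonbonding, antibonding levels."""
    n_plus = sum(1 for x in spectrum if x > eps)
    n_zero = sum(1 for x in spectrum if abs(x) <= eps)
    n_minus = sum(1 for x in spectrum if x < -eps)
    return n_plus, n_zero, n_minus

def frontier_gap(spectrum):
    """x_n - x_(n+1), with n = N/2 (N even) or n = (N + 1)/2 (N odd)."""
    n_total = len(spectrum)
    n = n_total // 2 if n_total % 2 == 0 else (n_total + 1) // 2
    return spectrum[n - 1] - spectrum[n]

butadiene = polyene_spectrum(4)
allyl = polyene_spectrum(3)
print(level_counts(butadiene), round(frontier_gap(butadiene), 3))
print(level_counts(allyl), round(frontier_gap(allyl), 3))
```

For the allyl chain the middle level is nonbonding, so N0 = 1, which is the situation
associated above with an open-shell ground state within the HMO model.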

5. Topological Effect on Molecular Orbitals

An important advance in the graph-theoretical analysis of molecular orbital theory
is the discovery of the principle named the topological effect on molecular
orbitals (the TEMO principle).47-50 This principle originated from the comparison of
topological spaces51 corresponding to two isomers with different constitutions which are
topologically related in such a way that their respective topological spaces can be divided
into two or more subspaces that are pairwise isomorphic.52 Topologically related
isomers are named topomers, and the TEMO principle considers pairs of topomers
(denoted by S and T) which are obtained by two ways of connecting two given subunits. Some
examples of S and T topomers of benzenoids are given in Figure 6. Note that some
topomers are made up from two equal subunits, whilst others from two unequal subunits.
It should be noted that the connection of two subunits by one bond cannot generate a pair
of topomers. Therefore, the valencies of the subunits, and the number of connecting bonds,
must be at least two. But even two bivalent subunits cannot produce a pair of topomers
unless the connection sites in the subunits are nonequivalent. The construction of
a pair of S and T topomers by connecting two, generally different, bivalent
subunits A and B is illustrated in Figure 7.

Figure 6 Examples of topomeric benzenoids. The broken lines indicate bonds which
need to be ruptured in order to transform one topomer into the other.

Figure 7 A schematized pair of topomers. Panels S and T each show subunit A (connection
sites r, s) joined to subunit B (connection sites u, v).

The characteristic polynomials of the topomers can be expressed in terms of the
characteristic polynomials of their subunits following the ideas of Heilbronner.53 Thus,
the characteristic polynomials of topomers S and T from Figure 7 are given by,

$$ P(S;x) = P(A;x)\,P(B;x) - P(A-r;x)\,P(B-u;x) - P(A-s;x)\,P(B-v;x) + P(A-r-s;x)\,P(B-u-v;x) - 2\Big[\sum P(A-p_{rs};x)\Big]\Big[\sum P(B-p_{uv};x)\Big] \tag{29} $$

$$ P(T;x) = P(A;x)\,P(B;x) - P(A-r;x)\,P(B-v;x) - P(A-s;x)\,P(B-u;x) + P(A-r-s;x)\,P(B-u-v;x) - 2\Big[\sum P(A-p_{rs};x)\Big]\Big[\sum P(B-p_{uv};x)\Big] \tag{30} $$

where p_rs and p_uv denote paths connecting vertices r and s in A, and u and v in B,
respectively. The difference Δ(x) of polynomials (29) and (30) is given by,

$$ \Delta(x) = P(T;x) - P(S;x) = [P(A-r;x) - P(A-s;x)]\,[P(B-u;x) - P(B-v;x)] \tag{31} $$

In the special case when the subunits A and B are isomorphic and the sites u and v
coincide with the sites r and s, as is the case for the pair S1 and T1 in Figure 6, (31)
reduces to

$$ \Delta(x) = [P(A-r;x) - P(A-s;x)]^2 \ge 0 \tag{32} $$

Obviously, in this specific case Δ(x) is non-negative for all x and the following
inequality holds

$$ P(T;x) \ge P(S;x); \quad x \in (-\infty, +\infty) \tag{33} $$
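
Relation (32) can be verified by direct computation for phenanthrene (S) and anthracene
(T). The sketch below is our own illustration under stated assumptions: the
characteristic polynomials are obtained with the Faddeev-LeVerrier recursion (any exact
method would do), the atom labelling is arbitrary, and the subunit A is taken to be a
benzyl-type fragment, so that A-r is the benzene graph and A-s the hexatriene chain.
This decomposition is not spelled out in the text.

```python
def char_poly(adj):
    """Integer coefficients of det(xI - A), highest power first (Faddeev-LeVerrier)."""
    n = len(adj)
    coeffs = [1]
    m = [[0] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = A M_(k-1) + c_(k-1) I
        new = [[sum(adj[i][t] * m[t][j] for t in range(n)) for j in range(n)]
               for i in range(n)]
        for i in range(n):
            new[i][i] += coeffs[-1]
        m = new
        trace = sum(adj[i][t] * m[t][i] for i in range(n) for t in range(n))
        coeffs.append(-trace // k)  # exact division for integer matrices
    return coeffs

def graph(n_atoms, bonds):
    adj = [[0] * n_atoms for _ in range(n_atoms)]
    for r, s in bonds:
        adj[r][s] = adj[s][r] = 1
    return adj

def ring(n):
    return [(i, (i + 1) % n) for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def strip_zeros(p):
    k = 0
    while k < len(p) - 1 and p[k] == 0:
        k += 1
    return p[k:]

# Perimeter of 14 atoms plus two internal bonds closing the three hexagons.
phenanthrene = graph(14, ring(14) + [(4, 13), (5, 10)])   # S
anthracene = graph(14, ring(14) + [(4, 13), (6, 11)])     # T
benzene = graph(6, ring(6))                                # assumed A-r
hexatriene = graph(6, ring(6)[:-1])                        # assumed A-s (open chain)

p_s, p_t = char_poly(phenanthrene), char_poly(anthracene)
delta = [t - s for t, s in zip(p_t, p_s)]                  # P(T;x) - P(S;x)
d = [b - h for b, h in zip(char_poly(benzene), char_poly(hexatriene))]
print(strip_zeros(delta), strip_zeros(delta) == strip_zeros(poly_mul(d, d)))
```

The difference polynomial comes out as a perfect square, (x^4 - 3x^2 + 3)^2, which has no
real zeros, so for this pair P(T;x) lies strictly above P(S;x) for every real x.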

There are several consequences of (33) for those specially constructed S and T isomers.
We present two of them. Firstly, the total π-electron ground state energies, E(S) and
E(T), of those isomers obey,54

E(S) ~ E(T) (34)

with equality if and only if

P(A-r;x) = P(A-s;x) (35)

The second consequence of (33) is an important interlacing theorem. 4 It may be stated
as: The zeros x_i^S and x_i^T (i=1,2,...,N) of the characteristic polynomials P(S;x) and
P(T;x) of those specially constructed S and T isomers are interlaced as follows,

(36)

This theorem is illustrated in Figure 8.


Figure 8 The eigenvalue pattern of phenanthrene (S) and anthracene (T). Eigenvalues x
(in β units) shown in the figure:

S:  ±2.43, ±1.95, ±1.51, ±1.31, ±1.14, ±0.77, ±0.61
T:  ±2.41, ±2.00, ±1.41 (twice), ±1.00 (twice), ±0.41

Besides the Hückel molecular orbital theory, the TEMO principle has been tested against
more sophisticated MO theories and/or experimental data. For example, it has been
tested and confirmed in a series of ab initio calculations at the SCF-HF level.55-58 Similarly,
the examination of hundreds of experimental data on topomeric pairs also confirmed the
validity of the TEMO principle.50 Cases of violations are sometimes noticed when
strong steric effects and/or pronounced non-uniformity of heteroatoms are present in
topomers.
For our purpose in the present article it is relevant how the TEMO principle affects the
HOMO-LUMO separation. The following has been found:49 In the case of specially
constructed topomeric pairs of alternant hydrocarbons (AHs) with N=4n+2 (n≥1)
π-electrons the HOMO-LUMO separation is larger in S than in T. If the topomeric pairs
contain N=4n (n≥1) π-electrons then the opposite is true, i.e., the HOMO-LUMO
separation is larger in T than in S.
For example, the comparison of the well-known topomeric pair with 4n+2 π-electrons,
phenanthrene (S) and anthracene (T), leads to the prediction that the HOMO-LUMO
separation should be larger in phenanthrene and consequently that this molecule should
be more stable than the related isomer. HMO calculations confirm the TEMO prediction
(HOMO-LUMO(S): 1.21β vs HOMO-LUMO(T): 0.82β) and the experimental evidence
also points to phenanthrene as the much more stable compound of the two.59
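
The comparison can be reproduced from the eigenvalues displayed in Figure 8. The short
sketch below is only an illustration: the lists hold the positive halves of the two
spectra as printed (two decimals), and the helper name is ours. Because the printed
values are rounded, the gap obtained for phenanthrene differs from the quoted 1.21β in
the last digit.

```python
# Positive halves of the spectra as printed in Figure 8 (β units, two decimals).
phenanthrene_s = [2.43, 1.95, 1.51, 1.31, 1.14, 0.77, 0.61]
anthracene_t = [2.41, 2.00, 1.41, 1.41, 1.00, 1.00, 0.41]

def homo_lumo_gap(positive_levels):
    """For an alternant hydrocarbon the spectrum is symmetric, so the gap is 2 * x_n."""
    return 2.0 * min(positive_levels)

n_electrons = 2 * len(phenanthrene_s)   # 14 π-electrons, i.e. 4n + 2 with n = 3
gap_s = homo_lumo_gap(phenanthrene_s)
gap_t = homo_lumo_gap(anthracene_t)
print(n_electrons % 4 == 2, round(gap_s, 2), round(gap_t, 2))
```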
The TEMO predictions concerning HOMO-LUMO separation are also supported by the
experimental energies of the p-bands, i.e., bands arising from the π-electron jump from
the HOMO onto the LUMO level. This can be illustrated by considering a topomeric pair
with 4n π-electrons: dibenzo[fg,ij]pentaphene (S) and dibenzo[fg,qr]pentacene (T) (see
Figure 9).

Figure 9 A pair of topomeric alternants with 4n π-electrons (panels S and T).

The prediction that the HOMO-LUMO separation should be larger for T is fully
supported by the experimental energies of the p-bands in their absorption spectra: p(S) =
2.98 eV vs p(T) = 3.01 eV.60
The TEMO principle also affects the ordering of the ionization potentials and other
physical and chemical properties in S and T isomers.

6. The HOMO-LUMO Separation

The importance of the frontier orbitals was already noticed by Hückel in his study of
the alkaline reduction of naphthalene and anthracene.61 Other theoreticians who
observed early the significance of the frontier orbitals for the outcome of a chemical
reaction were Moffitt62 and Walsh.63,64 However, the systematic and detailed study of
the role of the frontier orbitals and the HOMO-LUMO separation in the theory of
chemical reactivity was carried out by Fukui.41-43,45,46,65,66 These concepts have also
been incorporated into the PMO theory of Dewar67 and the Woodward-Hoffmann
rules.68 In this section we will give approximate formulae for the estimation of the
HOMO-LUMO separation based on graph-theoretical quantities. The HOMO-LUMO
separation, denoted by δ, is given by,

$$ \delta = x_n - x_{n+1} \tag{37} $$

In the case of alternant structures, because of spectral symmetry,69 i.e.,

$$ x_{n+1} = -x_n \tag{38} $$

relation (37) becomes,

$$ \delta = 2 x_n \tag{39} $$

In the past, several researchers have been interested in studying the structural factors
which influence the HOMO-LUMO separation.70-75 They have established that the
structural characteristics of the conjugated molecule, such as branching and cyclicity,
influence the HOMO-LUMO separation, though in a very complicated way.76 Later
efforts have produced approximate formulae for the HOMO-LUMO separation.77,78
Gutman and Rouvray77 derived the first approximate formula for the HOMO-LUMO
separation in alternant hydrocarbons (AHs). This formula is given below,

(40)

where N is the number of sites in a graph G depicting the AH, whilst a_{N-2} and a_N are the
last two coefficients of the characteristic polynomial of G. The application of formula
(40) is illustrated in Figure 10.

Figure 10 Application of formula (40) to naphthalene.
(1) Naphthalene graph G
(2) The characteristic polynomial of G:
    P(G;x) = x^10 - 11x^8 + 41x^6 - 65x^4 + 43x^2 - 9
(3) The use of formula (40): N = 10, |a_N| = 9, |a_{N-2}| = 43, giving
    δ = 2 (9/43)^{1/2} [1 + (1/2)(1 - 2/10)^5] = 1.065 β
    δ(exact) = 1.236 β

It could be shown77 that the δ values calculated by the Gutman-Rouvray-Graovac
formula underestimate the correct Hückel values.79 However, in spite of its moderate
success, the Gutman-Rouvray-Graovac formula represents the first significant advance in
this rather difficult area where only very limited results have previously been achieved,
e.g.75
The Gutman-Rouvray-Graovac formula was soon replaced by the Graovac-Gutman
formula,78 which appears to be much more accurate. The Graovac-Gutman formula is as
follows,

$$ \delta = \frac{3N-2}{N} \left( \frac{|a_N|}{|a_{N-2}|} \right)^{1/2} \tag{41} $$

The symbols in (41) have their previous meaning. The use of this formula for predicting
the HOMO-LUMO separation in naphthalene is demonstrated below (where the data are
taken from Figure 10):

$$ \delta(\text{naphthalene}) = \frac{3 \cdot 10 - 2}{10} \left( \frac{9}{43} \right)^{1/2} = 1.281\,\beta \tag{42} $$

This time the agreement between the computed and exact δ-values is much better, the
difference being only 0.045β. Formula (41) has been tested, for example, on a modest set
of benzenoids and has produced a reasonable agreement with exact HOMO-LUMO
values (see Table 1 and Figure 11).
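
The arithmetic of the Graovac-Gutman estimate, δ = [(3N-2)/N] (|a_N|/|a_{N-2}|)^{1/2},
is easily scripted. In the sketch below the naphthalene coefficients are those of Figure
10; the coefficients for benzene, anthracene and phenanthrene are not given in the text
and are assumed here from their standard characteristic polynomials. The function name
is ours.

```python
import math

def graovac_gutman_gap(n_sites, a_n, a_n_minus_2):
    """Estimated HOMO-LUMO separation in β units: (3N - 2)/N * sqrt(|a_N| / |a_(N-2)|)."""
    return (3 * n_sites - 2) / n_sites * math.sqrt(abs(a_n) / abs(a_n_minus_2))

# (N, a_N, a_(N-2)); only the naphthalene entry is taken from Figure 10,
# the other three entries are assumptions (standard polynomial coefficients).
molecules = {
    "benzene": (6, -4, 9),
    "naphthalene": (10, -9, 43),
    "anthracene": (14, -16, 148),
    "phenanthrene": (14, -25, 166),
}
estimates = {name: graovac_gutman_gap(*data) for name, data in molecules.items()}
for name, value in estimates.items():
    print(f"{name:14s} {value:.3f}")
```

Rounded to two decimals, these four estimates agree with the corresponding entries in
the "Estimated" column of Table 1.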

Figure 11 A plot of Hückel vs estimated HOMO-LUMO gap (in β units). Regression line:
HOMO-LUMO = a δ + b, with n = 24, s = 0.118, R = 0.926, F = 132.7, a = 1.228 ± 0.107,
b = -0.149 ± 0.101. Horizontal axis: δ (0.50 to 2.00); vertical axis: HOMO-LUMO
(0.0 to 2.5).

Table 1 The Hückel and estimated HOMO-LUMO separation in benzenoids

                               HOMO-LUMO separation (in β units)
Benzenoid                      Hückel a    Estimated b    Difference
benzene                        2.00        1.78            0.22
naphthalene                    1.24        1.28           -0.04
anthracene                     0.83        0.94           -0.11
phenanthrene                   1.21        1.11            0.10
tetracene                      0.59        0.71           -0.12
benzo[a]anthracene             0.90        0.90            0.06
benzo[c]phenanthrene           1.14        0.97            0.17
chrysene                       1.04        0.96            0.08
triphenylene                   1.37        1.02            0.35
pyrene                         0.89        0.95           -0.06
pentacene                      0.44        0.55           -0.11
benzo[a]tetracene              0.65        0.72           -0.07
dibenzo[a,j]anthracene         0.98        0.850           0.13
dibenzo[a,h]anthracene         0.95        0.848           0.102
benzo[b]chrysene               0.81        0.80            0.01
benzo[g]chrysene               1.06        0.89            0.17
dibenzo[a,c]anthracene         0.998       0.871           0.127
pentahelicene                  1.07        0.879           0.191
benzo[c]chrysene               1.10        0.876           0.224
picene                         1.003       0.871           0.132
dibenzo[b,g]phenanthrene       0.84        0.81            0.03
benzo[e]pyrene                 0.994       0.91            0.084
perylene                       0.69        0.79           -0.1
benzo[a]pyrene                 0.74        0.80           -0.06
a Ref. 79
b Formula (41)

However, regardless of the limited success of these two formulae77,78 it is easily seen that
Hall's pessimistic prediction in 1977,76 that the exact analytical expression for the
HOMO-LUMO separation will probably be very difficult to obtain, remains
unchallenged.
Although both formulae are rather approximate, they allow several interesting
inferences. For example, the following relationships for alternants containing only 4n+2
(n≥1) cycles hold,80

$$ |a_N| = [K(G)]^2 \tag{43} $$

$$ |a_{N-2}| = \sum_{r,s} [K(G-r-s)]^2 \tag{44} $$

where K(G) is the Kekulé-structure count of G, whilst K(G-r-s) is the Kekulé-structure
count of the subgraph G-r-s obtained by deletion of the vertices r and s from G.23 For
example, subgraphs G-r-s of the naphthalene graph with their Kekulé-structure counts
are given in Figure 12.

Figure 12 Subgraphs G-r-s of the naphthalene graph G and the corresponding Kekulé-
structure counts. The numbers given for each nonisomorphic subgraph G-r-s are
its K(G-r-s) and (in brackets) the number of subgraphs isomorphic with G-r-s:
2 (4), 0 (4), 2 (2), 0 (4), 1 (2), 0 (4), 1 (4), 0 (2), 1 (4), 1 (2), 1 (4), 1 (2),
0 (2), 0 (4), 1 (1).

By introducing (43) and (44) into (41), we obtain,

$$ \delta = \frac{3N-2}{N} \left\{ \frac{[K(G)]^2}{\sum_{r,s} [K(G-r-s)]^2} \right\}^{1/2} \tag{45} $$
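
The Kekulé-structure counts entering this expression can be enumerated by brute force
for a small graph. The following sketch is our own illustration (the recursive counter
and the atom labelling are assumptions, not the counting methods cited in the text): it
counts perfect matchings of the naphthalene graph and of all 45 subgraphs G-r-s, and
then evaluates δ = [(3N-2)/N] {[K(G)]^2 / Σ[K(G-r-s)]^2}^{1/2}.

```python
import math
from itertools import combinations

def kekule_count(vertices, neighbours):
    """Number of Kekulé structures (perfect matchings) on the given vertex set."""
    if not vertices:
        return 1
    v = min(vertices)                  # the lowest-numbered atom must be matched
    rest = vertices - {v}
    return sum(kekule_count(rest - {u}, neighbours)
               for u in neighbours[v] if u in rest)

# Naphthalene: perimeter 0-1-...-9-0 plus the central bond 4-9 (arbitrary labelling).
bonds = [(i, (i + 1) % 10) for i in range(10)] + [(4, 9)]
neighbours = {v: set() for v in range(10)}
for r, s in bonds:
    neighbours[r].add(s)
    neighbours[s].add(r)

atoms = frozenset(range(10))
k_g = kekule_count(atoms, neighbours)
sum_sq = sum(kekule_count(atoms - {r, s}, neighbours) ** 2
             for r, s in combinations(range(10), 2))
delta_est = (3 * 10 - 2) / 10 * math.sqrt(k_g ** 2 / sum_sq)
print(k_g, sum_sq, round(delta_est, 3))
```

The two sums reproduce the absolute values of the last two coefficients of the
naphthalene polynomial in Figure 10, so the VB-type counts lead to the same estimate as
the polynomial coefficients.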

In the case of AHs containing 4n+2 and/or 4n (n≥1) cycles, the Kekulé-structure count in
(45) should be replaced by the algebraic-structure count (ASC),81,82

$$ \delta = \frac{3N-2}{N} \left\{ \frac{[ASC(G)]^2}{\sum_{r,s} [ASC(G-r-s)]^2} \right\}^{1/2} \tag{46} $$

There are a number of methods available for counting K(G) and K(G-r-s) or ASC(G) and
ASC(G-r-s) numbers of G.23,83-86 However, formulae (45) and (46) are interesting per
se because they reveal the relationship (although only approximate) between an MO
concept (the HOMO-LUMO separation) and a VB concept (the Kekulé-structure
count or the algebraic-structure count). In other words, they disclose that the HOMO-LUMO
gap in AHs is also calculable by use of VB quantities.
The relationship between the HOMO-LUMO separation and graph-theoretical quantities
also allows a similar analysis of the concept of absolute hardness. The absolute hardness η
of a molecule is defined as,87

$$ \eta = \frac{1}{2} \left( \frac{\partial^2 E}{\partial N^2} \right)_v \tag{47} $$

where E is the electronic energy of the molecule, N is the number of electrons in the
molecule and v is the external potential due to the nuclei. A finite approximation to eq.
(47), within the validity of Koopmans' theorem, is given by,

$$ \eta = \frac{I - A}{2} \tag{48} $$

where the symbols I and A stand for the ionization potential and electron affinity,
respectively. Note that formula (48) is independent of any molecular model. If MO
theory is used,88 the absolute hardness can be defined in terms of the frontier orbitals,

$$ \eta = \frac{E_{LUMO} - E_{HOMO}}{2} \tag{49} $$

Since, for alternants,

$$ E_{LUMO} = -E_{HOMO} \tag{50} $$

eq. (49) reduces to,

$$ \eta = -E_{HOMO} \tag{51} $$

E_HOMO and E_LUMO may be computed by means of any MO model such as the Hartree-
Fock or Hückel MO model.
The absolute hardness has been used by Parr et al. as a measure of aromaticity.89-91
However, because of its definition in terms of the frontier orbitals, the absolute hardness
is a theoretical quantity which may be regarded as a unifying criterion for both the
aromatic (thermodynamic) stability and reactivity (kinetic stability) of a molecule. In this
sense the harder the polycyclic molecule, the more aromatic and less reactive it is.

Pearson's statement92: "There seems to be a rule of nature that molecules arrange
themselves so as to be as hard as possible. A large HOMO-LUMO gap increases
stability", was named by Zhou and Parr90 the principle of maximum hardness. The
principle of maximum hardness appears to be a principle of general validity.93
The above formulae may also be reformulated in terms of graph-theoretical quantities.
The absolute hardness for alternants is, thus, approximated by,

$$ \eta = \frac{3N-2}{2N} \left( \frac{|a_N|}{|a_{N-2}|} \right)^{1/2} \tag{52} $$

The symbols in eq. (52) have their previous meaning. This formula gives, for example,
for the Hückel absolute hardness of naphthalene,

$$ \eta = \frac{3 \cdot 10 - 2}{2 \cdot 10} \left( \frac{9}{43} \right)^{1/2} = 0.640\,(-\beta) \tag{53} $$

The exact Hückel value of η for naphthalene is 0.618 (-β).
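
The two naphthalene numbers can be reproduced in a few lines. This is a hedged sketch
with assumptions stated explicitly: the hardness is taken as half the frontier-orbital
separation (so that η = -E_HOMO for alternants), energies are expressed in units of -β
with α = 0, and the HOMO eigenvalue of naphthalene is taken as (√5 - 1)/2, which matches
the value 0.618 quoted above. The function names are ours.

```python
import math

def hardness_estimate(n_sites, a_n, a_n_minus_2):
    """Approximate hardness of an alternant in units of -β: half the estimated gap."""
    return (3 * n_sites - 2) / (2 * n_sites) * math.sqrt(abs(a_n) / abs(a_n_minus_2))

def hardness_from_frontier(e_homo, e_lumo):
    """Hardness from frontier orbital energies: (E_LUMO - E_HOMO) / 2."""
    return (e_lumo - e_homo) / 2.0

eta_est = hardness_estimate(10, -9, 43)               # naphthalene data of Figure 10
x_homo = (math.sqrt(5.0) - 1.0) / 2.0                 # assumed exact HOMO eigenvalue
eta_exact = hardness_from_frontier(-x_homo, x_homo)   # energies in units of -β
print(round(eta_est, 3), round(eta_exact, 3))
```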


Eq. (52) can be, of course, rewritten in terms of Kekulé-structure counts for AHs
containing 4n+2 (n≥1) cycles as

$$ \eta = \frac{3N-2}{2N} \left\{ \frac{[K(G)]^2}{\sum_{r,s} [K(G-r-s)]^2} \right\}^{1/2} \tag{54} $$

or in terms of the algebraic-structure counts for AHs containing 4n+2 and 4n (n≥1)
cycles as

$$ \eta = \frac{3N-2}{2N} \left\{ \frac{[ASC(G)]^2}{\sum_{r,s} [ASC(G-r-s)]^2} \right\}^{1/2} \tag{55} $$

7. Topological Charge Stabilization

The concept of topological charge stabilization has been introduced by Gimarc.94 It is
based on the observation that the pattern of charge densities in the homonuclear system,
called the uniform reference frame, is determined by (i) the connectivity of atoms within
the molecule and (ii) the number of electrons that occupy the molecular orbital system.
The idea that the charge distribution over the molecular framework might be determined
by its internal connectivity goes back at least to the fifties.95

The rule of topological charge stabilization states that the heteroatoms prefer to be
placed at those positions where their electronegativities match the charge distribution as
determined by the uniform reference frame. Charge densities may be computed by
Hückel theory, extended Hückel theory or ab initio MO theory. This rule has been
applied successfully to a number of inorganic and organic, planar and non-planar
systems.94,96-104
The validity of the rule of topological charge stabilization goes beyond Hückel theory and
indeed even the molecular orbital approximation, as the following argument based on
first-order perturbation theory shows.97 (This argument was originally given by Parr by
private communication to Gimarc.) Consider the uniform reference frame as the
unperturbed system with Hamiltonian H⁰, wavefunction Ψ⁰ and total energy E⁰
connected by the Schrödinger wave equation,

$$ H^0 \Psi^0 = E^0 \Psi^0 \tag{56} $$

If we introduce a heteroatom in the system, keeping the molecular structure and the
number of electrons fixed, it would represent a perturbation. The perturbational
Hamiltonian H' may be expressed as a sum of changes in the coulombic nuclear-electron
attraction terms due to alterations in nuclear charges ΔZ_a which result from substitution
of a heteroatom at position a,

$$ H' = -\sum_{a,i} \Delta Z_a / r_{ia} \tag{57} $$

where a and i are used as labels for the nuclei and the electrons, respectively. For the
perturbed system described by,

$$ H = H^0 + H' \tag{58} $$

the total energy E can be computed as the sum of the unperturbed (zero-order) energy
and the higher order corrections,

$$ E = E^{(0)} + E^{(1)} + E^{(2)} + \ldots \tag{59} $$

The first-order perturbation correction is given as,

$$ E^{(1)} = \langle \Psi^0 | H' | \Psi^0 \rangle \tag{60} $$

Since the operator H' involves only multiplication, the Ψ⁰ factors can be joined together
within the integral to give the unperturbed electron density,

(61)

Then,

(62)

Therefore, to achieve maximum stability (lowering of the energy) through the correction
E(1), the heteroatoms with the largest ΔZ should occupy those positions in the molecule
where the electron density ρ⁰ is already largest in the unperturbed or reference frame. For
qualitative considerations it is convenient to take into account valence electrons only and
to replace ΔZ by changes in effective nuclear charge, or even more simply, to employ
electronegativity as a rough measure of these changes.
As already mentioned, the rule of topological charge stabilization has been applied to a wide
selection of systems from both inorganic and organic chemistry. Here we will give only
a few illustrative examples. The reader who is interested in more examples is referred to
the papers on the subject by Gimarc and co-workers.94,96-103

7.1 PLANAR CONJUGATED SYSTEMS

Let us consider the series of isomeric thiophthenes (shown in Figure 13) which are
isoelectronic with the pentalene dianion, i.e., the uniform reference frame with 10
π-electrons in the eight-orbital system. In Hückel MO theory the charge density q_r at atom
r is given by,

$$ q_r = \sum_i n_i c_{ir}^2 \tag{63} $$

where c_ir is the coefficient of atomic orbital r in the molecular orbital Φ_i and n_i is the
number of electrons (2, 1 or 0) in orbital Φ_i. The charge density distribution of the
pentalene dianion is also given in Figure 13. Incidentally, the pentalene dianion has been
prepared.105

Figure 13 π-electron charge distribution in the pentalene dianion (1) and diagrams of
four isomeric thiophthenes (2-5). Charge densities shown for 1: 1.20, 1.32 and 1.17.
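
The charge densities of the reference frame can be recomputed from the connectivity
alone, using q_r = Σ n_i c_ir² summed over the occupied Hückel orbitals. The sketch
below is our own illustration: the Jacobi diagonalization routine, the atom labelling
(perimeter 0 to 7 with bridgeheads 3 and 7) and the function names are assumptions, and
the ten π-electrons are placed pairwise in the five most bonding orbitals.

```python
import math

def jacobi_eigensystem(matrix, tol=1e-18, max_sweeps=50):
    """Eigenvalues (descending) and eigenvectors of a real symmetric matrix."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                sign = 1.0 if diff >= 0 else -1.0
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p, q of A and of V
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
                for k in range(n):  # rotate rows p, q of A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    order = sorted(range(n), key=lambda i: a[i][i], reverse=True)
    return [a[i][i] for i in order], [[v[k][i] for k in range(n)] for i in order]

# Pentalene skeleton: perimeter 0-1-...-7-0 plus the bond joining bridgeheads 3 and 7.
bonds = [(i, (i + 1) % 8) for i in range(8)] + [(3, 7)]
adj = [[0] * 8 for _ in range(8)]
for r, s in bonds:
    adj[r][s] = adj[s][r] = 1

levels, orbitals = jacobi_eigensystem(adj)
occupation = [2, 2, 2, 2, 2, 0, 0, 0]      # dianion: 10 π-electrons
charges = [sum(n_i * orb[r] ** 2 for n_i, orb in zip(occupation, orbitals))
           for r in range(8)]
print([round(q, 2) for q in charges])
```

In this labelling atoms 0, 2, 4 and 6 (the positions adjacent to the bridgeheads) carry
the largest charge, atoms 1 and 5 the smallest, and the bridgeheads an intermediate
value, close to the three numbers shown in Figure 13.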

In the series of isomeric thiophthenes 2-5, using the rule of topological charge
stabilization with reference to 1, one concludes with regard to their relative stabilities that
those of 1,4-thiophthene (2) and 1,6-thiophthene (4) should be comparable, but 1,5-
thiophthene (3) and 2,5-thiophthene (5) should be successively less stable. This result
agrees with experimental facts. All four isomeric thiophthenes are known,106-109
although 2,5-thiophthene only as the tetraphenyl-substituted derivative.109 Calculated
resonance energies by various theoretical models (see Table 2) support completely the
stability predictions based on the rule of topological charge stabilization.

Table 2 Calculated resonance energies of isomeric thiophthenes

Molecule            DRE a (kcal/mol)    REPE b (β)    TRE c (β)

1,4-thiophthene 11.3 0.022 0.031


1,5-thiophthene 5.9 0.015 0.026
1,6-thiophthene 10.5 0.024 0.031
2,5-thiophthene -33.9 0.004

a M.J.S. Dewar and N. Trinajstic, J. Amer. Chem. Soc. 92, 1453 (1970).
b B.A. Hess, Jr., L.J. Schaad and C.W. Holyoke, Jr., Tetrahedron 28, 3657
(1972); Tetrahedron 31, 295 (1975).
c I. Gutman, M. Milun and N. Trinajstic, J. Amer. Chem. Soc. 99, 1692 (1977);
M. Milun and N. Trinajstic, Croat. Chem. Acta 49, 107 (1977).

7.2 NON-PLANAR SYSTEMS

The rule of topological charge stabilization operates also in non-planar systems. The
charge density distribution in the uniform reference frames for non-planar systems is
calculated by means of the extended Hückel MO theory.110 The extended Hückel MO
theory is known to yield exaggerated charges, but they appear to be adequate for the
purpose, which requires only a qualitative pattern of charge density distribution. In some
test cases it has been found111 that the charge iteration of the extended Hückel MO
theory112 gives charges which are more realistic but with the same pattern as those
obtained by the non-iterative procedure.
The uniform reference frames for non-planar systems are often hypothetical and have
very large total charges Q. The charges q_r on individual atoms r must sum to Q,

$$ \sum_r q_r = Q \tag{64} $$

Since one is interested only in charge differences, the normalized charges q'_r are
introduced,97

$$ q'_r = q_r - Q/N \tag{65} $$

where N is the number of centers of the uniform reference frame. The normalized
charges q'_r sum to zero.
A beautiful example96 to illustrate the operational power of the rule of topological charge
stabilization is provided by molecules such as P4S3, As4S3 and PS3As3, which are
isostructural and isoelectronic with the heptaphosphorus trianion,113 P7(3-). P7(3-) is a
cage or end-capped triangular prism, with a unique apical atom, three equivalent bridging
atoms and three basal atoms in the equilateral triangle.

The anion P7(3-) serves as the uniform reference frame for P4S3, As4S3 and PS3As3. In
Figure 15 are given the Mulliken net atomic populations, or more simply, atomic
charges, for P7(3-) calculated from extended Hückel MO wavefunctions. In the same figure
we also give the normalized charges of P7(3-). Since the uniform reference frame is no
longer composed of real atoms, its graph-theoretical representation is given in the figure.

Figure 15 The Mulliken net atomic populations (6) and the normalized
charges (7) of P7(3-). Values shown for 6: -0.720 (bridging) and -0.194 (basal);
for 7: +0.170 (apical), -0.291 (bridging) and +0.235 (basal).

The normalized charges (see structure 7 in Figure 15) indicate that the bridging positions
are negative compared to the apical and basal positions. Consequently, the more
electronegative sulfurs should occupy the bridging sites whilst the less electronegative
phosphorus or arsenic atoms should enter the apex or basal positions. The structures of
P4S3,114,115 As4S3116 and PS3As3117 are in agreement with those predicted above (see
Figure 16).
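
The normalization q'_r = q_r - Q/N is a one-line operation, sketched below for the
P7(3-) frame. The bridging and basal Mulliken charges are those of Figure 15; the apical
Mulliken charge is not legible in the figure and is inferred here from the requirement
that the seven charges sum to Q = -3, which is an assumption of this illustration.

```python
def normalize_charges(charges):
    """Normalized charges q'_r = q_r - Q/N, where Q is the total charge."""
    total = sum(charges)
    n_centers = len(charges)
    return [q - total / n_centers for q in charges]

# Mulliken charges of the P7(3-) frame (Figure 15); the apical value is inferred
# from the total charge Q = -3 and is not read from the source.
bridging, basal = -0.720, -0.194
apical = -3.0 - 3 * bridging - 3 * basal
charges = [apical] + [bridging] * 3 + [basal] * 3

normalized = normalize_charges(charges)
print([round(q, 3) for q in normalized])
```

The three bridging sites come out negative and the apical and basal sites positive,
which is the pattern used above to place the sulfur atoms.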

Figure 16 The structures of P4S3 (8), As4S3 (9) and PS3As3 (10).

7.3 LINEAR AND QUASI-LINEAR SYSTEMS

The pattern of the charge density distribution in a chain varies considerably with its
geometry. Therefore, in the applications of the rule of topological charge stabilization to
linear systems one must consider various geometries of the uniform reference frame.
Let us consider a symmetric five-atom chain with 24 valence electrons and two of its
geometries: linear and bent. The normalized extended Hückel charge densities for these
two uniform reference frames are given in Figure 17.

Figure 17 The normalized extended Hückel charge densities for the 24-electron
five-atom uniform reference, linear (11) and bent (12), frames. Values shown for 11:
-0.90 (terminal), +0.83 and +0.15 (central); for 12: -0.88 (terminal), +0.89 and
-0.02 (central).

The linear reference frame has negative charges at both ends of the chain. Such a charge
distribution directs the electronegative atoms to terminal sites. The reader should note that
the change from the linear to the bent reference frame increases the electron density at the
central position, thus providing increased stability for more electronegative atoms. In
agreement with the above, C3O2 (O=C=C=C=O) is almost linear with the electronegative
oxygen atoms at the ends of the chain,118 whilst B2O3 and (CN)2S are bent (V-shaped)
molecules119,120 (see Figure 18).

Figure 18 The shapes of boron oxide B2O3 (13) and sulfur dicyanide (CN)2S (14). Angles
indicated in the figure: about 132° and about 100°.

7.4 UNIFORM REFERENCE FRAMES WITH UNIFORM CHARGE DENSITIES

In many cases all positions in the uniform reference frame are equivalent. Then, because
of symmetry, all these positions will have the same charge density and, hence, the rule of
topological charge stabilization is not applicable. For example, π charge densities in
alternant hydrocarbons are all equal to unity.8
This difficulty may be overcome by introducing a heteroatom of larger (smaller)
electronegativity than carbon into one of the positions. So, for example, in the case of
benzene as a uniform reference frame, the introduction of a nitrogen atom gives negative
charges at positions 3 and 5 in pyridine. This is shown in Figure 19.

Figure 19 π charges in benzene and pyridine. The Hückel parameters used for the
nitrogen atom and CN bonds are taken to be ON = 1.00 and βCN = 1.50. Charges shown for
pyridine: 1.249 (N), 0.895, 1.072 and 0.946.

The structures 15 - 17 (see Figure 20) are in agreement with the pattern of charge
distribution in pyridine.

Figure 20 Molecules isoelectronic with benzene and pyridine (structures 15-17).

However, the rule of topological charge stabilization suggests a decrease in stability on
going from 15 to 17, following the trend of increasing localization of charge on more
electronegative atoms. The electronegativity order (based on either the Pauling or
Mulliken scales) of atoms that appear in the skeletons of molecules shown in Figure 20
is as follows: B<C<N<O.

To conclude this section we point out that the rule of topological charge stabilization is
easy to apply and could be used to guide preparative efforts and to expose problems that
are worth further study by both experiment and theory. This rule can serve as a powerful
unifying principle for the organization of chemical information. The impact of the rule,
especially in the field of inorganic chemistry, could be significant.

8. Localization Energy

The reactivity index called the localization energy121 was introduced by
Wheland122 for predicting aromatic substitution reactions. Wheland postulated that the
transition state (Wheland intermediate state)123 in an aromatic substitution reaction
resembles a σ-complex. The σ-complex is not the transition state for the reaction, but
it is commonly assumed to be close to the transition state on the potential-energy surface.
How close it is depends upon the type and conditions of the
reaction. The structure of the Wheland intermediate is depicted in Figure 21.

Figure 21 The structure of the Wheland intermediate state. X is the attacking atom or
group, whilst S is the part of the original conjugated molecule which is still
able to support π-electrons and is referred to as the residual molecule.123

The Wheland intermediate mayor may not have a well-defined structure. 124 It is
generally considered as a loose addition complex in which the attacking atom or group X
and the departing hydrogen atom H are on opposite sides of the molecular plane. In this
complex the attacked carbon atom r is in an approximately tetrahedral configuration and
no longer contributes to conjugation within the aromatic ring. Hence, the residual
molecule S has one conjugated center less than the original molecule. Consequently, the
extent of the molecular network available to the 1t-electrons is smaller in the residual
molecule than in the parent molecule.
The localization energy for aromatic substitution, AE1t,

AE1t = En(molecule) - E1t(S) (66)

is the energy needed to form the Wheland intermediate state. The lower this energy, the
lower will be the energy of the transition state, and the aromatic substitution reaction will
64 N. TRINAJSTIC ET AL.

proceed with a smaller energy loss. Therefore, the lower the $\Delta E_\pi$ value, the lower the
barrier for substitution at a given position. The most preferred position of attack,
amongst the available positions for substitution, would be the one in which the
localization energy is the lowest. $E_\pi$(molecule) and $E_\pi(S)$ may be calculated using MO
theory at various levels of approximation. Ordinarily, Hückel theory or SCF π-MO
theories are used.
There are three possible kinds of localization energy, depending upon whether the
attacking atom or group is positively charged, neutral or negatively charged. In other
words, during the substitution reaction either two electrons (electrophilic substitution),
one electron (radical substitution) or no electron (nucleophilic substitution) is localized
on atom r. In the literature the corresponding localization energies are usually denoted124
by $L_r^+$, $L_r^{\cdot}$ and $L_r^-$, respectively.
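As an illustration of how (66) can be evaluated at the simple Hückel level, the following sketch computes the localization energy in units of β by deleting vertex r from the adjacency matrix and filling the orbitals of the residual molecule with two, one or no electrons fewer. The use of Python with NumPy, the function names, the choice α = 0 and the atom numbering of naphthalene are all assumptions of this example and are not taken from the original work.

```python
import numpy as np

def adjacency(n, edges):
    """Adjacency matrix of a Hückel graph with n vertices."""
    a = np.zeros((n, n))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a

def pi_energy(adj, n_electrons):
    """Total pi-electron energy in units of beta (alpha taken as zero)."""
    # The largest adjacency eigenvalues belong to the most bonding orbitals.
    x = np.sort(np.linalg.eigvalsh(adj))[::-1]
    occupation = np.zeros(len(x))
    occupation[: n_electrons // 2] = 2.0
    if n_electrons % 2:
        occupation[n_electrons // 2] = 1.0
    return float(occupation @ x)

def localization_energy(adj, r, n_localized):
    """E_pi(G) - E_pi(G-r), eq. (66), with n_localized electrons held on atom r."""
    n = adj.shape[0]                      # neutral hydrocarbon: one pi electron per centre
    keep = [i for i in range(n) if i != r]
    residual = adj[np.ix_(keep, keep)]    # delete vertex r and its incident edges
    return pi_energy(adj, n) - pi_energy(residual, n - n_localized)

benzene = adjacency(6, [(i, (i + 1) % 6) for i in range(6)])
# Naphthalene (illustrative numbering): 0..3 are C1..C4, 4 is C4a,
# 5..8 are C5..C8 and 9 is C8a; (4, 9) is the central bond.
naphthalene = adjacency(10, [(i, (i + 1) % 10) for i in range(10)] + [(4, 9)])

for label, k in (("electrophilic", 2), ("radical", 1), ("nucleophilic", 0)):
    print(label, round(localization_energy(benzene, 0, k), 4))
print("naphthalene alpha:", round(localization_energy(naphthalene, 0, 2), 4))
print("naphthalene beta: ", round(localization_energy(naphthalene, 1, 2), 4))
```

For an alternant hydrocarbon the three localization energies coincide at this level of theory, because the electrons that distinguish the three cases occupy a non-bonding orbital of the residual molecule.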
It should be noted that the approach based on the localization energy is rather qualitative
and should be used cautiously. Nevertheless, it is still occasionally used,125 because the
users, who are usually experimental chemists with little time at their disposal for exact
theoretical computations, need a method, even if the method is approximate, as an aid in
planning and interpreting experiments.
In the framework of graph theory the structure of the Wheland intermediate may be
depicted as a subgraph G' which is obtained by deletion of the appropriate vertex and
incident edges from the Hückel graph G representing a given conjugated system. In
Figure 22 we give graphs corresponding to the transition states for the substitution upon
the tetracene substratum.

α-graph      β-graph      peri-graph

Figure 22 Graphs depicting residual molecules corresponding to transition states
involving tetracene. Possible substitution positions are denoted by α, β
and peri.

Formula (66) may be rewritten in terms of the graph-theoretical quantities as,

$$L_r = E_\pi(G) - E_\pi(G-r) \qquad (67)$$


where G is the molecular graph depicting a system with only 4n + 2 (n ≥ 1) cycles and G-r
is a subgraph obtained by deletion of the vertex r and the edges incident to it from G. In
this discussion we will use the following expression for $E_\pi(G)$,126

$$E_\pi(G) = aN + bM + c \ln K(G) \qquad (68)$$

where N and M are the number of vertices and edges, respectively, in G, whereas K(G) is
the number of Kekulé structures of G. Since the structure G-r contains N-1 vertices and
M-2 edges, the approximate formula for $E_\pi$ of G-r is given by,127

$$E_\pi(G-r) = a(N-1) + b(M-2) + c \ln[SC(G-r)] \qquad (69)$$

where SC is the structure count of G-r.


As the structure G-r contains in its spectrum a zero element, the structure count (SC) of
G-r may be computed by means of the following formula,

$$(SC)^2 = \prod_i{}' \, |x_i| \qquad (70)$$

where $\prod'$ denotes multiplication over the non-zero elements $x_i$ of the spectrum of G-r.
Substitution of (68) and (69) into (67) produces the graph-theoretical formula for the
localization energy,

$$L_r = (a + 2b) + c \ln \{K(G)/[SC(G-r)]\} \qquad (71)$$

Since a, b and c are constants, the localization energy at position r is determined by the
following graph-theoretical quantity,

$$I_r = \frac{K(G)}{SC(G-r)} \qquad (72)$$

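This formula may also be rewritten in terms of the last coefficients ($a_N(G)$ and $a_{N-1}(G-r)$)
of the characteristic polynomials of G and G-r,

$$I_r = \left[\frac{a_N(G)}{a_{N-1}(G-r)}\right]^{1/2} \qquad (73)$$

where,

$$a_N(G) = [K(G)]^2 \qquad (74)$$

$$a_{N-1}(G-r) = [SC(G-r)]^2 \qquad (75)$$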
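The quantities entering (70) to (75) can be checked numerically. The sketch below is a minimal illustration, assuming Python with NumPy and hypothetical helper names; it applies (70) literally to the spectrum of the graph and compares the result with the last coefficients of the characteristic polynomial. Signs of the coefficients are dropped by taking absolute values, which is an assumption of this example.

```python
import numpy as np

def adjacency(n, edges):
    """Adjacency matrix of a Hückel graph with n vertices."""
    a = np.zeros((n, n))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a

def delete_vertex(adj, r):
    """Subgraph G-r: remove vertex r and the edges incident to it."""
    keep = [i for i in range(adj.shape[0]) if i != r]
    return adj[np.ix_(keep, keep)]

def structure_count(adj, tol=1e-8):
    """Square root of the product of |x_i| over the non-zero eigenvalues, eq. (70)."""
    x = np.abs(np.linalg.eigvalsh(adj))
    return float(np.sqrt(np.prod(x[x > tol])))

def reactivity_index(adj, r):
    """I_r = K(G) / SC(G-r), eq. (72); K(G) is obtained from the spectrum of G."""
    return structure_count(adj) / structure_count(delete_vertex(adj, r))

benzene = adjacency(6, [(i, (i + 1) % 6) for i in range(6)])
# Naphthalene (illustrative numbering): index 0 is an alpha position,
# index 1 a beta position, and (4, 9) is the central bond.
naphthalene = adjacency(10, [(i, (i + 1) % 10) for i in range(10)] + [(4, 9)])

# Last coefficients of the characteristic polynomials, eqs. (74) and (75).
a_n = abs(np.poly(benzene)[-1])                               # |a_N(G)|
a_n_minus_1 = abs(np.poly(delete_vertex(benzene, 0))[-2])     # |a_{N-1}(G-r)|

print("K(benzene)     =", round(structure_count(benzene), 6))
print("K(naphthalene) =", round(structure_count(naphthalene), 6))
print("I_alpha, I_beta =", reactivity_index(naphthalene, 0), reactivity_index(naphthalene, 1))
```

Applied in this literal form, (70) need not return an integer for an odd residual graph: for benzene with one vertex deleted the product of the non-zero eigenvalues is 3.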
In the case of alternant hydrocarbons with 4n+2 and/or 4n (n ≥ 1) sites, K(G) and SC(G-r)
in the above equations should be replaced by ASC(G) and ASC(G-r), respectively.
The relative reactivity of two positions r and s is resolved from the difference in the
corresponding localization energies,

$$\Delta L_{rs} = L_r - L_s \qquad (76)$$

Substitution of (71) into (76) and the use of (72) leads to the expression,

$$\Delta L_{rs} = c \ln (I_r/I_s) \qquad (77)$$


Let us use, at this point, the assumption of Dewar et al.,128,129 that $\Delta L_{rs}$ can be identified
with the difference in the free energy of activation,

$$k_r/k_s = \exp(-\Delta L_{rs}/RT) \qquad (78)$$

where $k_r$ and $k_s$ are the rate constants for an aromatic substitution reaction on atoms r
and s, T is the absolute temperature (in K) and R (8.314510 J K^-1 mol^-1) the ideal gas
constant. By substituting (77) into (78) we obtain the following expression,

$$\ln (k_r/k_s) = p \ln (I_s/I_r) \qquad (79)$$

where,

$$p = c/RT \qquad (80)$$

p is a positive constant (for a given temperature) which does not depend on the type of
substitution reaction. The form of (79) strikingly resembles the Hammett equation.130 To
obtain the Hammett equation it is necessary to make the additional assumption that a linear
free-energy relationship holds such that different equations reflect different, but constant,
degrees of approach to the fully localized intermediate.
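The chain of relations (72), (77), (78) and (80) can be followed numerically with a few lines of code. The sketch below assumes Python, a hypothetical function name and an arbitrary illustrative value for the constant c (expressed in J/mol so that c/RT is dimensionless); none of these come from the original text. Since K(G) cancels in the ratio of two indices, only the structure counts of the two residual graphs are needed.

```python
import math

R = 8.314510  # ideal gas constant in J K^-1 mol^-1, as quoted in the text

def rate_ratio(sc_r, sc_s, c, temperature):
    """k_r/k_s from eqs. (77) and (78).

    sc_r and sc_s are the structure counts SC(G-r) and SC(G-s);
    c is the constant of eq. (68) in J/mol (illustrative value only).
    """
    # I_r / I_s = SC(G-s) / SC(G-r), because K(G) cancels in eq. (72).
    delta_l = c * math.log(sc_s / sc_r)               # eq. (77)
    return math.exp(-delta_l / (R * temperature))     # eq. (78)

c_assumed = 5.0e3  # J/mol, hypothetical
p = c_assumed / (R * 298.15)                          # eq. (80)
print("p =", p)
print("k_r/k_s for SC(G-r) = 4, SC(G-s) = 1:", rate_ratio(4.0, 1.0, c_assumed, 298.15))
```

With a positive c the position with the larger structure count is predicted to react faster, and equal structure counts give a rate ratio of one.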
Since,

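$$\frac{I_s}{I_r} = \frac{SC(G-r)}{SC(G-s)} \qquad (81)$$

or,

$$\frac{I_s}{I_r} = \left[\frac{a_{N-1}(G-r)}{a_{N-1}(G-s)}\right]^{1/2} \qquad (82)$$

it follows that the position with the larger SC value (or $a_{N-1}$ coefficient) will be more
reactive.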
Therefore, the topological rule for predicting the substituent orientation is as follows:
The more reactive position towards aromatic substitution is the one which has a larger
(algebraic) structure count of resonance forms in the corresponding transition state. Thus
in the case of the example in Figure 22 the reactivity order of positions on anthracene is
as follows: peri (SC = 4) > α (SC = 3) > β (SC = 1). This order is in accord with
experimental evidence.60

9. Concluding remarks

In this review we have discussed the interplay between graph theory and molecular
orbital theory in the area of chemical reactivity. After pointing out that graph spectral
theory and Hückel molecular orbital theory are isomorphic theories and discussing the
structure of the Hückel spectrum, we presented several selected topics in which the
two theories clearly enrich each other. These topics were the
TEMO principle (which allows, amongst other things, reactivity predictions for a
special class of topomers), the graph-theoretical estimation of the HOMO-LUMO
separation and absolute hardness of AHs, the rule of topological charge stabilization
(which can serve as a unifying principle of chemical systematics) and finally the
graph-theoretical formulation of the localization energy. Many other topics were
left out in order to keep the size of the article within the agreed limits. Thus, we could
not include all reactivity indices that have been developed and analyzed in terms of
graph-theoretical invariants.131-133 Similarly, we could not consider some interesting
classes of molecules such as Möbius molecules,134,135 fractal benzenoids136,137 and
fullerenes.138 However, we are confident that our goal was achieved, that is, to show
how the interplay between graph theory and molecular orbital theory shapes a
theoretical framework which is simple enough to be used by experimental chemists and
which is reliable to a great extent for qualitative predictions. We hope there is still some
room left for simple qualitative models, because in chemistry, due to the complexity of
its problems and enormous combinatorial possibilities, we need, besides exact and
semi-exact computational models, simple concepts which may be used to classify an observed
or a computed result into a coherent system.

Acknowledgements

This work was supported by the Ministry of Science, Technology and Informatics of the
Republic of Croatia via Grants 1-07-159 and 1-07-185. One of us (A.G.) would like to
thank Professors X.G. Viennot and M. Delest (Bordeaux) for their support and
hospitality during his stay at the LaBRI.
We thank Dr Sonja Nikolic (Zagreb) for useful comments.
References

1. Dewar, M.J.S. (1969) The Molecular Orbital Theory of Organic Chemistry,
McGraw-Hill, New York.
2. Borden, W.T. (1975) Modern Molecular Orbital Theory for Organic Chemists,
Prentice-Hall, Englewood Cliffs, New Jersey.
3. Trinajstic, N. (1983) Chemical Graph Theory, CRC Press, Boca Raton, FL, Vol.
I, Vol. II.
4. Gutman, I. and Polansky, O.E. (1986) Mathematical Concepts in Organic
Chemistry, Springer, Berlin.
5. Trinajstic, N. (1992) Chemical Graph Theory, 2nd revised ed., CRC Press, Boca
Raton, FL.
6. Günthard, H.H. and Primas, H. (1956) Helv. Chim. Acta 39, 1645.
7. Gutman, I. and Trinajstic, N. (1973) Topics Curr. Chem. 42, 49.
8. Coulson, C.A., O'Leary, B. and Mallion, R.B. (1978) Hückel Theory for Organic
Chemists, Academic Press, London.
9. Trinajstic, N. (1991) in D.H. Rouvray and D. Bonchev (eds.), Chemical Graph
Theory: Introduction and Fundamentals, Gordon & Breach/Abacus Press, New
York, p. 235.
10. Dias, J.R. (1993) Molecular Orbital Calculations Using Chemical Graph Theory,
Springer, Berlin.
11. Balasubramanian, K. (1991) J. Math. Chem. 7, 353.
12. Haymet, A.D.J. (1986) J. Amer. Chem. Soc. 108, 319.
13. Stone, A.J. and Wales, D.J. (1986) Chem. Phys. Lett. 128, 501.
14. Burdett, J.K. (1987) Struct. Bonding (Berlin) 65, 29.
15. Wang, Y., George, T.F., Lindsay, D.M. and Beri, A.C. (1987) J. Chem. Phys. 86,
3493.
16. Mestechkin, M.N. and Poltavets, V.N. (1988) J. Struct. Chem. 29, 461.
17. Dias, J.R. (1989) J. Chem. Educ. 66, 1012.
18. Klein, D.J. (1990) Reports Mol. Theory 1, 91.
19. Brendsdal, E., Cyvin, S.J., Cyvin, B.N., Brunvoll, J., Klein, D.J. and Seitz, W.A.
(1991) in I. Hargittai (ed.), Quasicrystals, Networks and Molecules with Fivefold
Symmetry, VCH Publishers, New York, p. 257.
20. Manolopoulos, D.E. (1991) J. Chem. Soc. Faraday Trans. 87, 2861.
21. Manolopoulos, D.E., May, J.C. and Down, S.E. (1991) Chem. Phys. Lett. 181, 105.
22. Harary, F. (1971) Graph Theory, Addison-Wesley, Reading, MA, 2nd printing.
23. Cyvin, S.J. and Gutman, I. (1988) Kekulé Structures in Benzenoid Hydrocarbons,
Springer, Berlin.
24. Cvetkovic, D.M., Doob, M. and Sachs, H. (1980) Spectra of Graphs: Theory and
Applications, Academic Press, New York.
25. Coulson, C.A. (1950) Proc. Cambridge Philos. Soc. 46, 202.
26. Trinajstic, N. (1988) J. Math. Chem. 2, 197.
27. Krivka, P., Jericevic, Z. and Trinajstic, N. (1986) Int. J. Quantum Chem.:
Quantum Chem. Symp. 19, 129.
28. Nikolic, S., Trinajstic, N., Mihalic, Z. and Carter, S. (1991) Chem. Phys. Lett.
179, 21.
29. Hückel, E. (1932) Z. Phys. 60, 204.
30. Streitwieser, Jr., A. (1961) Molecular Orbital Theory for Organic Chemists,
Wiley, New York.
31. Heilbronner, E. and Bock, H. (1976) The HMO Model and Its Applications,
Wiley, London.
32. Bloch, F. (1929) Z. Phys. 52, 555; (1930) ibid. 61, 206.
33. Ruedenberg, K. (1954) J. Chem. Phys. 22, 1878.
34. Ruedenberg, K. (1961) J. Chem. Phys. 34, 1861.
35. Trinajstic, N. (1977) in G.A. Segal (ed.), Semiempirical Methods of Electronic
Structure Calculations, Part A, Techniques, Vol. 7, Plenum Press, New York, p.
1.
36. Cvetkovic, D., Gutman, I. and Trinajstic, N. (1974) Chem. Phys. Lett. 29, 65.
37. Herndon, W.C. (1974) Tetrahedron Lett., 671.
38. Zivkovic, T., Trinajstic, N. and Randic, M. (1975) Mol. Phys. 30, 517.
39. Longuet-Higgins, H.C. (1950) J. Chem. Phys. 18, 265.
40. Clar, E., Kemp, W. and Stewart, D.C. (1958) Tetrahedron 3, 36.
41. Fukui, K. (1970) Topics Curr. Chem. 15, 1.
42. Fukui, K. (1971) Acc. Chem. Res. 4, 57.
43. Fukui, K. (1975) Theory of Orientation and Stereoselection, Springer, Berlin.
44. Fleming, I. (1976) Frontier Orbitals and Organic Chemical Reactions, Wiley,
London.
45. Fukui, K. (1982) Science 218, 747.
46. Fukui, K. (1982) Angew. Chem. Int. Edit. Engl. 21, 801.
47. Polansky, O.E. and Zander, M. (1982) J. Mol. Struct. 84, 361.
48. Polansky, O.E. (1984) J. Mol. Struct. 113, 281.
49. Polansky, O.E. (1986) in N. Trinajstic (ed.), Mathematics and Computational
Concepts in Chemistry, Horwood, Chichester, p. 262.
50. Polansky, O.E. (1989) in A. Graovac (ed.), MATH/CHEM/COMP 1988,
Elsevier, Amsterdam, p. 65.
51. Polansky, O.E. (1990) in Z.B. Maksic (ed.), Theoretical Models of Chemical
Bonding, Part 1: Atomic Hypothesis and the Concept of Molecular Structure,
Springer, Berlin, p. 29.
52. Ref. 4, p. 155.
53. Heilbronner, E. (1953) Helv. Chim. Acta 36, 170.
54. Graovac, A., Gutman, I. and Polansky, O.E. (1984) Monat. Chem. 115, 1.
55. Motoc, I., Silverman, J.N. and Polansky, O.E. (1983) Phys. Rev. A 28, 3673.
56. Motoc, I., Silverman, J.N. and Polansky, O.E. (1984) Chem. Phys. Lett. 103, 285.
57. Motoc, I. and Polansky, O.E. (1984) Z. Naturforsch. 39b, 1053.
58. Motoc, I., Silverman, J.N., Polansky, O.E. and Olbrich, G. (1985) Theoret. Chim.
Acta 67, 63.
59. Clar, E. (1964) Polycyclic Hydrocarbons, Academic, London.
60. Clar, E. and Schmidt, W. (1977) Tetrahedron 33, 2093.
61. Hückel, E. (1932) Z. Physik 76, 628.
62. Moffitt, W. (1950) Proc. Roy. Soc. (London) A 200, 414.
63. Walsh, A.D. (1953) J. Chem. Soc., 2260.
64. Walsh, A.D. (1953) J. Chem. Soc., 2265; (1953) ibid. 2288.
65. Fukui, K., Yonezawa, T. and Shingu, H. (1952) J. Chem. Phys. 20, 722.
66. Fukui, K., Yonezawa, T., Nagata, C. and Shingu, H. (1954) J. Chem. Phys. 22,
1433.
67. Dewar, M.J.S. (1952) J. Amer. Chem. Soc. 74, 3341; (1952) ibid. 74, 3345;
(1952) ibid. 74, 3350; (1952) ibid. 74, 3353; (1952) ibid. 74, 3357.
68. Woodward, R.B. and Hoffmann, R. (1971) The Conservation of Orbital
Symmetry, VCH, Weinheim.
69. Coulson, C.A. and Rushbrooke, G.S. (1940) Proc. Cambridge Philos. Soc. 36,
193. A nice account of the discovery of the Pairing Theorem is given by
Mallion, R.B. and Rouvray, D.H. (1990) in J. Math. Chem. 5, 1.
70. Ruedenberg, K. and Scherr, C.W. (1953) J. Chem. Phys. 21, 1565.
71. Ruedenberg, K. (1954) J. Chem. Phys. 22, 1878.
72. Gutman, I., Knop, J.V. and Trinajstic, N. (1974) Z. Naturforsch. 29b, 80.
73. Gutman, I. (1979) Z. Naturforsch. 35a, 458.
74. Bonchev, D., Mekenyan, O. and Trinajstic, N. (1980) Int. J. Quantum Chem.
17, 845.
75. Kiang, Y.-s. and Chen, E.-t. (1983) Pure Appl. Chem. 55, 283.
76. Hall, G.G. (1977) Mol. Phys. 33, 551.
77. Gutman, I. and Rouvray, D.H. (1979) Chem. Phys. Lett. 62, 384.
78. Graovac, A. and Gutman, I. (1980) Croat. Chem. Acta 53, 45.
79. Coulson, C.A. and Streitwieser, Jr., A. (1965) Dictionary of π-Electron
Calculations, Freeman, San Francisco.
80. Graovac, A., Gutman, I., Trinajstic, N. and Zivkovic, T. (1972) Theoret. Chim.
Acta 26, 67.
81. Wilcox, Jr., C.F. (1968) Tetrahedron Lett., 795.
82. Wilcox, Jr., C.F. (1969) J. Amer. Chem. Soc. 91, 2732.
83. Klein, D.J., Schmalz, T.G., El-Basil, S., Randic, M. and Trinajstic, N. (1988) J.
Mol. Struct. (Theochem) 179, 99.
84. Dias, J.R. (1990) J. Mol. Struct. (Theochem) 206, 1.
85. John, P. (1991) J. Mol. Struct. (Theochem) 231, 379.
86. Trinajstic, N., Nikolic, S., Knop, J.V., Müller, W.R. and Szymanski, K. (1991)
Computational Chemical Graph Theory: Characterization, Enumeration and
Generation of Chemical Structures by Computer Methods, Simon & Schuster,
New York.
87. Parr, R.G. and Pearson, R.G. (1983) J. Amer. Chem. Soc. 105, 7512.
88. Pearson, R.G. (1986) Proc. Natl. Acad. Sci. USA 83, 8440.
89. Zhou, Z., Parr, R.G. and Garst, J.F. (1988) Tetrahedron Lett., 4843.
90. Zhou, Z. and Parr, R.G. (1989) J. Amer. Chem. Soc. 111, 7371.
91. Zhou, Z. and Navangul, H.V. (1990) J. Phys. Org. Chem. 3, 784.
92. Pearson, R.G. (1987) J. Chem. Educ. 64, 561.
93. e.g., Amic, D. and Trinajstic, N. (1991) J. Chem. Soc. Perkin Trans. II, 891.
94. Gimarc, B.M. (1983) J. Amer. Chem. Soc. 105, 1979.
95. Longuet-Higgins, H.C., Rector, C.W. and Platt, J.R. (1950) J. Chem. Phys. 18,
1174.
96. Gimarc, B.M. and Joseph, J.J. (1984) Angew. Chem. Int. Edit. Engl. 23, 506.
97. Gimarc, B.M. and Ott, J.J. (1986) in N. Trinajstic (ed.), Mathematics and
Computational Concepts in Chemistry, Horwood, Chichester, p. 74.
98. Gimarc, B.M. and Ott, J.J. (1986) J. Amer. Chem. Soc. 108, 4298.
99. Gimarc, B.M. and Ott, J.J. (1986) J. Amer. Chem. Soc. 108, 4303.
100. Ott, J.J. and Gimarc, B.M. (1986) J. Comput. Chem. 7, 673.
101. Gimarc, B.M. and Ott, J.J. (1986) Inorg. Chem. 25, 83; (1986) ibid. 25, 2708;
(1989) ibid. 28, 2560.
102. Gimarc, B.M. and Ott, J.J. (1987) J. Amer. Chem. Soc. 109, 1388; (1990) ibid.
112, 2597.
103. Gimarc, B.M. and Ott, J.J. (1991) Croat. Chem. Acta 64, 493.
104. Aihara, J.-i. (1988) Bull. Chem. Soc. Japan 61, 2309.
105. Katz, T.J. and Rosenberger, M. (1962) J. Amer. Chem. Soc. 84, 865.
106. Ghaisas, V.V. and Tilak, B.D. (1965) Proc. Indian Acad. Sci. 39 A, 14.
107. Gronowitz, S., Ruden, U. and Gestblom, B. (1963) Arkiv Kemi 20, 297.
108. Wynberg, H. and Zwanenburg, D.J. (1967) Tetrahedron Lett., 761.
109. Litvinov, V.P. and Gold'farb, Y.L. (1976) Adv. Heterocycl. Chem. 19, 123.
110. Hoffmann, R. (1963) J. Chem. Phys. 39, 1397.
111. Gimarc, B.M. private communication.
112. Rein, R., Fukuda, N., Win, H., Clarke, G.A. and Harris, F.E. (1966) J. Chem.
Phys. 45, 4743.
113. von Schnering, H.G. and Menge, G. (1981) Z. Anorg. Allg. Chem. 481, 33.
114. Hassel, O. and Viervoll, H. (1947) Acta Chem. Scand. 1, 149.
115. Akisin, P.A., Rambidi, N.G. and Ezov, S.Y. (1960) Zh. Neorg. Khim. 5, 747.
116. Whitfield, J. (1970) J. Chem. Soc. A, 1800.
117. Blachnik, R. and Wickel, U. (1983) Angew. Chem. Int. Edit. Engl. 22, 317.
118. Livingstone, R.L. and Rao, C.N.R. (1959) J. Amer. Chem. Soc. 81, 285.
119. Sommer, A., White, D., Levinsky, M.J. and Mann, D.E. (1963) J. Chem. Phys.
38, 47.
120. Pierce, L., Nelson, R. and Thomas, C. (1965) J. Chem. Phys. 43, 3423.
121. Brown, R.D. (1952) Quart. Rev. 6, 63.
122. Wheland, G.W. (1942) J. Phys. Chem. 64, 900.
123. Ref. 8, p. 123.
124. Murrell, J.N. and Harget, A.J. (1972) SCF MO Theory of Molecules,
Wiley-Interscience, London, p. 77.
125. e.g., Zhou, Z. and Parr, R.G. (1990) J. Amer. Chem. Soc. 112, 5720.
126. Gutman, I., Trinajstic, N. and Wilcox, Jr., C.F. (1975) Tetrahedron 31, 143.
127. Wilcox, Jr., C.F., Gutman, I. and Trinajstic, N. (1975) Tetrahedron 31, 147.
128. Dewar, M.J.S. and Sampson, R.J. (1956) J. Chem. Soc., 2789.
129. Dewar, M.J.S., Mole, T. and Warford, E.W.T. (1956) J. Chem. Soc., 3581.
130. Leffler, J.E. and Grunwald, E. (1963) Rates and Equilibria of Organic Reactions,
Wiley, New York.
131. Biermann, D. and Schmidt, W. (1980) Israel J. Chem. 20, 312.
132. Szentpaly, L.v. and Herndon, W.C. (1984) Croat. Chem. Acta 57, 1621.
133. Zander, M. (1990) Topics Curr. Chem. 153, 101.
134. Graovac, A. and Trinajstic, N. (1975) Croat. Chem. Acta 47, 95.
135. Graovac, A. and Trinajstic, N. (1976) J. Mol. Struct. 30, 416.
136. Klein, D.J., Cravey, M.J. and Hite, G.E. (1991) Polycyclic Aromatic Compounds
2, 163.
137. Plavsic, D., Trinajstic, N. and Klein, D.J. (1992) Croat. Chem. Acta 65, 279.
138. Kroto, H.W. (1992) Angew. Chem. Int. Edit. Engl. 31, 111.
TOPOLOGICAL CONTROL OF MOLECULAR ORBITAL
THEORY: A COMPARISON OF μ2-SCALED HÜCKEL
THEORY AND RESTRICTED HARTREE-FOCK THEORY
FOR BORANES AND CARBORANES

ROGER ROUSSEAU AND STEPHEN LEE


University of Michigan
Willard H. Dow Chemistry Laboratory
Ann Arbor, MI 48109-1055

1. Introduction

Forty years ago E. Wigner and F. Seitz wrote on electronic structure
calculations for metals1: "If one had a great calculating machine, one might
apply it to the problem of solving the Schrödinger equation for each metal
and obtain thereby the interesting physical quantities, such as the cohesive
energy, the lattice constant, and similar parameters. It is not clear, however,
that a great deal would be gained by this. Presumably the results would
agree with the experimentally determined quantities and nothing vastly new
would be learned from the calculation. It would be preferable instead to
have a vivid picture of the behavior of the wave functions, a simple
description of the essence of the factors which determine cohesion and an
understanding of the origins of variation in properties from metal to metal."
In the intervening years electronic structure calculations for both
metals and molecules have fundamentally changed. Today, great calculating
machines do exist and to an increasing extent the Schrödinger equation can
be solved with sufficient accuracy to resolve questions about the structure
and reactivity of molecules.2 At the same time the qualitative pictures used
to relate electronic structure to geometrical structure have not progressed at a
similar pace. If anything, the overall pictures have grown less rather than
more vivid. Indeed, some of our simplest concepts3 such as Hückel theory,
the overlap of valence atomic orbitals and the isolobal analogy appear to
some almost tarnished.
In this paper we examine the degree to which simple concepts such
as the use of minimal basis sets, the importance of the two-center
one-electron integral and the Wolfsberg-Helmholz approximation of orbital
interactions3b speak to the true energetics of the boranes, a family of
molecules with prototypic covalent bonds. Throughout this paper we will
directly compare the energies of a minimal basis set modified Hückel theory
D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 73-109.
© 1994 Kluwer Academic Publishers.
to large basis set (6-31G*) ab-initio Hartree-Fock calculations for clusters
such as B8H8^2-, B9H9^2- and B10H10^2-. The electronic structure of these and
related boranes and carboranes forms an active field of research. In the last
year there have been at least three ab-initio calculations on the B8H8^2- cluster
alone.4
In this paper, we use a form of Hückel theory which includes a
parameterless pairwise repulsive potential. This modification, which we call
second moment scaling, has recently proven successful in both rationalizing
and optimizing the structures of molecules and extended solids. The method
is based on earlier work of D. Pettifor and R. Podloucky and of J. K.
Burdett and the authors of this paper.5 Recently, we have found the method
useful in studying covalent or metallic (but not ionic) compounds where the
valence orbitals are fairly tightly bound (i.e., late transition metal and main
group atoms).6 We have used second moment scaled Hamiltonians to find
reasonable energy minima for the 58 atom unit celled α-Mn structure, the
highly anisotropic gallium structure, the icosahedral packing of elemental
boron and the c/a ratio of hexagonal closest packed metal alloys. Molecular
examples of geometrical optimizations include organometallic clusters such
as Os5(CO)16, Ir4(CO)12, [Re4(CO)16]2-, anti-Wade's rule compounds such
as C4B4H8 and C4B8H8R4 (R = alkyl group) and organic compounds such as
naphthalene, spiropentane, pentalene and C60. In a similar vein we have
rationalized the Hume-Rothery electron concentration rules for main group
and transition metal alloys, Wade's rules for clusters, the VSEPR and the
octet rule. The difference between the current work and these earlier projects
is our interest here, not just in the global energy minimum, but in the actual
shape of the electronic energy surface as a function of geometry. A
knowledge of such surfaces is of course necessary to study vibrations and
reactivity in general.

2. Calculational Method

Our theory is based on the Hückel or tight-binding method in which
a parameterless linear pairwise repulsive energy is added to the sum of
energies of filled molecular orbitals, $E_{\text{Hü}}$. The inclusion of such a repulsive
energy in tight-binding Hamiltonians was suggested a number of years
ago.7 Recently W.M.C. Foulkes and R. Haydock have derived the
existence of such an expression from first principles density functional
theory.8 In our work we note that a pairwise additive repulsive energy
must be proportional to the coordination number of the atoms in the system,
C. This idea that the repulsive energy is proportional to coordination
number is an old one, an early application being in the Born model of ionic
bonding.9 It has recently been used with good effect in metallic and
covalent systems.10 In particular V. Heine et al. have shown that it gives
reasonable agreement with pseudo-potential ab-initio calculations on
elemental aluminum in several different hypothetical geometries. As has
been shown by Friedel and Cyrot-Lackmann11 we can relate C to the
Hückel molecular orbital energies as:

$$C = \frac{\gamma}{N} \sum_{i=1}^{N} (E_i - E_0)^2$$

where $E_i$ is the energy of the ith molecular orbital, N is the total number of molecular
orbitals, $E_0$ is the average energy of the isolated atomic orbitals and γ is a
proportionality constant. We therefore find that the total energy, $E_T$, is:

$$E_T(r) = \gamma \sum_{i=1}^{N} (E_i(r) - E_0)^2 + 2 \sum_{i=1}^{M} E_i(r)$$

where we note that $E_i$ is a function, among other things, of the overall size of
the system, r, and where M is the index of the highest occupied molecular
orbital (HOMO). The first term on the right hand side of the equation is the
repulsive energy, U(r), while the second is the attractive energy, -V(r).
We now follow the argument first discussed by D. G. Pettifor.12
We consider two systems which we label 1 and 2. The terms $E_{T1}$, $U_1$, $V_1$,
$E_{T2}$, $U_2$ and $V_2$ refer to the various energies of these two systems. We
wish to calculate ΔE, where $\Delta E = E_{T1} - E_{T2}$. It may be seen that,

$$\Delta E = U_1(r_{1eq}) - V_1(r_{1eq}) - U_2(r_{2eq}) + V_2(r_{2eq})$$

where $r_{1eq}$ and $r_{2eq}$ refer to the respective equilibrium sizes of the two
systems.
We use the fact that we are interested in equilibrium geometries in
the following way. Note that at equilibrium, to first order in distance, $E_{Ti}(r)$
is constant. Therefore,

$$U_2(r_{2eq}) - V_2(r_{2eq}) \approx U_2(r_{2eq} + d) - V_2(r_{2eq} + d) \qquad (1)$$

In particular we choose a value for d such that $U_2(r_{2eq} + d) = U_1(r_{1eq})$.
We now find that:

$$\Delta E = 2 \sum_{i=1}^{M} E_{1i}(r_{1eq}) - 2 \sum_{i=1}^{M} E_{2i}(r_{2eq} + d) \qquad (2)$$

We determine the value of $r_{1eq}$ from the true experimental size of system 1
and the value d from the equality:

$$\sum_{i=1}^{N} (E_{1i}(r_{1eq}) - E_0)^2 = \sum_{i=1}^{N} (E_{2i}(r_{2eq} + d) - E_0)^2 \qquad (3)$$

We note that the above equates the variance of the molecular orbital energies
in systems 1 and 2. In Hückel theory the mean of the molecular orbital energies is a
constant with respect to changes in geometry. Equation (3) can therefore be
simplified to

$$\sum_{i=1}^{N} E_{1i}^2(r_{1eq}) = \sum_{i=1}^{N} E_{2i}^2(r_{2eq} + d) \qquad (4)$$

We note that the expressions to the left and right of the equal sign are both
the second moments of the molecular orbital energies, $\mu_2$. In particular
equations 2 - 4 state that the differences in energy between two structural
alternatives can be calculated from knowledge of the molecular orbital
energies alone. It should however be noted that the approximation given in
equation (1) breaks down if the deviation in the values of $\mu_2$ becomes
significant.
To calculate these molecular orbital energies we use a minimal
valence basis set. The Hamiltonian diagonal elements equal the energies of
the isolated atomic orbitals while off-diagonal elements are calculated using
the Wolfsberg-Helmholz approximation,13

$$H_{ij} = \frac{K}{2} S_{ij} (H_{ii} + H_{jj})$$

where K is a constant traditionally set at 1.75 and $S_{ij}$ is the overlap integral
between the ith and jth atomic orbitals. The values for the diagonal $H_{ii}$
terms are taken from work of the R. Hoffmann group for extended Hückel
calculations.14 The $S_{ij}$ integrals are based on Slater type orbitals (STO)
with single or double zeta expansions. Again the values of the STO
exponents, ζ, are taken in conformity with values used in extended Hückel
theory. It is important to note that the standard literature values for
extended Hückel parameters are quite close to those of Hartree-Fock calculations.15
For example the extended Hückel parameters for boron are $H_{ii}$(2s) = -15.2
eV, $H_{ii}$(2p) = -8.5 eV, ζ(2s) = 1.3 and ζ(2p) = 1.3. These may be compared with
the atomic Hartree-Fock parameters, which are $H_{ii}$(2s) = -13.46 eV, $H_{ii}$(2p)
= -8.43 eV, ζ(2s) = 1.288 and ζ(2p) = 1.211. Except in the case of
contracted valence orbitals (such as the d orbitals in Zn or the s orbitals in
Tl) we do not adjust Hückel or tight-binding parameters to improve our fit
to experiment. In particular, in the current work on boranes and carboranes
we have made no alteration to the literature parameters for boron, carbon or
hydrogen.
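The Wolfsberg-Helmholz construction described above is simple enough to sketch in a few lines. The example below assumes Python with NumPy and a hypothetical function name; the diagonal energies are the boron 2s and 2p values quoted in the text, whereas the overlap value 0.3 is purely illustrative and is not a computed STO overlap.

```python
import numpy as np

def wolfsberg_helmholz(h_diag, overlap, k=1.75):
    """Hamiltonian with H_ij = (K/2) S_ij (H_ii + H_jj) off the diagonal."""
    h_diag = np.asarray(h_diag, dtype=float)
    overlap = np.asarray(overlap, dtype=float)
    h = 0.5 * k * overlap * (h_diag[:, None] + h_diag[None, :])
    np.fill_diagonal(h, h_diag)   # diagonal: isolated atomic orbital energies
    return h

# Boron H_ii(2s) and H_ii(2p) in eV, as quoted in the text.
h_diag = [-15.2, -8.5]
# Overlap matrix with an assumed, illustrative off-diagonal overlap of 0.3.
overlap = [[1.0, 0.3],
           [0.3, 1.0]]

h = wolfsberg_helmholz(h_diag, overlap)
print(h)
```

Because the off-diagonal element scales linearly with the overlap, all of the geometry dependence of such a Hamiltonian enters through $S_{ij}$.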
In practice the second moment scaled calculations reduce to the
following. When comparing two structural alternatives we calculate the
molecular orbital energies of one of the structures at its true equilibrium
size. For the second structure we scale its size so that its second moment
exactly equals the second moment of the first. We then fill both molecular
orbital diagrams with the requisite number of electrons and then calculate
the difference in total electronic energies. We note that the constant γ
remains undetermined in this procedure. We therefore study only the
structural shape and not the overall size of the geometries in question. The
chief advantage of this method of calculation is that it allows one to retain
all the insights garnered from simple molecular orbital theory. Important
concepts include the overlap of valence atomic orbitals, the HOMO-LUMO
gap energy, the use of frontier orbitals and finally the utility of
minimal valence basis sets in determining molecular orbital energetics.
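The procedure just described can be sketched for a toy model. The example below assumes Python with NumPy, simple nearest-neighbour Hückel Hamiltonians with $E_0 = 0$, and, as a strong simplification, that a uniform change in the size of a structure rescales every hopping integral by the same factor; the function names and the choice of a four-membered ring versus a four-atom chain are illustrative only.

```python
import numpy as np

def huckel_hamiltonian(n, edges, beta=-1.0):
    """Nearest-neighbour Hückel Hamiltonian with zero diagonal (E_0 = 0)."""
    h = np.zeros((n, n))
    for i, j in edges:
        h[i, j] = h[j, i] = beta
    return h

def second_moment(h):
    """mu_2 = (1/N) sum_i E_i^2, evaluated as Tr(H^2)/N."""
    return float(np.trace(h @ h)) / h.shape[0]

def electronic_energy(h, n_electrons):
    """Twice the sum of the lowest n_electrons/2 orbital energies (even count assumed)."""
    e = np.sort(np.linalg.eigvalsh(h))
    return 2.0 * float(e[: n_electrons // 2].sum())

def scaled_energy_difference(h1, h2, n_electrons):
    """Delta E of eq. (2) after scaling structure 2 to the second moment of structure 1."""
    scale = np.sqrt(second_moment(h1) / second_moment(h2))
    h2_scaled = scale * h2        # stands in for the size change r2eq -> r2eq + d
    delta_e = electronic_energy(h1, n_electrons) - electronic_energy(h2_scaled, n_electrons)
    return delta_e, h2_scaled

ring = huckel_hamiltonian(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
chain = huckel_hamiltonian(4, [(0, 1), (1, 2), (2, 3)])

delta_e, chain_scaled = scaled_energy_difference(ring, chain, 4)
print("mu_2:", second_moment(ring), second_moment(chain_scaled))
print("Delta E (ring minus chain, units of |beta|):", delta_e)
```

A positive ΔE in this convention means that the first structure lies higher in energy than the second at the chosen electron count; the constant γ never enters, in line with the remark above that it remains undetermined.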

3. The Method of Moments

One of the chief strengths of this method comes from its
connection with the method of moments.5,11,16 This moment method is
based on the following observations. First, knowledge of the $\mu_n$, where

$$\mu_n = \int_{-\infty}^{\infty} E^n \rho(E,r)\, dE,$$

can be used to exactly determine the density of states function $\rho(E,r)$.
The most advantageous transform technique uses a continued fraction
expansion.16 Second, the $\mu_n$ may be related to specific structural features, as
$\mu_n$ is the sum of all closed paths of n steps in which one hops from one
valence atomic orbital to the next. Third, the earliest $\mu_n$, i.e., $\mu_0$, $\mu_1$ and $\mu_2$, are
all structure invariants: $\mu_0$ is normalized to equal 1, $\mu_1$ is just Tr(H) and is
therefore a constant sum of the Hartree-Fock atomic orbital energies, and
finally $\mu_2$ is treated as a constant in our variance scaling method (see eq. (2)).
Finally we note that while it is necessary to know all the $\mu_n$ to determine
$\rho(E,r)$ exactly, it is only the first few moments which control the principal
features of the attractive energy, V(r). As we discuss below, knowledge of
$\mu_3$ through $\mu_6$ is often sufficient in calculating energy differences between
structures. This is particularly true if one uses the continued fraction
expansion in conjunction with the upper and lower energy limits of $\rho(E,r)$
(which we call respectively $E_u$ and $E_l$). This use of $E_u$ and $E_l$ can be
important. The reason is that the higher moments are increasingly dominated
by these two values. In the absence of exact knowledge of these higher
moments, $E_u$ and $E_l$ have a significant role. (It should be noted that $E_u$ and
$E_l$ are also related to local structural features; $E_l$ depends on the coordination
number, C, and $E_u + E_l$ depends on the degree of non-alternancy).17
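The link between moments and closed paths can be made concrete with a small numerical check. The sketch below assumes Python with NumPy, hypothetical function names, and a toy Hamiltonian equal to the adjacency matrix of a graph (unit hopping integrals, zero diagonal), so that the nth moment per orbital counts closed walks of n steps; it also confirms that the same numbers follow from the eigenvalue spectrum.

```python
import numpy as np

def adjacency(n, edges):
    """Toy Hamiltonian: unit hopping integral on every edge, zero diagonal."""
    a = np.zeros((n, n))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a

def moments_from_walks(h, n_max):
    """mu_n = Tr(H^n)/N: closed n-step paths averaged over the orbitals."""
    n_orb = h.shape[0]
    return [float(np.trace(np.linalg.matrix_power(h, n))) / n_orb
            for n in range(n_max + 1)]

def moments_from_spectrum(h, n_max):
    """The same moments as averages of E_i^n over the orbital energies."""
    e = np.linalg.eigvalsh(h)
    return [float(np.mean(e ** n)) for n in range(n_max + 1)]

triangle = adjacency(3, [(0, 1), (1, 2), (2, 0)])
square = adjacency(4, [(0, 1), (1, 2), (2, 3), (3, 0)])

print("triangle:", moments_from_walks(triangle, 4))
print("square:  ", moments_from_walks(square, 4))
```

The three-membered ring has a non-zero third moment because it contains closed paths of three steps, whereas the four-membered ring, being alternant, has none; with a negative hopping integral the sign of the odd moments is reversed.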
As an example of this method we consider band calculations for the
fourth row of the main group. In particular we consider the elemental
structures of Cu, Zn, Ga, Ge, As and Se (elements 29-34 of the periodic
table).18 Copper and zinc are respectively face centered cubic (fcc) and
hexagonally closest packed (hcp), gallium adopts an unusual seven
coordinate structure, germanium forms in a diamond lattice, arsenic forms a
three coordinate two-dimensional puckered honey-combed sheet, while
selenium adopts an infinite one dimensional helix.19 We therefore need to
compare $\rho(E,r)$ for each of these six structure types.
For meaningful comparisons we need to calculate the electronic
energies of each of these six structures for the same atom type. The
Hartree-Fock energies of the valence 4s orbital range from α(4s) = -6.5 to -22.9 eV
while the 4p orbital energies range from α(4p) = -5.7 to -12.4 eV.17 The ζ(4s)
exponents of the STOs range from ζ(4s) = 1.21 for Cu to 2.4 for Se and
similarly the ζ(4p) exponent ranges from ζ(4p) = 1.6 to 2.1. With this great
disparity in parameters, it would at first appear necessary to calculate 36
separate band structures, since for each of the six atom types we would need
to compare all six of the possible structural alternatives. In practice this
turns out not to be necessary, because the differences in energy between
structures are reasonably insensitive to changes in the Hartree-Fock
parameters. In Figure 1 we show the difference in energy between these six
structures as a function of electron filling of the valence bands for a single
set of atomic parameters. The values chosen were α(4s) = -16.0 eV,
α(4p) = -9.0 eV, ζ(4s) = 2.16 and ζ(4p) = 1.85. These are the extended Hückel
parameters for Ge developed by Thorn and Hoffmann.14 Germanium was chosen as
it lies in the middle of the sequence of the six elements and therefore has
average parameters with respect to the full series. The differences in energy
plotted in Figure 1a are between the structure of the labelled element and the
diamond structure of germanium. Figure 1 is plotted with the convention that
the curve with the most positive value at a given electron count corresponds to
the most stable structure. For example, the Ga structure is the most stable for
a combined s and p band filling of 0.3 to 0.4. The results in Figure 1a match
the elemental periodic trends exactly; each element is calculated as being most
stable in its observed structure. What perhaps was not clear from our earlier
work is that the method of moments can be used to quantitatively account for
these results. In Figure 1b we use only the values of μ₃ - μ₆ together with
Eu and El to calculate an approximate set of ΔE functions.16 It may be seen
that Figure 1b corresponds with the results of Figure 1a in both the energy
scale and the shape of the five functions. As our principal inputs are the
values of μ₃ - μ₆ we can furthermore trace the provenance of a given
structure's stability. In this respect it is useful to recall the effect of each
individual moment.16 These effects are summarized in Figure 2. A large
negative μ₃ stabilizes electron band fillings below 0.5, a large μ₄ stabilizes
nearly filled band systems and a large μ₆ stabilizes the half-filled band.20 We
recall that the μₙ are the sums of all closed paths of n steps in which, via
the hopping integral, one hops from one orbital to the next. Therefore, the μ₃,
μ₄ and μ₆ values correspond in part to the number of atoms bonded in
respectively triangular, square and hexagonal arrangements. (As we will
discuss later, other structural effects change the various moments. In
particular the number of bonds and the bond angles alter μ₄.16) We therefore
conclude that the large number of triangles in the fcc and hcp structures
stabilizes these structures at low electron counts, the slightly smaller number
of triangles stabilizes the gallium system, the myriad of hexagons stabilizes
the diamond structure and so forth.
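
The closed-path picture of the moments can be made concrete with a small
sketch. It is only an illustration under simplifying assumptions that are not
part of the calculations described here: one orbital per atom, a single
nearest-neighbour hopping integral beta (set to -1 in arbitrary units), and
isolated rings rather than crystals. The names closed_walk_moment and ring are
hypothetical helpers introduced for this example.

```python
def ring(size):
    """Adjacency matrix of an isolated ring of `size` bonded atoms."""
    adjacency = [[0] * size for _ in range(size)]
    for i in range(size):
        j = (i + 1) % size
        adjacency[i][j] = adjacency[j][i] = 1
    return adjacency


def closed_walk_moment(adjacency, n, beta=-1):
    """n-th moment as the trace of (beta * A)**n.

    Each diagonal element of the matrix power sums the products of hopping
    integrals over all closed n-step paths that start and end on one atom.
    """
    size = len(adjacency)
    hop = [[beta * adjacency[i][j] for j in range(size)] for i in range(size)]
    # Start from the identity matrix and multiply by the hopping matrix n times.
    power = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        power = [
            [sum(power[i][k] * hop[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)
        ]
    return sum(power[i][i] for i in range(size))


triangle, square, hexagon = ring(3), ring(4), ring(6)
print(closed_walk_moment(triangle, 3))  # -6
print(closed_walk_moment(square, 3))    # 0
print(closed_walk_moment(square, 4))    # 32
print(closed_walk_moment(hexagon, 6))   # 132
```

Of the three rings only the triangle has a nonzero third moment, and it is
negative for a negative hopping integral. The even moments also contain
back-and-forth paths along single bonds, which is why ring counts account for
only part of μ₄ and μ₆.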

4. Elemental Boron 21

With the exception of sulfur, no element displays greater
polymorphism in its crystal structures than boron.19 Nine polymorphs
TOPOLOGICAL CONTROL OF MOLECULAR ORBITAL THEORY 79

[Figure 1: two panels, (a) "All Moments" and (b) "First Six Moments", each
plotting the energy difference against band filling.]

Figure 1. Differences in energy between the structure types of elements 29-34
as a function of fractional s and p band-filling. Energies are reported as the
differences in energy to a fixed reference structure (in this case the diamond
structure of Ge). In (a) we show the results for full band calculations while
in (b) only the values of μ₃ - μ₆, Eu and El were used. See discussion of
Figure 1 in the text for figure conventions.

Figure 2. Differences in energy between structures which have triangles,
hexagons, pentagons or squares in their structure as a function of x, the
fractional band filling. Results are taken from ref. 11. See discussion of
Figure 1 for figure conventions.

Figure 3. Two views of the icosahedron. On the left it is seen down a 2-fold
axis while on the right it is viewed down a 3-fold axis.

have been reported with unit cells ranging from a twelve atom
rhombohedral cell to a 1708 atom cubic cell. The structures of five of these
polymorphs have been fully resolved. In all five, regular icosahedra play a
significant role. Structures derived from icosahedra play an equally
ubiquitous role in molecular boron chemistry.22 We show this icosahedron
from two perspectives in Figure 3.
In our calculations we will concentrate on the simplest of the boron
polymorphs, the R-12 structure, which contains twelve atoms in a
rhombohedral cell. Each unit cell of this structure contains one icosahedron
whose center can be placed at the origin of the cell axes. As boron has three
valence electrons, there are a total of 36 valence electrons per primitive unit
cell; of these 36 electrons, 26 are used in intra-icosahedral bonds and 10 in
inter-icosahedral bonds.
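
The electron bookkeeping above can be checked with a few lines of arithmetic.
The sketch below rests on two assumptions that are ours rather than stated at
this point in the text: that the 26 intra-icosahedral electrons correspond to
the 2n + 2 skeletal electron count of a closed n-vertex cluster (n = 12), and
that the fractional s and p band filling is the number of valence electrons
divided by the capacity of one s and three p orbitals per atom. All variable
names are hypothetical.

```python
ATOMS_PER_CELL = 12           # one icosahedron per R-12 unit cell
VALENCE_ELECTRONS_PER_ATOM = 3
ORBITALS_PER_ATOM = 4         # one s and three p valence orbitals (assumed)

total_electrons = ATOMS_PER_CELL * VALENCE_ELECTRONS_PER_ATOM

# Assumption: skeletal electron count 2n + 2 for a closed n-vertex cluster.
intra_icosahedral = 2 * ATOMS_PER_CELL + 2
inter_icosahedral = total_electrons - intra_icosahedral

# Each spatial orbital holds two electrons.
band_capacity = ATOMS_PER_CELL * ORBITALS_PER_ATOM * 2
band_filling = total_electrons / band_capacity

print(total_electrons, intra_icosahedral, inter_icosahedral)  # 36 26 10
print(band_filling)                                           # 0.375
```

Under these assumptions the split of 36 electrons into 26 and 10 is recovered,
and the same count gives a fractional s and p band filling of 0.375 for
elemental boron.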
There are five crystallographic parameters which control the shape
of the R-12 polymorph. They are the rhombohedral cell angle α and the
atomic x and z fractional coordinates for the two symmetry inequivalent
boron atoms. In Table I we show our optimal values for these parameters
using our variance scaling technique. The six inequivalent bonds in the
R-12 structure are found experimentally to have lengths of 2.021, 1.787,
1.785, 1.777, 1.733 and 1.709 Å. Theoretically (if we assume an optimal
value of a) we find these bond lengths to be respectively 1.90, 1.88, 1.89,
1.79, 1.56 and 1.78 Å. While our calculated bond lengths are roughly in the
correct order in going from longest to shortest lengths, the numerical
agreement is poor. The average error in bond lengths is 0.09 Å.
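
As a quick check of the quoted average error, the bond lengths listed above
can be compared pairwise. This is a minimal sketch that assumes the calculated
values are listed in the same bond order as the experimental ones (as
"respectively" suggests) and that the error is a mean absolute deviation; the
variable names are hypothetical.

```python
# Bond lengths in angstroms, in the order given in the text.
experimental = [2.021, 1.787, 1.785, 1.777, 1.733, 1.709]
calculated = [1.90, 1.88, 1.89, 1.79, 1.56, 1.78]

absolute_errors = [abs(c - e) for c, e in zip(calculated, experimental)]
mean_absolute_error = sum(absolute_errors) / len(absolute_errors)

# Index of the bond with the largest deviation.
worst_bond = max(range(len(absolute_errors)), key=absolute_errors.__getitem__)

print(round(mean_absolute_error, 3))  # 0.096
print(experimental[worst_bond], calculated[worst_bond])  # 1.733 1.56
```

With this pairing the mean absolute deviation comes out just under 0.10 Å,
close to the 0.09 Å quoted above, and the largest single deviation is for the
bond calculated as 1.56 Å.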
It is instructive to compare the R-12 structure with reasonable
crystallographic alternatives in order to elucidate the most significant
structural feature of the R-12 structure. As our primary interest is with the
inter-icosahedral bonds we consider alternative packings of these
icosahedra. In particular, for the sake of numerical simplicity we consider
systems which have exactly one icosahedron per unit cell. Furthermore we
assume that the icosahedra line up in such a way as to preserve some
portion of the point group symmetry of the individual clusters. Of the three
types of rotational axes (five-fold, three-fold and two-fold) only the
three-fold and two-fold axes are compatible with translational crystalline
symmetry. These symmetry axes are found in the trigonal, orthorhombic
and monoclinic crystal classes. We consider here only the higher symmetry
trigonal and orthorhombic lattices. We recall that there are four types of
orthorhombic Bravais lattices (primitive, face-centered, end-centered and
body-centered) and only two types of trigonal Bravais lattices (primitive
and rhombohedral). We therefore need to explore these six different
Bravais lattices, and we optimized elemental boron assuming in turn that its
structure corresponded to each of these six lattice types. In each
case we maintained a perfect icosahedral shape for the individual clusters.
In Figure 4 we compare the differences in energy of these polymorphs. It
may be seen that at an s and p band-filling of 0.375 (which corresponds to
the fractional band filling of elemental boron) the experimentally observed

Table I. Crystal Parameters for the Boron R-12 Structure

Parameter    Experiment    Theory

a            5.057 Å
α            58.06°        56.7°
B(1)x        0.010         0.00
B(1)z        0.657         0.67
B(2)x        0.221         0.22
B(2)z        0.632         0.62

[Figure 4: energy difference curves plotted against fractional band filling;
legible curve labels include prim. trig., C-cent. ortho., rhombohedral and
prim. ortho.]

Figure 4. Differences in energy between the high symmetry Bravais lattices
where there is one icosahedron per unit cell as a function of fractional
band-filling. At the fractional band-filling of boron, 0.375, the rhombohedral
form is preferred. See the discussion of Figure 1 for figure conventions.

rhombohedral form (R-12) is the most stable. At lower electron counts the
primitive trigonal and face centered orthorhombic structures are more stable
while at higher band fillings the C-centered and primitive orthorhombic
cells are energetically preferred.
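
The statement that only the three-fold and two-fold axes of the icosahedron
survive in a crystal follows from the crystallographic restriction: an n-fold
rotation maps a lattice onto itself only if 2cos(2π/n) is an integer. The
sketch below applies this standard criterion; the function name
is_lattice_compatible and the numerical tolerance are hypothetical choices
made for illustration.

```python
import math


def is_lattice_compatible(n, tol=1e-9):
    """True if an n-fold rotation axis can coexist with lattice translations.

    Uses the crystallographic restriction: 2*cos(2*pi/n) must be an integer.
    """
    value = 2.0 * math.cos(2.0 * math.pi / n)
    return abs(value - round(value)) < tol


icosahedral_axes = [5, 3, 2]
compatible_axes = [n for n in icosahedral_axes if is_lattice_compatible(n)]
print(compatible_axes)  # [3, 2]

# The full list of allowed rotation orders up to 12.
allowed_orders = [n for n in range(1, 13) if is_lattice_compatible(n)]
print(allowed_orders)  # [1, 2, 3, 4, 6]
```

The five-fold axis fails the test, which is why the packings considered here
retain at most the three-fold or two-fold symmetry of the individual
icosahedra.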
We now apply the method of moments to determine the specific
structural causes for these energy differences. In particular, we will
consider the primitive trigonal form as an example of a polymorph stable at
low band fillings and the C-centered orthorhombic lattice as an example of a
phase stable at high band fillings. It is instructive to first consider in detail
the pertinent structural features of these phases. In Figure 5 we illustrate
the rhombohedral (R-12), C-centered orthorhombic, and primitive trigonal
polymorphs. In the middle of Figure 5, we portray the rhombohedral
structure viewed down primarily the hexagonal [001] axis. It may be seen
that in the (hexagonal) a - b crystallographic plane the inter-icosahedral
bonds form both triangles and squares of bonded atoms. There are two
such triangles and three such squares per unit cell. Also shown in Figure 5
is a triangle of atoms connected to the regular icosahedra by 1.71 Å bonds.
This triangle represents the base of an icosahedron in the next higher plane
in this structure. These 1.71 Å bonds lie on inter-icosahedral hexagons of
bonded atoms. We note that these 1.71 Å bonds point radially outward
from the icosahedron. On the bottom of Figure 5 we illustrate the primitive
trigonal cell viewed down primarily the [001] axis. It is identical to the
rhombohedral cell within the a - b plane. It differs in the positioning of the
out-of-plane icosahedra, which in the primitive trigonal structure lie directly
above the lower icosahedra. The bases of four of the out-of-plane
icosahedra are shown in Figure 5. It may be seen that the interlayer cavities
are octahedra. As octahedral faces are triangles, these out-of-plane
octahedra increase the μ₃ value for the primitive trigonal lattice
significantly.
On the top of Figure 5 we illustrate the C-centered orthorhombic
structure viewed primarily down the [100] direction. It may be seen that
there are squares involving inter-icosahedral bonds in this structure normal
to both the a and b directions. Bond angles between these inter- and
intra-icosahedral links are therefore as small as 90°. Harder to see are the
hexagons of bonds normal to the c axis. It is interesting to note that the
a - b plane of the icosahedra found in the C-centered orthorhombic structure is
identical to sheets found in the rhombohedral structure.
In Figure 6 we show the differences in energy between these three
structures using only μ₃, μ₄, Eu and El. These curves reproduce many of
the qualitative features of the full band calculations. In particular, it may be
seen that the primitive trigonal structure is stable for low fractional band
fillings while the C-centered orthorhombic structure is stable for high band
fillings. These differences in energy can be explained in terms of local
structural features. The difference in energy between the primitive trigonal
and the rhombohedral geometry is due to the larger μ₃ in the former
geometry. This difference in μ₃ is due to the formation of octahedral

[Figure 5: three panels, from top to bottom: C-centered orthorhombic,
rhombohedral, primitive trigonal.]

Figure 5. The optimal structures of the C-centered orthorhombic,
rhombohedral and primitive trigonal Bravais lattices of single icosahedra.

[Figure 6: panel titled "μ₃ - μ₄ only", plotting the energy difference against
band filling.]

Figure 6. Difference in energy between the primitive trigonal, C-centered
orthorhombic and the rhombohedral Bravais lattices of single icosahedra as a
function of fractional band filling. The results shown here are the continued
fraction expansion using only μ₃, μ₄, Eu and El.

Table II. Comparison of RHF, μ₂-Hückel and X-ray Crystal Structure Bond
Distances (Å) for B₈H₈²⁻, B₉H₉²⁻ and B₁₀H₁₀²⁻

B₈H₈²⁻
Bond    Exp.    μ₂      6-31G*

a       1.56    1.58    1.70
b       1.72    1.62    1.69
c       1.76    1.82    1.83
d       1.93    1.92    1.96

B₉H₉²⁻
Bond    Exp.    μ₂      6-31G*

a       1.71    1.69    1.72
b       1.84    1.80    1.83
c       1.90    1.97    1.98

B₁₀H₁₀²⁻
Bond    Exp.    μ₂      6-31G*

a       1.68    1.65    1.70
b       1.79    1.78    1.83
c       1.82    1.87    1.85

cavities between the individual icosahedral units (octahedra have eight
triangular faces).
The difference in energy between the rhombohedral and C-centered
orthorhombic structures is due to the different fourth moments for the two
structures. The C-centered orthorhombic structure has twice as many
squares of bonded atoms as does the rhombohedral structure. Although
this contributes to the larger μ₄ of the C-centered cell, the principal
difference in μ₄ is caused by the inter-icosahedral bond angles. In
particular, μ₄ is minimized when the inter-icosahedral bonds point radially
outward. The six 1.71 Å bonds in the rhombohedral structure are oriented
in exactly this manner. By contrast, none of the inter-icosahedral bonds in
the C-centered cell are aligned in such a fashion. We therefore conclude
that the rhombohedral Bravais lattice is energetically preferred for two
reasons. On the one hand, one minimizes the number of inter-icosahedral
triangular interactions while on the other, one minimizes the μ₄ term by
maintaining the proper inter-icosahedral bond angles.

The structures of the boranes have been the subject of numerous
studies and are well understood. It is now known that borane structures
follow a set of principles generally referred to as Wade's rules for electron
deficient clusters.23 These rules state that BₙHₙ²⁻ dianions adopt a cluster
geometry in which the boron atoms lie at the vertices of a purely triangular
polyhedron. Each boron is bonded to just one hydrogen and these
boron-hydrogen bonds point radially outward from the center of the polyhedron.
The structures of three of these clusters (B₈H₈²⁻, B₉H₉²⁻ and B₁₀H₁₀²⁻) are
illustrated in Figure 7.
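
For a closed polyhedron whose faces are all triangles, the number of edges
and faces follows from the number of vertices alone. The sketch below is a
geometric aside rather than part of the calculations in this chapter: it uses
Euler's formula V - E + F = 2 together with 3F = 2E (each triangular face has
three edges and each edge is shared by two faces). The function name
triangulated_polyhedron_counts is hypothetical.

```python
def triangulated_polyhedron_counts(vertices):
    """Edges and faces of a closed polyhedron with only triangular faces."""
    edges = 3 * vertices - 6
    faces = 2 * vertices - 4
    # Consistency checks: Euler's formula and the triangle edge count.
    assert vertices - edges + faces == 2
    assert 3 * faces == 2 * edges
    return edges, faces


for n in (8, 9, 10, 12):
    edges, faces = triangulated_polyhedron_counts(n)
    print(n, edges, faces)
# 8 18 12
# 9 21 14
# 10 24 16
# 12 30 20
```

The last line recovers the 30 edges and 20 faces of the icosahedron, and the
first three lines give the polyhedral edge and face counts for the eight, nine
and ten vertex clusters discussed here.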
In earlier work we showed that second moment scaled Hückel
theory can be used both to understand Wade's rules in general and to
account for variations in specific boron-boron bond lengths. In our earlier
work, however, we did not consider other more traditional methods of
electronic structure calculation. Such comparisons are important. Indeed,
with modern computing capabilities, one can carry out rigorous ab-initio
Hartree-Fock calculations. The Hartree-Fock method is based on clear
physical assumptions whose strengths and weaknesses are well
catalogued.2a,24 For example, it is well known that Hartree-Fock theory
generally produces correct ground state geometries as well as consistent
vibrational spectra. A comparison of second moment scaled energies to
Hartree-Fock values therefore allows one not just to assess the ability of
second moment scaled theory to reproduce the global minimum geometry
(which we have already done from comparison to experiment) but also to
study the shape of the electronic energy surface near this minimum.

In studying the electronic energy surface we are constrained by a
number of factors. First, Hartree-Fock calculations are highly computer
intensive. Secondly, second moment scaled theory has several natural
limitations. We have therefore restricted our study to just a few degrees of
geometrical freedom. In particular, we consider dimensionless degrees of
freedom which change neither the point group of the molecule nor the B-H
bond distances. We consider only unitless degrees of freedom because
single determinantal theories (this includes both Hartree-Fock and Hückel
theory) break down in reproducing the electronic energy as a
function of the overall size of the chemical system. We limit ourselves to
variables which leave the point group symmetry of the molecule intact as
this not only drastically reduces the number of degrees of freedom but also
selects the geometrical variables which generally pose the greatest
difficulty to traditional symmetry based molecular orbital analysis. Finally,
we consider only distortions in the boron network, as second moment
scaled theory is limited to distortions that do not result in excessive variable
charge transfer. As the amount of charge transfer between the boron and
hydrogen atoms depends on the boron-hydrogen bond distance, we keep
such B-H distances constant.
In Figure 7 we illustrate the geometrical quantities used to define the
pertinent degrees of freedom in B₈H₈²⁻, B₉H₉²⁻ and B₁₀H₁₀²⁻. We
consider first the B₁₀H₁₀²⁻ molecule which has a D4d ground state
geometry. In B₁₀H₁₀²⁻ there are two symmetry inequivalent types of boron
atoms which lie at a distance either ra or rb away from the center of the
cluster. The angle between ra and rb is defined to be θ. The boron
positions in the B₁₀H₁₀²⁻ geometry are therefore controlled by three
parameters: ra, ra/rb and θ. The first parameter is size dependent while the
remaining two are dimensionless. As stated above we are interested only
in the latter parameters. In a similar manner B₈H₈²⁻ and B₉H₉²⁻ have
respectively D2d and D3h symmetry and respectively three and two
dimensionless boron positional parameters. We therefore optimized these
parameters for these three molecules using both second moment scaled
Hückel theory (μ₂ theory)6b and restricted Hartree-Fock theory at the
STO-3G, 3-21G and 6-31G* levels.2a,24 A comparison of some of these
results is given in Table II. The results are tabulated for all the symmetry
inequivalent B-B bond lengths, where the tabulated specific bond labels
refer to those illustrated in Figure 7. We also show in Table II the average
experimental X-ray structure bond lengths for these bond types.25 In
comparing these results we see that there is significant overall agreement
between the μ₂-theory, the Hartree-Fock calculations and the X-ray crystal
structure distances. The average difference in bond distances between the
RHF/6-31G* calculation and the experimental bond distances is 0.044 Å
while the corresponding difference between the μ₂-theory and experiment is
0.040 Å. In general the ordering of the bonds from shortest to longest is the
[Figure 7: drawings of the three clusters, labelled B₈H₈²⁻, B₉H₉²⁻ and
B₁₀H₁₀²⁻, with the distance rb marked.]

Figure 7. The B₈H₈²⁻, B₉H₉²⁻ and B₁₀H₁₀²⁻ clusters. Polyhedral vertices
represent BH units.

same for the μ₂-theory, the Hartree-Fock calculations and the X-ray
structures.
We now turn to the shape of the electronic surface of these three
molecules near the minimum energy geometries. We show in Figure 8 a
contour map of the three surfaces using μ₂ and RHF/6-31G* theory.
(For the sake of simplicity we consider only a two-dimensional surface for
B₈H₈²⁻.) Both theoretical and experimental minima are shown in the
figure. It may be seen that there is fairly good agreement between the
ab-initio and the μ₂ electronic surfaces, with the best agreement found for
B₉H₉²⁻ and the worst for B₈H₈²⁻. There are however several major
differences between these results. First, we should note that only the
Hartree-Fock theory is based on well-defined approximations and therefore only
for Hartree-Fock theory is it known, at least in principle, which additional
effects (such as configuration interaction) need to be included. Second,
there is a significant difference in computer time required for the
Hartree-Fock and the μ₂ scaled theory, with the Hartree-Fock calculations
being on the order of 10³ times slower. Third, it is easiest to rationalize the
geometrical factors responsible for the shape of these curves within the
context of μ₂-theory. This is so as we have a fairly large set of useful
molecular orbital techniques, including the fragment formalism and the
concept of orbital mixing, which can be used to explain μ₂-Hückel
molecular orbital energies. These approaches are particularly useful in the
context of Hückel theory as it is possible to evaluate the energy of a Hückel
orbital without considering other occupied molecular orbitals (as one has to
do with Hartree-Fock theory). Furthermore, Hückel energies depend
purely on the overlap of valence atomic orbitals. Such overlap can be
deduced by visual inspection of the molecular orbital (MO) shape. Finally,
in Hückel theory one does not need to calculate directly the difference in
energy between large nuclear-nuclear or electron-electron repulsions and
large electron-nuclear attractions, and one therefore obviates the need to
explain small differences between large numbers.

6. The θ parameter of B₁₀H₁₀²⁻


The fragment formalism and the concept of orbital mixing can be
used to account for the electronic energy surfaces of the borane dianions
described in the previous section. To illustrate this,
we consider here in detail the B₁₀H₁₀²⁻ ion. In the previous section (see
Figure 8) we saw that of the two dimensionless parameters in B₁₀H₁₀²⁻
only the angle variable, θ, has a significant effect on the electronic energy.
Indeed the minimum of the contour map in Figure 8c has more the form of
a trench than a local point minimum. The slopes of this trench correspond
to changes in θ. We therefore concentrate here on this parameter. In

[Figure 8: contour-map panels; legible panel titles are (a) B₈H₈²⁻ (μ₂),
(b) B₈H₈²⁻ (6-31G*) and (c) B₉H₉²⁻ (μ₂); legible axis labels are θ and ra/rb.]

Figure 8. Contour map of energy as a function of geometry for B₈H₈²⁻,
B₉H₉²⁻ and B₁₀H₁₀²⁻ from μ₂-Hückel and RHF 6-31G* calculations. Contour
lines represent 0.05 au. Open circles and crosses represent respectively
calculated and experimentally observed minimum energy geometries (see Figure 1).

Table III. Comparison of ΔE Sum for All MO's and Sum of
HOMO's of Each Irreducible Representation

θ      ΔE (All MO)a    ΔE (Sum of HOMO)b

45°    22.6 eV         24.3 eV
55°    3.1             3.9
60°    0.0             0.0
65°    2.1             3.5
75°    21.6            31.6

a. ΔE is the difference in energy between a given geometry and the θ=60°
geometry. In the middle column we sum all filled MO's.
b. In this column we sum the highest occupied molecular orbital energy of
each of the a1, b2, e1, e2 and e3 irreducible representations.

Figure 9 we have drawn a Walsh diagram for the individual MO's as a function of
θ (where the remaining variable, ra/rb, is held constant at 1.226). It may be
seen that for θ = 60° a large gap of MO energies appears between -13 eV
and -5 eV. This corresponds to the HOMO-LUMO gap. The HOMO-LUMO
gap is largest between 55° and 60°, which agrees well with the
optimized value of θ=60.8°. However, it may be seen that changes of
energy in the HOMO alone do not quantitatively account for the changes in
total energy. The penultimate occupied molecular orbital (POMO) and other
low lying orbitals also play significant roles. A good estimate of the
relation between energy and θ can be obtained if one analyzes the
irreducible representation label of the individual MO's. In particular, we
consider the sum of the HOMO energies of each type of irreducible
representation. As there are five types of filled MO's (the a1, b2, e1, e2 and
e3 representations) this sum consists of adding five separate MO energies
together. In Table III we compare the energy of this sum with the sum of
all filled MO's. In particular we calculate differences of energy between
alternate structures using the θ=60° geometry as the reference standard. As
Table III shows, there is reasonable agreement between the two columns of
energy differences. We therefore need to account for the energies of merely
these five orbitals in deducing the relation between geometry and the overall
energetics of the B₁₀H₁₀²⁻ ion.
From Figure 9 it may be seen that the evolution of the five
individual irreducible representation HOMO's may be divided into two
separate regimes. In the first regime between 60-90°, the a1, e1 and
e2-HOMO's rise in energy while the b1-LUMO falls rather sharply. In the
second regime between 30-60°, the a1 and e1-HOMO's drop in energy
while the b2 and e3-LUMO's rise. These changes correspond to specific
geometrical effects. As is illustrated in Figure 10, for θ=90° the molecule
has a planar octagon of boron atoms sandwiched between two apical boron
atoms. The overall point group symmetry is D8h. By contrast in the θ=30°
[Figure 9: Walsh diagram titled "B₁₀H₁₀²⁻ (μ₂-theory)"; the horizontal axis
runs over θ values including 75°, 60°, 45° and 30°.]

Figure 9. μ₂-Hückel theory Walsh diagram for B₁₀H₁₀²⁻ as a function of θ.
(See Figures 1 and 4.)

Figure 10. Geometry deformation of B₁₀H₁₀²⁻ as a function of θ. Open and
closed circles represent the upper and lower B₅H₅⁻ molecular fragments.

regime the B₁₀H₁₀²⁻ cluster has divided itself into two more or less isolated
fragments each of C4v symmetry.
We consider first the θ=90° geometry. We find here that of the
three bond types shown originally in Figure 7 for B₁₀H₁₀²⁻, only the a and
b bonds remain intact. The c-bond is completely broken. In terms of
symmetry, the D8h point group found at θ=90° has twice as many
symmetry elements as the original D4d point group. One may therefore
define a new irreducible representation sub-label depending on whether the
molecular orbital is symmetric or antisymmetric with respect to the central
mirror plane of the molecule (i.e., the plane which contains the octagon of
boron atoms). We shall use the letters σ and π for respectively the
symmetric and antisymmetric forms. There are comparatively few
molecular orbitals of π symmetry, the majority being π orbitals of the
central octagonal plane. In Figure 11 we illustrate those π-orbitals which
are relevant to our analysis. They are the a1, e1 and e2-LUMO's and the
e3-PUMO (for penultimate unoccupied molecular orbital). Lying relatively
near in energy to these four orbitals are the a1, e1 and e2-HOMO's and
the e3-LUMO. These latter four orbitals are all of σ-character and they
are also illustrated in Figure 12. As we allow θ to relax from the 90°
geometry, two effects occur. First, the σ and π sub-labels are lost and
hence mixing between the σ and π sets becomes allowed. Secondly, the
roles of the c and d-bonds change (the d bond becomes increasingly strong),
which leads to a corresponding change in the overall bonding character of
individual orbitals. Thus the a1-HOMO and the a1-LUMO, which were
formerly of respectively σ and π character, mix to form strong bonding and
antibonding combinations. The bonding combination is further
strengthened as the geometric distortion eventually converts the original π

[Figure 11: orbital drawings; legible labels are e2(3), e2π-LUMO; e3(3),
e3σ-LUMO; e3(4), e3π-PUMO.]

Figure 11. Pertinent HOMO's, LUMO's and PUMO's for B₁₀H₁₀²⁻ at θ=90°.



[Figure 12: three fragment orbital drawings, labelled:
a1-C4v(4): basis for a1-D4d(4) and b2-D4d(4);
e-C4v(3): basis for e1-D4d(3) and e3-D4d(3);
e-C4v(2): basis for e1-D4d(2) and e3-D4d(2).]

Figure 12. B₅H₅⁻ fragment orbitals.

orbital into a σ and π bonding one. In a similar fashion the e1 and
e2-HOMO's mix with respectively the e1 and e2-LUMO's. It is of some
interest that the same does not occur for the e3-LUMO and PUMO. An
examination of Figure 12 shows the reason for this. These orbitals cannot
mix well as they either form a σ-bonding and strong π-antibonding orbital
or conversely a σ-antibonding and π-bonding orbital. It is for this reason
that there is little stabilization of the e3-LUMO.
In rationalizing the MO diagram near θ=30°, it is easiest to use a
fragment orbital approach. The natural fragments are the two rather isolated
B₅H₅ units illustrated in Figure 10. These fragments are of C4v symmetry.
There are three types of occupied MO's. The labels for these orbitals are
a1(C4v), b1(C4v) and e(C4v). Upon interaction of the separate fragments
the a1(C4v) orbitals combine to form a1(D4d) and b2(D4d) irreducible
representations, the b1(C4v) is the basis for the e2(D4d) irreducible
representation and the e(C4v) become the e1(D4d) and e3(D4d)

representations. We illustrate in Figure 12 the slightly bonding e(C4v)
orbital which later forms the e3(D4d)-LUMO and the e1(D4d)-HOMO as
well as the more bonding a1(C4v) orbital which transforms into the b2(D4d)-
LUMO and the a1(D4d)-HOMO. Lastly, we illustrate the rather bonding
e(C4v) orbital which transforms into the e3(D4d)-HOMO and the e1(D4d)-
POMO. It may be seen that of these three orbitals only the first two fragment
orbitals have their p lobes pointed facing one another. It is therefore these
two orbitals which mix to form strong bonding and antibonding
combinations. The remaining e(C4v) orbital does not divide in this dramatic
manner. Finally, it should be noted that the pair of b1(C4v) orbitals do not
interact with one another but instead form a degenerate pair of orbitals of
e2(D4d) symmetry.
The above analysis explains in a rather complete manner the role of
MO's in determining the shape of the electronic energy surface. The key
orbitals which control the total energy are the a1, e1, e2 and e3-HOMO's. The
a1 and e1-HOMO's both have a parabolic shape as one changes θ from 30 to
90°, as at both 30° and 90° important orbital mixing occurs (as was
discussed above). By contrast, the e2-HOMO energy is flat near 30° but
increases sharply at 90°. This is due to mixing being allowed by symmetry
at the high angle portion of the Walsh diagram and not the low angle
portion. Similar arguments can be used to account for the remaining b2 and
e3-HOMO's.
We now consider the comparable analysis using Hartree-Fock
theory. This theory differs from Hückel theory in several respects. In the
Hartree-Fock theory the electronic energy is not just the sum of filled
molecular orbital energies.24 Instead for restricted Hartree-Fock (RHF)
theory:

$$E_{tot} = E_{elec} + E_{nuc} \tag{5}$$

$$E_{elec} = 2\sum_{i} h_i + \sum_{i,j} \left(2J_{ij} - K_{ij}\right) \tag{6}$$

$$\epsilon_i = h_i + \sum_{j} \left(2J_{ij} - K_{ij}\right) \tag{7}$$

where i and j are indices of filled spatial orbitals, εi are eigenvalues of the
Fock operator, hi is the electronic kinetic energy plus the electron-nuclear
attractive energy of an electron in the ith orbital, Jij and Kij are respectively
the Coulombic and exchange energies, Etot is the total energy, Eelec the sum
of all electronic kinetic energy plus electron-nuclear and electron-electron
energies and Enuc is the repulsive nuclear-nuclear potential energy.
It may be seen from equations (5) - (7) that the sum of the filled
molecular orbital eigenvalues (i.e., the sum of all εi) is not equal to Eelec or

Etot, as this sum of εi double counts the electron-electron repulsive
interactions. This is fundamentally different from Hückel theory where the
sum of the energies of all filled molecular orbitals equals Etot.
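
The double counting can be illustrated numerically. The sketch below evaluates
the closed-shell expressions for the electronic energy and the orbital
eigenvalues for two doubly occupied orbitals. The integral values are invented
toy numbers chosen only so that the arithmetic is easy to follow; they do not
come from any borane calculation, and the function name rhf_energies is
hypothetical.

```python
def rhf_energies(h, J, K):
    """Closed-shell electronic energy and orbital eigenvalues.

    h[i]    : one-electron energy of filled spatial orbital i
    J[i][j] : Coulomb integral, K[i][j] : exchange integral
    """
    n = len(h)
    # Two-electron combination 2J - K that appears in both expressions.
    g = [[2 * J[i][j] - K[i][j] for j in range(n)] for i in range(n)]
    repulsion = sum(g[i][j] for i in range(n) for j in range(n))
    e_elec = 2 * sum(h) + repulsion
    eigenvalues = [h[i] + sum(g[i]) for i in range(n)]
    return e_elec, eigenvalues, repulsion


# Toy integrals in arbitrary units (assumed values, with K[i][i] == J[i][i]).
h = [-3.0, -2.5]
J = [[0.75, 0.5], [0.5, 0.75]]
K = [[0.75, 0.125], [0.125, 0.75]]

e_elec, eigenvalues, repulsion = rhf_energies(h, J, K)
eigenvalue_sum = 2 * sum(eigenvalues)  # two electrons per spatial orbital

print(e_elec)                   # -7.75
print(eigenvalue_sum)           # -4.5
print(eigenvalue_sum - e_elec)  # 3.25, equal to the repulsion term
```

The sum of occupied eigenvalues exceeds the electronic energy by exactly the
electron-electron term, which is the double counting described above.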
The different roles played by the Hückel and Hartree-Fock
eigenvalues have several consequences in the analysis of the calculational
results. To demonstrate these differences we now consider the relation
between electronic energy and the θ angle discussed earlier for B₁₀H₁₀²⁻.
In our earlier work we showed that both the μ₂-Hückel and Hartree-Fock
calculations gave similar curves for the relation between Etot and θ, with a
minimum energy for both calculations at θ=60°. In explaining the
μ₂-Hückel results we found that the HOMO-LUMO gap is largest near 60° and
that this gap was due to mixing of HOMO and LUMO orbitals of the
various different irreducible representations. This same explanation cannot
be used to account for the Hartree-Fock results. In Figure 13 we plot the
HOMO-LUMO gap energy (for εi) for our RHF 6-31G* calculation. It
may be seen that the Hartree-Fock HOMO-LUMO gap is largest at 51°.
This differs significantly from the observed θ angle. Hartree-Fock
HOMO-LUMO gap energies are clearly not a reliable indicator of the B₁₀H₁₀²⁻
equilibrium geometry. In fact, interpretation of the Hartree-Fock results is
in many ways quite subtle. In Figure 14 we show Eelec for the RHF
6-31G* calculations as a function of θ. It may be seen that Eelec is at a
maximum for the true equilibrium value of θ=60°. It is in fact only the
inclusion of Enuc, i.e., the nuclear-nuclear repulsion energies, which
converts the θ=60° geometry into an energy minimum. Natural energies
found by the RHF method, such as Σhi, Σεi, Enuc or Eelec, do not
individually give a clear picture of the overall relation between energy and
geometrical structure. (Exceptions are the total kinetic and potential
energies which due to the virial theorem are in principle equal to Etot
multiplied by a constant.) The problem is that the individual terms of the
RHF calculation are either predominantly attractive or predominantly
repulsive. It is therefore difficult to develop a sum of portions of these
repulsive and attractive energies which conveys a fair representation of Etot.
This is in strong contrast to μ₂-Hückel theory where each molecular orbital
energy, through the μ₂-scaling assumption, already contains an apparently
useful combination of attractive and repulsive terms.

Figure 13 HOMO-LUMO gap energies (ELUMO - EHOMO) as a function of θ for
RHF 6-31G* B₁₀H₁₀²⁻ calculations.

Figure 14 RHF 6-31G* Eelec as a function of θ for B₁₀H₁₀²⁻.

7. Reaction Pathways
In the previous sections of this chapter we examined the electronic
energy surface near the equilibrium geometries of B₈H₈²⁻, B₉H₉²⁻ and
B₁₀H₁₀²⁻. We showed that μ2-Hückel theory provided electronic energy
surfaces near the global energy minimum that were in reasonable accord
with ab-initio theory. In this section we consider the energy surface away
from the global minimum geometry. We examine here alternate local
minima and the pathways which connect these local minima to the ground
state global minimum. In particular we study the isomer chemistry of the
molecules B₈H₈²⁻ and C₂B₈H₁₀²⁻ as well as the reaction pathways which
connect these isomers. Again, we compare μ2-Hückel theory with ab-initio
calculations at various levels (STO-3G, 3-21G and 6-31G*). It should
however be noted at the outset that both the ab-initio and the μ2 calculations
will be less quantitatively accurate than the results of the preceding sections.
We consider first the B₈H₈²⁻ molecule. It is known that the B₈H₈²⁻
ion in solution has only one ¹¹B NMR peak.26 This single resonance is
inconsistent with the two inequivalent boron sites in the equilibrium
D2d structure found by X-ray single crystal studies. It is generally believed
that there is a simple reaction mechanism which scrambles the boron atoms
of B₈H₈²⁻.4,27 Several ab-initio studies have suggested that a reasonable
intermediate in this process is the C2v geometry shown in Figure 15.4 We
have therefore optimized the B₈H₈²⁻ molecule in this C2v geometry using
μ2-scaled Hückel theory. We find that the optimized C2v geometry is 7.2
kcal/mole higher in energy than the ground state D2d geometry. This
difference in energy is comparable to the 6-31G* and STO-3G energy
differences, which are found to be 0.85 kcal/mole and 4.5 kcal/mole
respectively. The error between the μ2-scaled energy and the 6-31G* energy
is therefore slightly greater than the error between RHF calculations with
different basis sets. The optimized bond distances of the 6-31G* and
μ2-Hückel calculations can be directly compared, as is shown in Table IV.
We next consider a reaction pathway between the C2v and D2d
minimum geometries. As a first trial we considered the reaction pathway
where every atom moves in a linear fashion from the initial C2v geometry to
the final D2d geometry. We scale this transformation with a parameter q
such that at q=0 the molecule has the μ2-Hückel optimized C2v geometry
while at q=1.0 the geometry corresponds to the D2d one. In Figure 16 we
show the difference in energy along this pathway for the μ2-scaled theory
and for RHF calculations using the STO-3G, 3-21G and 6-31G* basis
sets. It may be seen that all calculations are in reasonable agreement. It
may also be seen that the barrier height calculated in Figure 16 is in excess
of 3 eV. This barrier is clearly too high for this to be the actual pathway by
which the C2v and D2d geometries interconvert. Indeed it has been reported
that at the 6-31G**(MP2) level it is possible to find a reaction coordinate
connecting the C2v and D2d geometries along which there is spontaneous
rearrangement.4c We therefore need to develop a program to find reaction
coordinates for μ2-scaled Hückel theory in order to directly compare
reaction coordinates at the ab-initio level to the μ2-scaled theory.
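The linearly interpolated pathway described above can be sketched in a few lines. This is only an illustration of the interpolation itself, not the authors' program: the function name and the two-atom coordinates are hypothetical, and the sketch assumes both geometries list the atoms in the same order and in a common orientation.

```python
def interpolate_geometry(start, end, q):
    """Return the geometry at fraction q of a straight-line path from start to end."""
    if len(start) != len(end):
        raise ValueError("both geometries need the same number of atoms")
    return [
        tuple(a + q * (b - a) for a, b in zip(atom_start, atom_end))
        for atom_start, atom_end in zip(start, end)
    ]


# Hypothetical two-atom coordinates in angstroms, for illustration only.
geom_q0 = [(0.0, 0.0, 0.0), (1.6, 0.0, 0.0)]  # stands in for the q = 0 geometry
geom_q1 = [(0.0, 0.0, 0.0), (2.0, 0.4, 0.0)]  # stands in for the q = 1 geometry

# Eleven equally spaced points, q = 0.0, 0.1, ..., 1.0.
path = [interpolate_geometry(geom_q0, geom_q1, step / 10) for step in range(11)]
```

An energy calculation at each geometry in `path` then gives a curve of the kind plotted against q. Because every atom moves on a straight line, such a path is generally not a minimum-energy path, which is consistent with the high barrier reported above.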
Figure 15 Optimized C2v geometry for B₈H₈²⁻.

Table IV Bond distances (in Å) for B₈H₈²⁻ in the C2v geometry

Bond    μ2-Hückel    6-31G*
a       2.048        2.001
b       1.688        1.684
c       1.871        1.855
d       1.644        1.688
e       1.884        1.826
f       1.848        1.764
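As a worked comparison of the two columns of Table IV, the short sketch below computes the deviation for each bond and the mean absolute deviation. The numbers are copied from the table; the summary statistics are an illustration added here and are not quoted in the text.

```python
# Bond distances from Table IV in angstroms: (mu2-Hueckel, 6-31G*).
table_iv = {
    "a": (2.048, 2.001),
    "b": (1.688, 1.684),
    "c": (1.871, 1.855),
    "d": (1.644, 1.688),
    "e": (1.884, 1.826),
    "f": (1.848, 1.764),
}

# Signed deviation of the mu2-Hueckel distance from the 6-31G* distance.
deviations = {
    bond: huckel - ab_initio for bond, (huckel, ab_initio) in table_iv.items()
}
mean_abs_dev = sum(abs(d) for d in deviations.values()) / len(deviations)
largest = max(deviations, key=lambda bond: abs(deviations[bond]))

print(f"mean absolute deviation: {mean_abs_dev:.3f} A")
print(f"largest deviation: bond {largest} ({deviations[largest]:+.3f} A)")
```

With the Table IV values the mean absolute deviation is about 0.04 Å, and the largest discrepancy is for bond f (0.084 Å).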
Figure 16 Energy as a function of distortion coordinate, q, between the C2v
and D2d isomers of B₈H₈²⁻. (Curves shown: Hückel, 6-31G*, 3-21G* and STO-3G.)

Figure 17 The nido-10 (iv + iv) (q = 0.0) and nido-10 (vi) (q = 1.0)
C₂B₈H₁₀²⁻ geometries. Large and small circles represent respectively CH
and BH units.
Table V Bond distances (in Å) for isomers of C₂B₈H₁₀²⁻ (in C2v geometry).

nido-10 (vi), q = 1

Bond    μ2-Hückel    6-31G*
b       1.589        1.536
c       1.748        1.652
d       1.927        2.093
e       1.871        1.858
f       2.007        2.199
g       1.829        1.844
h       1.903        1.842

nido-10 (iv + iv), q = 0

Bond    μ2-Hückel    6-31G*
a       1.596        1.556
b       1.718        1.709
c       1.798        1.787
d       1.698        1.717
e       1.648        1.754
f       1.740        1.796
g       1.867        1.808
h       1.797        1.817

We now turn to the C₂B₈H₁₀²⁻ geometry. The only known
geometry of this molecule is a deltahedron with one open hexagonal face.28
This known structure is illustrated in Figure 17. In the Williams
nomenclature22 this structure is a nido-10 (vi) cluster (the 10 refers to the
number of main group cluster atoms and the roman numeral vi to the
number of atoms on the open face). In studying this molecule with
μ2-Hückel theory we found that the global minimum was indeed this
nido-10 (vi) geometry illustrated in Figure 17. However, in earlier work29 we
found a second energy minimum, different from the known structure,
which contains an additional C-C bond across the center of the open
hexagonal face. In the Williams nomenclature this structure is nido-10 (iv
+ iv). This as yet experimentally unobserved isomer is also illustrated in
Figure 17. We wished to see if this local minimum predicted by μ2-scaled
Hückel theory is also found by ab-initio methods. We therefore performed
RHF calculations at the 6-31G* level. We found that indeed the rather
unexpected geometry shown in Figure 17 is a local minimum in ab-initio
theory. We compare bond distances between the ab-initio and μ2-scaled
Hückel theory for both isomers in Table V.

Figure 18 Energy as a function of distortion coordinate, q, between the
C₂B₈H₁₀²⁻ nido-10 (iv + iv) and nido-10 (vi) geometries: (a) 6-31G* energy,
(b) μ2-Hückel energy, (c) HOMO energy.

Figure 19 Principal atomic orbitals in the μ2-Hückel C₂B₈H₁₀²⁻ HOMO at
q = 0.0, q = 0.4 and q = 1.0.
In an analogous manner to our earlier study on B₈H₈²⁻ we
calculated the energy as a function of a linearly interpolated reaction
pathway between the two optimized C2v C₂B₈H₁₀²⁻ geometries, where we
set q = 0.0 and q = 1.0 to correspond respectively to the geometries with
and without the central carbon-carbon bond. These results are shown in
Figure 18. The ab-initio and μ2-scaled Hückel theory calculations are in fair
qualitative agreement; in both we find a local maximum near q=0.40.
However, the energies of the μ2-scaled calculations are off by a factor of
two from the ab-initio results. None of the previous calculations reported
in this paper had errors of this magnitude. We believe this error lies in the
lack of Madelung energy contributions in μ2-scaled Hückel theory. In
the C₂B₈H₁₀²⁻ molecule the formal oxidation state of the carbon atoms
changes as the central carbon-carbon bond is broken. We believe this
change in oxidation state results in an ionic energy contribution which is not
accounted for by μ2-Hückel theory. None of the earlier examples had a
comparable change in oxidation state.
We can readily understand the geometric origin of the μ2-Hückel
energies. For this system the total Hückel electronic energy is well
modelled by the changes in the HOMO energy alone. In Figure 18c we plot
just the energy of this single orbital. It reproduces quite well the full
Hückel energy plot of Figure 18b. We show the changing form of this
HOMO in Figure 19. At q=0.0 the HOMO is predominantly carbon-carbon
σ bonding. By contrast, the structure at q = 1.0 is stabilized by allowed
mixing of these carbon p-orbitals with an unoccupied, mainly boron orbital of
the same symmetry. This mixing changes the HOMO into a carbon to
boron π bonding MO. In the intermediate geometry there is neither a strong
C-C σ bond nor a C-B π bond. It is for this reason that there is an
energy maximum at q=0.4. Finally we note it would be of
interest to find chemical variations which will stabilize the as yet unknown
nido-10 (iv + iv) geometry.
8. Conclusion

In many ways these last results show succinctly the advantages and
disadvantages of the μ2-scaled technique when compared to ab-initio
theory. One of the advantages is that because the μ2 method uses Hückel
theory we can readily understand on a qualitative level the precise electronic
factors which influence the total energy. A second advantage is that
μ2-Hückel theory can be carried out quickly and at low cost. This low cost
allows one great latitude in the number of geometries one chooses to
study. Our results for C₂B₈H₁₀²⁻ suggest that μ2-Hückel theory can be
used to find potential new isomers which can then be explicitly tested at a
more accurate ab-initio level. The disadvantage of μ2-Hückel theory is
its incomplete modeling of the various factors which control the electronic
energy. For example, μ2-Hückel theory does not as yet contain terms
which model the relation between charge transfer or ionic energies and
structure. This limits the applications of the method for non-covalent
systems. We believe that in the end a combination of both approaches leads
to the clearest picture of the bonding in the boranes as well as other covalent
and nearly covalent compounds. Hartree-Fock calculations will allow the
chemist to assess the full electronic energy. By contrast μ2-Hückel theory
will let one measure the pure covalent forces. It in turn will form a bridge
to such qualitative molecular orbital ideas as the fragment formalism,
symmetry analysis and the isolobal analogy. With these tools the chemist
can form a vivid and accurate picture of the bonding in both molecules and
solids.

Acknowledgements

This research was supported by funds from the Petroleum Research
Fund. The research would not have been possible without the computer
programs developed by R. Hoffmann, M.-H. Whangbo, M. Evain, T.
Hughbanks, S. Wijeyesekera, M. Kertesz, C. N. Wilker, C. Zheng, J. K.
Burdett and G. Miller.
References
1. Wigner, E. P. and Seitz, F. in Solid State Physics, edited by Seitz,
F. and Turnbull, D. (Academic, New York, 1955), Vol. 1, p. 97.
2. a) Hehre, W. M., Radom, L., Schleyer, P. v. R. and Pople, J. A.
Ab Initio Molecular Orbital Theory, Wiley-Interscience
Publications, New York, 1986 b) Hafner, J. From Hamiltonians to
Phase Diagrams, Springer-Verlag, New York, 1987.
3. a) Woodward, R. B. and Hoffmann, R. The Conservation of
Orbital Symmetry, VCH, New York, 1970 b) Albright, T. A.,
Burdett, J. K. and Whangbo, M. H. Orbital Interactions in
Chemistry, Wiley-Interscience Publications, Toronto, 1985.
4. a) Wales, D. J., Bone, R. G. A. J. Am. Chem. Soc. 1992,
114, 5394 b) Bausch, J. W., Surya Prakash, G. K., Williams, R.
E. Inorg. Chem. 1992, 31, 3763 c) Bühl, M., Mebel, A. M.,
Charkin, O. P., Schleyer, P. v. R. Inorg. Chem. 1992, 31, 3769.
5. a) This method was proposed independently for AB (main group-
transition metal) by Pettifor, D. G. and Podloucky, R. Phys. Rev. Lett.
1984, 53, 1080 and for the Peierls distortion by Burdett, J. K. and
Lee, S. J. Am. Chem. Soc. 1985, 107, 3063. Other papers whose
results are based on this method include b) Pettifor, D. G. J. Phys.
C 1986, 19, 285 c) Cressoni, J. C. and Pettifor, D. G. J. Phys.
Condens. Matter 1991, 3, 495, and the references cited below in
Ref. 6.
6. a) Lee, S. J. Am. Chem. Soc. 1991, 113, 101; 1991, 113, 8611
b) Hoistad, L. M., Lee, S. and Pasternak, J. ibid. 1992, 114,
4790 c) Hoistad, L. M. and Lee, S. ibid. 1991, 113, 8216 d)
Lee, S. Acc. Chem. Res. 1991, 24, 249 e) Lee, S. Inorg. Chem.
1992, 31, 3063 f) Lee, S., Hoistad, L. M. and Carter, S. T. New
J. Chem. 1992, 16, 651.
7. a) Chadi, D. J. Phys. Rev. B 1979, 19, 2074 b) Chadi, D. J.
Phys. Rev. B 1984, 29, 785 c) Harrison, W. A. Phys. Rev. B
1981, 24, 385 d) Wang, W. R. and Duke, C. S. Phys. Rev. B
1987, 36, 2736 e) Verges, J. A. and Yndurain, F. Phys. Rev. B
1988, 34, 4333 f) Chadi, D. J. Phys. Rev. Lett. 1978, 41, 1062.
8. Foulkes, W. M. C. and Haydock, R. Phys. Rev. B 1989, 12,
520.
9. See discussion in Phillips, C. S. G. and Williams, R. J. P.
Inorganic Chemistry, Vol. I, Oxford University Press, New York,
1965.
10. Heine, V., Robertson, I. J. and Payne, M. C. in Bonding and
Structure of Solids, edited by Haydock, R., Inglesfield, J. E. and
Pendry, J. B., Royal Society, London, 1991.
11. Friedel, J. Adv. Phys. 1954, 3, 446; Cyrot-Lackmann, F. J.
Phys. 1970, C1, 67.
12. Pettifor, D. G. J. Phys. 1986, 19, 285.
13. Wolfsberg, M. and Helmholz, L. J. Chem. Phys. 1957, 20, 83.
14. Many important atomic parameters are used and discussed in a)
Hoffmann, R. J. Chem. Phys. 1963, 39, 1397; Anderson, A. B.
and Hoffmann, R. ibid. 1974, 60, 4271 b) Rossi, A. R. and
Hoffmann, R. Inorg. Chem. 1975, 14, 365 c) Hay, P. J.,
Thibeault, J. C. and Hoffmann, R. J. Am. Chem. Soc. 1975, 97,
4884 d) Elian, M. and Hoffmann, R. Inorg. Chem. 1975, 14,
1058 e) Summerville, R. H. and Hoffmann, R. J. Am. Chem.
Soc. 1976, 98, 7240 f) Lauher, J. W. and Hoffmann, R. ibid.
1976, 98, 1729 g) Komiya, S., Albright, T. A. and Hoffmann, R.
Inorg. Chem. 1978, 17, 126 h) Hughbanks, T., Hoffmann,
R., Whangbo, M.-H., Stewart, K., Eisenstein, O. and Canadell, E.
J. Am. Chem. Soc. 1982, 104, 3876 i) Thorn, D. and Hoffmann,
R. Inorg. Chem. 1978, 17, 126.
15. Clementi, E. and Roetti, C. At. Data Nucl. Data Tables 1974, 14,
177; Mann, J. B. Atomic Structures Calculations, 1: Hartree-Fock
Energy Results for Elements Hydrogen to Lawrencium
(Clearinghouse for Tech. Lit., Springfield, 1967).
16. Burdett, J. K. and Lee, S. J. Am. Chem. Soc. 1985, 107, 3050
and 3065.
17. Coulson, C. A. and Rushbrooke, G. S. Proc. Cambridge Philos.
Soc. 1940, 36, 193.
18. An earlier report of this work, without the decomposition into
moments, is given in Lee, S. J. Am. Chem. Soc. 1991, 113,
8611. A similar report is given in Cressoni, J. C. and Pettifor, D.
G. J. Phys. C, submitted for publication.
19. Donohue, J. The Structures of the Elements (Wiley, New York,
1974).
20. The location of the nodes in the difference of energy curves is
sensitive to the values of Eu and El. A tail in the DOS running to
the left or right shifts the nodes in the direction of the tail. See
discussion of this in J. Am. Chem. Soc. 1988, 110, 8000.
21. This work originally appeared as Lee, S., Rousseau, R. and Wells,
C. Phys. Rev. B 1992, 46, 12121.
22. Williams, R. E. In Electron Deficient Boron and Carbon Clusters,
Olah, G. A., Wade, K., Williams, R. E. Eds., Wiley, New York,
1991.
23. a) Wade, K. Adv. Inorg. Chem. Radiochem. 1976, 18, 1 b)
Rudolph, R. W.; Pretzer, W. R. Inorg. Chem. 1972, 11, 1974 c)
Williams, R. E. Inorg. Chem. 1971, 11, 210 d) Stone, A. J.
Inorg. Chem. 1981, 20, 563 e) Mingos, D. M. P. Acc. Chem.
Res. 1984, 17, 311 f) Lipscomb, W. N. Boron Hydrides;
Benjamin: New York, 1963 g) Muetterties, E. L., Knoth, W. H.
Polyhedral Boranes; Wiley: New York, 1968 h) Bühl, M.;
Schleyer, P. v. R. In Electron Deficient Boron and Carbon Clusters;
Olah, G. A., Wade, K., Williams, R. E. Eds.; Wiley: New York,
1991; p 113 i) McKee, M. L. J. Am. Chem. Soc. 1992, 114, 879
j) Bühl, M.; Schleyer, P. v. R. J. Am. Chem. Soc. 1992, 114,
477 k) McKee, M. L. J. Am. Chem. Soc. 1991, 113, 9448.
24. Szabo, A., Ostlund, N. S. Modern Quantum Chemistry, McGraw-
Hill, New York, 1982.
25. a) B₈H₈²⁻: Guggenberger, L. J. Inorg. Chem. 1969, 8, 2771 b)
B₉H₉²⁻: Guggenberger, L. J. Inorg. Chem. 1968, 7, 2261 c)
B₁₀H₁₀²⁻: Gill, J. T.; Lippard, S. J. Inorg. Chem. 1975, 14, 751.
26. Muetterties, E. L., Wiersema, R. J., Hawthorne, M. F. J. Am.
Chem. Soc. 1973, 95, 7520.
27. a) Mingos, D. M. P. and Wales, D. J. In Electron Deficient Boron
and Carbon Clusters, Olah, G. A., Wade, K., Williams, R. E. Eds.,
Wiley, New York, 1991 b) Rodgers, A. and Johnson, B. F. G.
Polyhedron 1988, 7, 1107.
28. Stibr, B., Janousek, Z., Base, K., Hermanek, S., Plesek, J. and
Zakharova, I. Collect. Czech. Chem. Commun. 1984, 49, 1891.
29. Lee, S. Inorg. Chem. 1992, 31, 3063.
POLYHEDRAL DYNAMICS

R. B. KING
Department of Chemistry
University of Georgia
Athens, Georgia 30602, USA

1. Introduction

The concept of a polyhedron is a useful way of describing diverse chemical
structures. In such a context a polyhedron may be regarded as a set consisting of
(zero-dimensional) points, namely its vertices; (one-dimensional) lines connecting some of the
vertices, namely its edges; and (two-dimensional) surfaces formed by the edges, namely its
faces. Polyhedra can appear in chemical structures as coordination polyhedra, in which the
vertices represent ligands surrounding a central atom which is often, but not always, a
metal, and cluster polyhedra, in which the vertices represent multivalent atoms and the edges
represent bonding distances. Deltahedra, in which all faces are triangles, are a special type
of polyhedra which appear often in chemical contexts. Chemically significant deltahedra
are depicted in Figure 1. The topology of a polyhedron can be described by a graph, called
the 1-skeleton of the polyhedron.1 The vertices and edges of the 1-skeleton correspond to
the vertices and edges, respectively, of the underlying polyhedron.
The role of polyhedra in the static description of chemical structures makes the
dynamic properties of polyhedra also of considerable interest. This chapter summarizes
some topological and graph-theoretical aspects of polyhedral dynamics, which relate to the
changes in the configuration of atoms in a polyhedral molecule as a function of time. The
central concept in the study of polyhedral dynamics is that of a polyhedral isomerization.
In this context a polyhedral isomerization may be defined as a deformation of a specific
polyhedron P1 until its vertices define a new polyhedron P2. Of particular interest are
sequences of two polyhedral isomerization steps P1 → P2 → P3 in which the polyhedron P3 is
combinatorially equivalent to the polyhedron P1, although with some permutation of its
vertices, not necessarily the identity permutation. In this sense two polyhedra P1 and P2
may be considered to be combinatorially equivalent1 whenever there are three one-to-one
mappings V, E, and F from the vertex, edge, and face sets of P1 to the corresponding sets
of P2 such that incidence relations are preserved. Thus if a vertex, edge, or face α of P1 is
incident to or touches upon a vertex, edge, or face β of P1, then the images of α and β
under V, E, or F are incident in P2.2
D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 109-135.
© 1994 Kluwer Academic Publishers.
Figure 1: Some chemically significant deltahedra. Deltahedra with tetrahedral
chambers (degree 3 vertices): tetrahedron, trigonal bipyramid, capped octahedron.
Other deltahedra shown: octahedron, pentagonal bipyramid, bisdisphenoid ("D2d
dodecahedron"), 3,3,3-tricapped trigonal prism, 4,4-bicapped square antiprism,
edge-coalesced icosahedron, regular icosahedron.

Consider a polyhedral isomerization sequence P1 → P2 → P3 in which P1 and P3 are
combinatorially equivalent. Such a polyhedral isomerization sequence may be called a
degenerate polyhedral isomerization with P2 as the intermediate polyhedron. Structures
undergoing such degenerate isomerization processes are often called fluxional.3 A
degenerate polyhedral isomerization with a planar intermediate "polyhedron" (actually a
polygon) may be called a planar polyhedral isomerization. The simplest example of a
planar polyhedral isomerization is the interconversion of two enantiomeric tetrahedra (P1
and P3) through a square planar intermediate P2. Except for this simplest example, planar
polyhedral isomerizations are unfavorable owing to excessive intervertex repulsion.
Polyhedral isomerizations may be treated from either the macroscopic or
microscopic points of view. The earliest work in this area, pioneered by Muetterties,4,5,6,7
Gielen,8,9,10,11,12,13,14,15 Musher,16,17 Klemperer,18,19,20 and Brocas,21,22 focussed on
the macroscopic picture, namely relationships between different permutational isomers.
Such relationships may be depicted by reaction graphs called topological representations, in
which the vertices correspond to different permutational isomers and the edges correspond
to single degenerate polyhedral isomerization steps. Subsequent work treats the
microscopic picture, in which the details of polyhedral topology are used to elucidate
possible single polyhedral isomerization steps, namely which types of isomerization steps
are possible. Both approaches will be reviewed in this chapter, which expands and updates
a review published by the author in 1988.23 In addition, many of the ideas presented in
this chapter are also discussed in a book recently published by the author.24

2. The Topology of Polyhedra

Before considering polyhedral dynamics it is first necessary to consider the static
topology of polyhedra. Of fundamental importance are relationships between possible
numbers and types of vertices (v), edges (e), and faces (f) of polyhedra. In this connection
the following elementary relationships are particularly significant:25

1. Euler's relationship:

$$v - e + f = 2 \qquad (1)$$

This arises from the properties of ordinary three-dimensional space.

2. Relationship between the edges and faces:

$$\sum_{i=3}^{v-1} i f_i = 2e \qquad (2)$$

In equation 2 $f_i$ is the number of faces with i edges (i.e., $f_3$ is the number of triangular
faces, $f_4$ is the number of quadrilateral faces, etc.). This relationship arises from the fact
that each edge of the polyhedron is shared by exactly two faces. Since no face can have
fewer edges than the three of a triangle, the following inequality must hold in all cases:

$$3f \le 2e \qquad (3)$$

3. Relationship between the edges and vertices:

$$\sum_{i=3}^{v-1} i v_i = 2e \qquad (4)$$

In equation 4 $v_i$ is the number of vertices of degree i (i.e., having i edges meeting at the
vertex). This relationship arises from the fact that each edge of the polyhedron connects
exactly two vertices. Since no vertex of a polyhedron can have a degree less than three, the
following inequality must hold in all cases:

$$3v \le 2e \qquad (5)$$

4. Totality of faces:

$$\sum_{i=3}^{v-1} f_i = f \qquad (6)$$

5. Totality of vertices:

$$\sum_{i=3}^{v-1} v_i = v \qquad (7)$$

Equation 6 relates the $f_i$'s to f and equation 7 relates the $v_i$'s to v.
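Relationships 1 to 7 can be checked mechanically for any proposed polyhedron. The sketch below is only an illustration: the function name and the dictionary input format (number of faces of each size, number of vertices of each degree) are assumptions made here and do not come from the text.

```python
def check_polyhedron(face_sizes, vertex_degrees):
    """Check equations 1-7 from {edges per face: count} and {vertex degree: count}."""
    f = sum(face_sizes.values())                                   # equation 6
    v = sum(vertex_degrees.values())                               # equation 7
    twice_e_faces = sum(i * n for i, n in face_sizes.items())      # equation 2
    twice_e_vertices = sum(i * n for i, n in vertex_degrees.items())  # equation 4
    if twice_e_faces != twice_e_vertices or twice_e_faces % 2:
        raise ValueError("face and vertex data imply different edge counts")
    e = twice_e_faces // 2
    return {
        "v": v,
        "e": e,
        "f": f,
        "euler": v - e + f == 2,     # equation 1
        "3f<=2e": 3 * f <= 2 * e,    # inequality 3
        "3v<=2e": 3 * v <= 2 * e,    # inequality 5
    }


# Trigonal bipyramid: six triangular faces, two degree 3 and three degree 4 vertices.
trigonal_bipyramid = check_polyhedron({3: 6}, {3: 2, 4: 3})

# Square antiprism: eight triangles and two squares, eight degree 4 vertices.
square_antiprism = check_polyhedron({3: 8, 4: 2}, {4: 8})
```

For a deltahedron every face is a triangle, so equation 2 reduces to 3f = 2e and Euler's relationship then gives e = 3v - 6 and f = 2v - 4.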


In generating actual polyhedra, the operations of capping and dualization are often
important. Capping a polyhedron P1 consists of adding a new vertex above the center of
one of its faces F1 followed by adding edges to connect the new vertex with each vertex of
F1. This capping process gives a new polyhedron P2 having one more vertex than P1. If a
triangular face is capped, the following relationships will be satisfied, where the subscripts
1 and 2 refer to P1 and P2, respectively: v2 = v1 + 1; e2 = e1 + 3; f2 = f1 + 2. Such a
capping of a triangular face is found in the capping of a tetrahedron to form a trigonal
bipyramid, i.e.:

Tetrahedron (v = 4, e = 6, f = 4) → capping triangular face → Trigonal Bipyramid (v = 5, e = 9, f = 6)

In general if a face with fk edges is capped, the following relationships will be satisfied:
v2 = v1 + 1; e2 = e1 + fk; f2 = f1 + fk - 1. An example of such a capping process converts
a square antiprism into a capped square antiprism, i.e.:

Square Antiprism (v = 8, e = 16, f = 10) → capping square face → Capped Square Antiprism (v = 9, e = 20, f = 13)
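The bookkeeping for capping can be written as a one-line update. In this sketch the function name is hypothetical and k stands for the number of edges of the capped face.

```python
def cap_face(v, e, f, k):
    """Return (v, e, f) after capping one face that has k edges."""
    if k < 3:
        raise ValueError("a face has at least three edges")
    # One new vertex, k new edges, and k new triangles replacing the capped face.
    return v + 1, e + k, f + k - 1


tetrahedron = (4, 6, 4)
square_antiprism = (8, 16, 10)

trigonal_bipyramid = cap_face(*tetrahedron, k=3)
capped_square_antiprism = cap_face(*square_antiprism, k=4)
```

The change in v - e + f is 1 - k + (k - 1) = 0, so a capped polyhedron automatically continues to satisfy Euler's relationship.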

A given polyhedron P can be converted into its dual P* by locating the centers of the faces
of P* at the vertices of P and the vertices of P* above the centers of the faces of P. Two
vertices in the dual P* are connected by an edge when the corresponding faces in P share an
edge. An example of the process of dualization is the conversion of a trigonal bipyramid
into a trigonal prism, i.e.:

Trigonal Bipyramid (v = 5, e = 9, f = 6; D3h symmetry) → dualization → Trigonal Prism (v = 6, e = 9, f = 5; D3h symmetry)

The process of dualization has the following properties:

1. The numbers of vertices, edges, and faces in a pair of dual polyhedra P and P* satisfy the
relationships v* = f, e* = e, f* = v, in which the starred variables refer to the dual
polyhedron P*. Thus in the case of the trigonal bipyramid (P)/trigonal prism (P*) dual pair
depicted above v* = f = 6, e* = e = 9, f* = v = 5.
2. Dual polyhedra have the same symmetry elements and thus belong to the same
symmetry point group. Thus in the example above both the trigonal bipyramid and the
trigonal prism have the D3h symmetry point group.
3. Dualization of the dual of the polyhedron leads to the original polyhedron.
4. The degrees of the vertices of a polyhedron correspond to the number of edges in the
corresponding face polygons in its dual.
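The counting properties of dualization listed above can be verified from a list of faces. The following is a minimal sketch under stated assumptions: the function names and the vertex labelling of the trigonal bipyramid (apices 1 and 5, equatorial vertices 2, 3, 4) are chosen here for illustration.

```python
from itertools import combinations


def face_edges(face):
    """Return the edges (as frozensets) around a face given as a cycle of vertices."""
    return {frozenset((face[i], face[(i + 1) % len(face)])) for i in range(len(face))}


def dual_counts(faces):
    """Return (v*, e*, f*, degrees of the dual vertices) for a polyhedron."""
    edge_sets = [face_edges(face) for face in faces]
    # Dual vertices are the faces; two are joined when the faces share an edge.
    dual_edges = [
        (a, b)
        for a, b in combinations(range(len(faces)), 2)
        if edge_sets[a] & edge_sets[b]
    ]
    degrees = [0] * len(faces)
    for a, b in dual_edges:
        degrees[a] += 1
        degrees[b] += 1
    original_vertices = {vertex for face in faces for vertex in face}
    # Each original vertex becomes one face of the dual.
    return len(faces), len(dual_edges), len(original_vertices), degrees


trigonal_bipyramid_faces = [
    (1, 2, 3), (1, 3, 4), (1, 4, 2),
    (5, 2, 3), (5, 3, 4), (5, 4, 2),
]
v_star, e_star, f_star, dual_degrees = dual_counts(trigonal_bipyramid_faces)
```

Every dual vertex has degree 3 because every face of the trigonal bipyramid is a triangle, as expected for the trigonal prism.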
Johnson and collaborators26 have discussed deltahedral growth based on edge
removal followed by face capping (Figure 2). Thus consider a tetrahedron. Breaking any
one edge of the tetrahedron (e.g., edge 12 in Figure 2) leads to a butterfly, which
corresponds to a trigonal bipyramid with one vertex removed. Thus capping this butterfly
with a new vertex, with appropriate distortions of the edges, leads to the trigonal bipyramid.
Similarly breaking an equatorial edge in the trigonal bipyramid (e.g., edge 45 in Figure 2)
with appropriate distortions of the edges leads to a square pyramid. Capping the square
face (i.e., face 1245 in Figure 2) of the square pyramid with a new vertex leads to the
octahedron. Such processes can be continued to generate all of the deltahedra in Figure 1
with up to ten vertices (i.e., the bicapped square antiprism). However, there are
discontinuities in this deltahedral growth sequence for the deltahedra with eleven and
twelve vertices depicted in Figure 1.

Figure 2: Deltahedral growth processes involving edge removal followed by face capping
as applied to the tetrahedron and the trigonal bipyramid. Top row: tetrahedron → (breaking
12-edge) → butterfly → (capping 1234-face) → trigonal bipyramid. Bottom row: trigonal
bipyramid → (breaking 45-edge) → square pyramid → (capping 1245-face) → octahedron.

Polyhedra are often depicted as two-dimensional "perspective" drawings as aids to
help visualize the actual three-dimensional structures. For more complicated polyhedra
these two-dimensional perspective drawings begin to have limitations since their drawing
requires skill and reconstruction of the original three-dimensional picture requires
considerable imagination. These difficulties can be minimized by the use of Schlegel
diagrams1,27 rather than conventional perspective drawings to depict three-dimensional
polyhedra in two dimensions. Schlegel diagrams are well known to mathematicians
studying polyhedra and higher dimensional polytopes but are relatively unfamiliar to
chemists.
In order to obtain a Schlegel diagram of a polyhedron P, select any face of P as the
base face, F0. The plane containing the base face F0 separates three-dimensional space into
two half-spaces, one of which contains the entire volume of P. Select a point x0 in the
other half-space. Draw a straight line from x0 to each of the vertices of P. Each such line
will intersect the plane of F0 at a point representing the corresponding vertex. Connect a
pair of vertex projections onto the plane of F0 with a straight line if and only if the
corresponding vertices of P have an edge between them. This process leads to a projection
of the three-dimensional polyhedron P onto the plane of the face F0; this projection is
called the Schlegel diagram of the polyhedron P.
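The projection step of this construction can be sketched numerically. The sketch assumes, for simplicity, that the polyhedron has been oriented so that the base face lies in the plane z = 0 with the rest of the polyhedron at z > 0; the function name, the vertex labels and the coordinates of the square pyramid are hypothetical.

```python
def schlegel_projection(vertices, x0):
    """Project vertices onto the plane z = 0 along straight lines drawn from x0."""
    x0x, x0y, x0z = x0
    if x0z >= 0:
        raise ValueError("x0 must lie in the half-space below the base plane (z < 0)")
    projected = {}
    for label, (x, y, z) in vertices.items():
        # Parameter at which the line from x0 to the vertex crosses z = 0.
        t = -x0z / (z - x0z)
        projected[label] = (x0x + t * (x - x0x), x0y + t * (y - x0y))
    return projected


# Square pyramid with its square face in the plane z = 0 (illustrative coordinates).
square_pyramid = {
    "apex": (0.0, 0.0, 1.0),
    "a": (1.0, 1.0, 0.0),
    "b": (-1.0, 1.0, 0.0),
    "c": (-1.0, -1.0, 0.0),
    "d": (1.0, -1.0, 0.0),
}
diagram = schlegel_projection(square_pyramid, x0=(0.0, 0.0, -0.5))
```

The four base vertices stay where they are and the apex lands inside the square, so joining the projected points along the edges of the pyramid gives a square with four lines meeting at an interior point.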
Any given polyhedron can have as many different Schlegel diagrams as it has
different faces. The procedure for drawing the Schlegel diagram of the square pyramid
using the square face as the base face F0 is illustrated below.

[Illustration: projection of the square pyramid from the point x0 onto the plane of its square face.]

The following features of Schlegel diagrams are of interest:
(1) The location of the point x0 can always be chosen so that the edges in the Schlegel
diagram can be drawn as non-intersecting straight lines. This is one of the big advantages
of Schlegel diagrams over conventional perspective drawings.
(2) Schlegel diagrams depict the topological but not the metric features of polyhedra.
Thus the vertex neighborhood relationships depicted by edges are preserved. However,
edge lengths and angles are distorted. Since many important chemical relationships are
topological rather than metric, this distortion is not necessarily serious.
(3) Schlegel diagrams may not preserve all symmetry elements of the original
polyhedron because of the metric distortion. The preservation of symmetry elements in
Schlegel diagrams is maximized if a unique face of the polyhedron is selected as the base
face.
The problem of the classification and enumeration of polyhedra is a complicated
one. Thus there appear to be no formulas, direct or recursive, from which the number of
combinatorially distinct polyhedra having a given number of vertices, edges, faces, or any
given combination of these elements can be calculated.28,29 Duijvestijn and Federico have
enumerated by computer the polyhedra having up to 22 edges according to their numbers of
vertices, edges, and faces and their symmetry groups, and present a summary of their
methods, results, and literature references to previous work.30 Their work shows that
there are 1, 2, 7, 34, 257, 2606, and 32,300 topologically distinct polyhedra having 4, 5,
6, 7, 8, 9, and 10 faces or vertices, respectively. Tabulations are available for all 301 (=
1 + 2 + 7 + 34 + 257) topologically distinct polyhedra having eight or fewer faces31 or
eight or fewer vertices.32 These two tabulations are essentially equivalent by the
dualization relationship discussed above.
Polyhedra of greatest significance in coordination chemistry are those that can be
formed by the nine orbitals of the sp 3d 5 valence orbital manifold accessible to transition
metals. There are, however, some polyhedra having fewer than nine vertices which cannot
be formed by these nine orbitals; such polyhedra are called forbidden polyhedra. 33 Group
theoretical arguments show that polyhedra of the following types are always forbidden
polyhedra:
(1) Polyhedra having eight vertices, a direct product symmetry group R × Cs or R × Ci
(R contains only proper rotations) and the plane in Cs fixing either 0 or 6 vertices;
(2) Polyhedra having a six-fold or higher Cn rotation axis.
116 R. B. KING

Chemically significant forbidden polyhedra include the seven-vertex hexagonal pyramid
and the eight-vertex cube, D3d bicapped octahedron, D3h 3,3-bicapped trigonal prism, and
hexagonal bipyramid.

3. Polyhedral Isomerizations

Consider an MLn compound having n ligands or a cluster compound having n
vertices. There are a total of n! permutations of the ligand sites or the cluster vertices.
These permutations form a group of order n! called the symmetric group34,35 and
conventionally designated as Sn or less conventionally as Pn (to avoid confusion with
improper rotations36 also designated as Sn). The symmetric group Sn is the automorphism
group reflecting the symmetry of the complete graph Kn, which consists of n vertices with
an edge between every pair of vertices for a total of n(n − 1)/2 edges.
Now consider the symmetry point group G (or, more precisely, the framework
group37) of the above MLn coordination compound or n-vertex cluster compound. This
group has |G| operations of which |R| are proper rotations so that |G|/|R| = 2 if the
compound is achiral and |G|/|R| = 1 if the compound is chiral (i.e., has no improper
rotations). The n! distinct permutations of the n sites in the coordination compound or
cluster are divided into n!/|R| right cosets38 which represent the permutational isomers
since the permutations corresponding to the |R| proper rotations of a given isomer do not
change the isomer but merely rotate it in space. This leads naturally to the concept of
isomer count, I, namely
I = n!/|R| (8)
if all vertices are distinguishable. Similarly the quotient
E = n!/|G| = I/2 (9)
for a given chiral polyhedron corresponds to the number of enantiomeric pairs. The isomer
count I indicates the complexity of macroscopic models for polyhedral isomerizations such
as topological representations.
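
As a minimal sketch, the isomer count of equation 8 and the enantiomer-pair count of equation 9 can be evaluated directly from the group orders. The function names and the small table of examples are illustrative choices, not part of the original text; the group orders used (|T| = 12, |Td| = 24, |D3| = 6, |D3h| = 12, |O| = 24, |Oh| = 48) are the standard ones.

```python
from math import factorial


def isomer_count(n_sites: int, rotation_order: int) -> int:
    """Number of permutational isomers n!/|R| (equation 8)."""
    return factorial(n_sites) // rotation_order


def enantiomer_pair_count(n_sites: int, group_order: int) -> int:
    """Number of enantiomeric pairs n!/|G| (equation 9)."""
    return factorial(n_sites) // group_order


# name: (n, |R|, |G|); illustrative examples with standard group orders.
POLYHEDRA = {
    "tetrahedron (T, Td)": (4, 12, 24),
    "trigonal bipyramid (D3, D3h)": (5, 6, 12),
    "octahedron (O, Oh)": (6, 24, 48),
}

for name, (n, r, g) in POLYHEDRA.items():
    print(f"{name}: isomers = {isomer_count(n, r)}, "
          f"enantiomer pairs = {enantiomer_pair_count(n, g)}")
```

For the trigonal bipyramid this gives 20 isomers in 10 enantiomeric pairs, and for the octahedron 30 isomers.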

4. Microscopic Models: Diamond-Square-Diamond Processes and Gale Diagrams

Microscopic approaches to polyhedral rearrangements dissect such processes into
elementary steps. The most important elementary step is the diamond-square-diamond
process, which was first recognized in a chemical context by Lipscomb39 in 1966 as a
generalization of a process proposed earlier by Berry40 for rearrangements of five-vertex
trigonal bipyramids. Such a diamond-square-diamond process or "dsd process" in a
polyhedron occurs at two triangular faces sharing an edge and can be depicted as follows:

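[Diagram: P1 → P2 → P3. In P1 the two triangular faces ACB and ADB share the edge AB; in P2 that edge is broken to give the quadrilateral face ACBD; in P3 the new edge CD divides the quadrilateral into the triangular faces CAD and CBD.]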
POLYHEDRAL DYNAMICS 117

In this process a configuration such as P1 can be called a dsd situation and the edge AB can
be called a switching edge. If a, b, c, and d are taken to represent the degrees of the
vertices A, B, C, and D, respectively, in P1, then the dsd type of the switching edge AB
can be represented as ab(cd). In this designation the first two digits refer to the degrees of
the vertices joined by AB and the two digits in parentheses refer to the degrees of the two
vertices not joined by AB but contained in the faces (triangles) having AB as the common
edge (i.e., C and D in P1). The quadrilateral face formed in structure P2 may be called a
pivot face.
In his pioneering paper Lipscomb39 described some possible framework
rearrangements of the polyhedra found in cage boranes and carboranes having from five to
twelve vertex atoms. Fifteen years later41 I reexamined this question in light of advances
in known experimental information on polyhedral chemical systems as well as improved
understanding of polyhedral topology. Subsequently42 I developed a mathematical
approach for examining all possible non-planar rearrangements of polyhedra having few
(i.e., ≤ 6) vertices using a method developed by Gale43 in 1956 for studying
d-dimensional polytopes having only a few more than the minimum d + 1 vertices. This
work42 confirmed the crucial role of dsd processes conjectured so successfully by
Lipscomb39 and also provided insight for more detailed study of isomerizations of
polyhedra having seven44 and eight45 vertices.
Consider a polyhedron having e edges. Such a polyhedron has e distinct dsd
situations, one corresponding to each of the e edges acting as the switching edge.
Application of the dsd process at each of the dsd situations in a given polyhedron leads in
each case to a new polyhedron. In some cases the new polyhedron is identical to the
original polyhedron. In such cases the switching edge can be said to be degenerate. A dsd
process involving a degenerate switching edge represents a pathway for a degenerate
polyhedral isomerization of the polyhedron. A polyhedron having one or more degenerate
edges is inherently fluxional whereas a polyhedron without degenerate edges is inherently
rigid.
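
The edge switch itself can be sketched on a list of triangular faces. The representation (faces as frozensets of vertex labels) and the function names below are assumptions made for illustration. Comparing sorted degree sequences is only a necessary test for recovering the same polyhedron, although for the five-vertex case it happens to be sufficient.

```python
from itertools import combinations


def edges_of(faces):
    """All edges (as frozensets of two vertices) of a list of triangular faces."""
    return {frozenset(pair) for face in faces for pair in combinations(face, 2)}


def degrees(faces):
    """Vertex degrees of the graph formed by the edges of the faces."""
    deg = {}
    for edge in edges_of(faces):
        for vertex in edge:
            deg[vertex] = deg.get(vertex, 0) + 1
    return deg


def dsd_switch(faces, a, b):
    """Break edge ab and form edge cd, where c and d are the third vertices
    of the two triangles sharing ab (the step P1 -> P2 -> P3)."""
    sharing = [face for face in faces if a in face and b in face]
    if len(sharing) != 2:
        raise ValueError("edge must be shared by exactly two triangular faces")
    c, d = (next(v for v in face if v not in (a, b)) for face in sharing)
    if frozenset((c, d)) in edges_of(faces):
        raise ValueError("c and d are already joined, so no dsd process exists")
    kept = [face for face in faces if face not in sharing]
    return kept + [frozenset((a, c, d)), frozenset((b, c, d))]


# Trigonal bipyramid: axial A1, A2; equatorial E1, E2, E3 (hypothetical labels).
tbp = [frozenset((ax, e1, e2))
       for ax in ("A1", "A2")
       for e1, e2 in combinations(("E1", "E2", "E3"), 2)]
rearranged = dsd_switch(tbp, "E1", "E2")

# Tetrahedron (complete graph K4): every pair of vertices is already joined.
tetrahedron = [frozenset(face) for face in combinations(range(4), 3)]
```

Switching the equatorial edge E1E2 of the trigonal bipyramid returns a polyhedron with the same degree sequence in which E1 and E2 have become the non-adjacent (axial) pair, whereas `dsd_switch(tetrahedron, 0, 1)` raises an error because the tetrahedron has no pair of unconnected vertices.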
The dsd type ab(cd) of a degenerate edge can be seen by application of the process
P1 → P2 → P3 to satisfy the following conditions:
c = a − 1 and d = b − 1 or c = b − 1 and d = a − 1 (10)
Using these conditions the chemically significant deltahedra depicted in Figure 1 can be
very easily checked for the presence of one or more degenerate edges with the following
results:
(1) Tetrahedron. No dsd process of any kind is possible since the tetrahedron is the
complete graph K4. A tetrahedron is therefore inherently rigid.
(2) Trigonal bipyramid. The three edges connecting pairs of equatorial vertices are
degenerate edges of the type 44(33). A dsd process using one of these degenerate edges as
the switching edge and involving a square pyramid intermediate corresponds to the Berry
pseudorotation40,46 which is believed to be the mechanism responsible for the
stereochemical nonrigidity of trigonal bipyramidal complexes, even at relatively low
temperatures.47 The single dsd process for the trigonal bipyramid may be depicted as
follows:
Trigonal bipyramid → Square pyramid → Trigonal bipyramid

Note that the trigonal bipyramid rotates through 90° upon rearrangement through the square
pyramid intermediate as a result of the C4 axis in the square pyramid. This is why this
process has been called a pseudorotation.
(3) Octahedron. The highly symmetrical octahedron has no degenerate edges and is
therefore inherently rigid.
(4) Pentagonal bipyramid. The pentagonal bipyramid has no degenerate edges and
thus by definition is inherently rigid. However, a dsd process using a 45(44) edge of the
pentagonal bipyramid (namely an edge connecting an equatorial vertex with an axial vertex)
gives a capped octahedron. The capped octahedron is a low energy polyhedron for ML7
coordination complexes48 but a forbidden polyhedron for boranes and carboranes because
of its tetrahedral chamber.49
(5) Bisdisphenoid. The eight-vertex bisdisphenoid has four pairwise degenerate
edges, which are those of the type 55(44) located in the subtetrahedron consisting of the
degree 5 vertices of the bisdisphenoid (Figure 1). Thus two successive or more likely
concerted (parallel) dsd processes involving opposite 55(44) edges (i.e., a pair related by a
C2 symmetry operation) convert one bisdisphenoid into another bisdisphenoid through a
square antiprismatic intermediate. Thus a bisdisphenoid, like the trigonal bipyramid
discussed above, is inherently fluxional.
(6) 4,4,4-Tricapped Trigonal Prism. The three edges of the type 55(44) corresponding
to the "vertical" edges of the trigonal prism are degenerate. A dsd process using
one of these degenerate edges as the switching edge involves a C4v 4-capped square
antiprism intermediate. Nine-vertex systems are therefore inherently fluxional.
(7) 4,4-Bicapped Square Antiprism. This polyhedron has no degenerate edges
and therefore is inherently rigid.
(8) Edge-coalesced Icosahedron. The four edges of the type 56(45) are
degenerate. This eleven-vertex deltahedron is therefore inherently fluxional.
(9) Icosahedron. This highly symmetrical polyhedron, like the octahedron, has no
degenerate edges and is therefore inherently rigid.
This simple analysis indicates that in deltahedral structures the 4, 6, 10, and 12
vertex structures are inherently rigid; the 5, 8, 9, and 11 vertex structures are inherently
fluxional; and the rigidity of the seven-vertex structure depends upon the energy difference
between the two most symmetrical seven-vertex deltahedra, namely the pentagonal
bipyramid and the capped octahedron. This can be compared with experimental
fluxionality observations by boron-11 nuclear magnetic resonance on the deltahedral borane
anions BₙHₙ²⁻ (6 ≤ n ≤ 12)50 where the 6, 7, 9, 10, and 12 vertex structures are found to
be rigid and the 8 and 11 vertex structures are found to be fluxional. The only discrepancy
between experiment and these very simple topological criteria for fluxionality arises in the
nine vertex structure B₉H₉²⁻.
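
The degenerate-edge test of equation 10 lends itself to a short computational check. The sketch below assumes a deltahedron supplied as a list of triangular faces; the function names and vertex labels are illustrative and do not come from the original text.

```python
from itertools import combinations


def dsd_types(faces):
    """Return {edge: (a, b, c, d)} for every edge of a deltahedron given as
    a list of triangular faces (tuples of vertex labels)."""
    edges = {frozenset(pair) for face in faces for pair in combinations(face, 2)}
    degree = {}
    for edge in edges:
        for vertex in edge:
            degree[vertex] = degree.get(vertex, 0) + 1
    types = {}
    for edge in edges:
        # C and D are the vertices opposite the edge in its two triangles.
        opposite = [v for face in faces if edge <= set(face)
                    for v in face if v not in edge]
        vertex_a, vertex_b = sorted(edge)
        types[edge] = (degree[vertex_a], degree[vertex_b],
                       degree[opposite[0]], degree[opposite[1]])
    return types


def is_degenerate(a, b, c, d):
    """Condition (10): the pair {c, d} equals the pair {a - 1, b - 1}."""
    return sorted((c, d)) == sorted((a - 1, b - 1))


def degenerate_edges(faces):
    """Edges whose dsd type satisfies condition (10)."""
    return [edge for edge, t in dsd_types(faces).items() if is_degenerate(*t)]


# Trigonal bipyramid: axial A1, A2; equatorial E1, E2, E3.
tbp = [(ax, e1, e2) for ax in ("A1", "A2")
       for e1, e2 in combinations(("E1", "E2", "E3"), 2)]
# Octahedron: apexes 0 and 5 over the equatorial cycle 1-2-3-4.
octahedron = [(apex, i, j) for apex in (0, 5)
              for i, j in ((1, 2), (2, 3), (3, 4), (4, 1))]
```

Here `degenerate_edges(tbp)` returns the three equatorial-equatorial edges of type 44(33), while `degenerate_edges(octahedron)` is empty, in line with items (2) and (3) of the list above.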
The discrepancy between the predictions of this simple topological approach and
experiment for B₉H₉²⁻ has led to the search for more detailed criteria for the rigidity of the
deltahedral boranes. In this connection Gimarc and Ott have studied orbital symmetry
methods particularly for the five,51 seven,52 and nine53 vertex borane and carborane
structures. A topologically feasible dsd process is orbitally forbidden if crossing of
occupied and vacant molecular orbitals (i.e., a "HOMO-LUMO crossing") occurs during
the dsd process as illustrated by the following diagram for the single dsd process for the
trigonal bipyramid:54

[Diagram: orbital correlation for Trigonal bipyramid → Square pyramid → Trigonal bipyramid, showing a HOMO-LUMO crossing. The frontier levels are labeled a1,b2 (LUMO) and a2,b1 (HOMO) on the left and a1,b1 and a2,b2 on the right.]

For such an orbitally forbidden process, which occurs in the five- and nine-vertex
deltahedral boranes and carboranes, the activation barrier separating initial and final
structures is likely to be large enough to prevent this polyhedral isomerization. However,
the dsd rearrangement that is forbidden for the five-vertex B₅H₅²⁻ and the corresponding
carboranes is allowed and has been observed for PX5 derivatives such as PCl5 and PF5
(e.g., the single fluorine-19 resonance in PF5). Guggenberger and Muetterties55 point out
that cage framework rearrangements such as those in the deltahedral boranes and
carboranes involve bond stretches which must require more energy than bond angle
changes that occur in coordination polyhedra of ligands bound to a central atom.
Some selection rules have been proposed for distinguishing between symmetry-allowed
and symmetry-forbidden processes in deltahedral boranes, carboranes, and related
structures. Thus Wales and Stone56 distinguish between symmetry-allowed and
symmetry-forbidden processes by observing that a HOMO-LUMO crossing occurs if the
proposed transition state has a single atom lying on a principal Cn rotational axis where
n ≥ 3. A more detailed selection rule was observed by Mingos and Johnston.57 If the four
outer edges of the two fused triangular faces (i.e., the "diamond") are symmetry
equivalent, then a single dsd process results in a pseudorotation of the initial polyhedron by
90° as follows:

C2v → C4v → C2v

However, if the edges are not symmetry equivalent then the rearrangement results in a
pseudoreflection of the initial polyhedron which can be indicated as follows:

C2 → C2v → C2

Pseudorotations are symmetry-forbidden and have larger activation energies than
pseudoreflections, which are symmetry allowed.
Gale diagrams provide an elegant method for the study of microscopic aspects of
rearrangements of polyhedra with relatively few vertices (i.e., for v ≤ 6) by reducing the
dimensionality of allowed vertex motions. In a chemical context Gale diagrams can be
used to study possible rearrangements of six-atom structures by depicting skeletal
rearrangements of six atoms as movements of six points on the circumference of a circle or
from the circumference to the center of the circle subject to severe restrictions that reduce
possible such movements to a manageable number.58
Consider a polytope P in d-dimensional space $\mathbb{R}^d$. The minimum number of
vertices of such a polyhedron is d + 1 and there is only one such polyhedron, namely the
d-simplex.1 The combinatorially distinct possibilities for polytopes having only d + 2 and
d + 3 vertices (polyhedra with "few" vertices) are also rather limited and through a Gale
transformation59 can be represented faithfully in a space of less than d dimensions. More
specifically, if P is a d-dimensional polytope with v vertices, a Gale transformation leads to
a Gale diagram of P consisting of v points in (v − d − 1)-dimensional space $\mathbb{R}^{v-d-1}$ in one-to-one
correspondence with the vertices of P. From the Gale diagram it is possible to
determine all of the combinatorial properties of P such as the subsets of the vertices of P
that define faces of P, the combinatorial types of these faces, etc. Of particular significance
in the present context is the fact that the combinatorial properties of a polytope P which can
be determined by the Gale diagram include all possible isomerizations (rearrangements) of
P to other polytopes having the same number of vertices and imbedded in the same number
of dimensions as P. Also of particular importance is the fact that, if v is not much larger
than d (i.e., if v ≤ 2d), then the dimension of the Gale diagram is smaller than that of the
original polytope P.
Now consider polyhedra in the ordinary three-dimensional space of interest in
chemical structures (i.e., d = 3). Gale diagrams of five- and six-vertex polyhedra can be
imbedded into one- or two-dimensional space, respectively, thereby simplifying analysis of
their possible vertex motions leading to non-planar polyhedral isomerizations of these
polyhedra of possible interest in a chemical context.
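
The dimension count behind this statement is simple enough to tabulate. The following sketch (the function name is an illustrative choice) evaluates v − d − 1 for three-dimensional polyhedra and shows that the reduction in dimension is lost once v exceeds 2d = 6.

```python
def gale_dimension(v: int, d: int = 3) -> int:
    """Dimension v - d - 1 of the Gale diagram of a d-polytope with v vertices."""
    if v < d + 1:
        raise ValueError("a d-dimensional polytope needs at least d + 1 vertices")
    return v - d - 1


for v in range(4, 9):
    dim = gale_dimension(v)
    status = "lower than 3" if dim < 3 else "no reduction"
    print(f"v = {v}: Gale diagram dimension {dim} ({status})")
```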
In order to obtain a Gale diagram for a given polyhedron, the polyhedron is first
subjected to a Gale transformation. Consider a polyhedron with v vertices as a set of v
points $x_1, \ldots, x_v$ in three-dimensional space $\mathbb{R}^3$. These points may be regarded as
three-dimensional vectors $x_n = (x_{n,1}, x_{n,2}, x_{n,3})$, $1 \le n \le v$, from the origin to the
vertices of the polyhedron. In addition, consider a set of points $\mathcal{D}(A)$ in v-dimensional
space $\mathbb{R}^v$, $A = (a_1, \ldots, a_v)$, such that the following sums vanish:

\[ \sum_{i=1}^{v} a_i x_{i,k} = 0 \quad \text{for } 1 \le k \le 3 \tag{11a} \]

\[ \sum_{i=1}^{v} a_i = 0 \tag{11b} \]

Equation 11a may also be viewed as three orthogonality relationships between the
v-dimensional vector $A = (a_1, \ldots, a_v)$ and the three v-dimensional vectors
$(x_{1,k}, x_{2,k}, \ldots, x_{v,k})$, $1 \le k \le 3$. Now consider the locations of the vertices of the
polyhedron as the following v × 4 matrix:

\[ D_0 = \begin{pmatrix} x_{1,1} & x_{1,2} & x_{1,3} & 1 \\ x_{2,1} & x_{2,2} & x_{2,3} & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_{v,1} & x_{v,2} & x_{v,3} & 1 \end{pmatrix} \tag{12} \]

Consider the columns of $D_0$ as vectors in $\mathbb{R}^v$. Since $D_0$ has rank 4, the four columns of
$D_0$ are linearly independent. Hence the subspace $\mathcal{M}(X)$ of $\mathbb{R}^v$ represented by these four
linearly independent columns has dimension 4. Its orthogonal complement
$\mathcal{M}(X)^{\perp} = \{A \in \mathbb{R}^v \mid A \cdot X = 0 \text{ for all } X \in \mathcal{M}(X)\}$ coincides with $\mathcal{D}(A)$ defined above
by equations 11a and 11b. Therefore:

\[ \dim \mathcal{D}(A) = \dim \mathcal{M}(X)^{\perp} = v - \dim \mathcal{M}(X) = v - 4 \tag{13} \]
Now define the following v × (v − 4) matrix, whose columns form a basis of $\mathcal{D}(A)$:

\[ D_1 = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,v-4} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,v-4} \\ \vdots & \vdots & & \vdots \\ a_{v,1} & a_{v,2} & \cdots & a_{v,v-4} \end{pmatrix} \tag{14} \]

The v rows of $D_1$ may be considered as vectors in (v − 4)-dimensional space; conventionally
the jth row is denoted by $\bar{x}_j = (a_{j,1}, a_{j,2}, \ldots, a_{j,v-4})$ for j = 1, ..., v.
The final result of this construction is the assignment of a point $\bar{x}_j$ in (v − 4)-dimensional
space ($\mathbb{R}^{v-4}$) to each vertex $x_j$ of the polyhedron. The collection of v points
$\bar{x}_1, \ldots, \bar{x}_v$ in $\mathbb{R}^{v-4}$ is called a Gale transform of the set of vertices $x_1, \ldots, x_v$ of the
polyhedron in question. The following features of a Gale transform of a polyhedron should be noted:
(1) Gale transforms $\bar{x}_j$ and $\bar{x}_k$ of two or more vertices of a polyhedron may lead to the
same point (i.e., the same v − 4 coordinates) in (v − 4)-dimensional space ($\mathbb{R}^{v-4}$). In other
words some points of a Gale transform may have a multiplicity greater than one so that the
Gale transform of a polyhedron in such cases contains fewer distinct points than the
polyhedron has vertices.
(2) The Gale transform depends upon the location of the origin in the coordinate
system. Therefore, infinitely many Gale transforms are possible for a given polyhedron.
Geometrically a Gale transform of a polyhedron corresponds to a projection of the v
vertices of a (v − 1)-dimensional simplex (i.e., the higher dimension "analogue" of the
tetrahedron in three dimensions) into a (v − 4)-dimensional hyperplane.60 Since infinitely
many such projections are possible, the Gale transform for a given polyhedron is not
unique.
In practice, it is easier to work with Gale diagrams corresponding to Gale
transforms of interest. Consider a Gale transform $\bar{x}_1, \ldots, \bar{x}_v$ of a (three-dimensional)
polyhedron having v vertices $x_1, \ldots, x_v$ as defined above. The corresponding Gale diagram
$\hat{x}_1, \ldots, \hat{x}_v$ is defined by the following relationships:

\[ \hat{x}_j = 0 \quad \text{if } \bar{x}_j = 0 \tag{15a} \]

\[ \hat{x}_j = \frac{\bar{x}_j}{\lVert \bar{x}_j \rVert} \quad \text{if } \bar{x}_j \ne 0 \tag{15b} \]

In equation 15b $\lVert \bar{x}_j \rVert$ is the length (i.e., $\sqrt{a_{j,1}^2 + a_{j,2}^2 + \cdots + a_{j,v-4}^2}$) of the vector $\bar{x}_j$.
If v − 4 = 1 (i.e., v = 5), Gale diagrams can only contain the points 0, 1, and −1 of the
straight line with varying multiplicities $m_0$, $m_1$, and $m_{-1}$, respectively, where $m_0 \ge 0$,
$m_1 \ge 2$, and $m_{-1} \ge 2$. If v − 4 = 2 (i.e., v = 6) Gale diagrams can only contain the center and
circumference of the unit circle. These two types of Gale diagrams (Figure 3) are of interest
for the study of polyhedral isomerizations since they represent significant
structural simplifications of the corresponding polyhedra.
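
The construction of equations 11 to 15 can be traced numerically for the two five-vertex polyhedra. The sketch below is one possible implementation under stated assumptions: it uses exact rational arithmetic from the standard library, the helper names are hypothetical, and the vertex coordinates are convenient rational choices that give combinatorially correct (not metrically regular) polyhedra. Because a Gale transform is not unique, only the pattern of signs and multiplicities is meaningful.

```python
from fractions import Fraction


def null_space(rows):
    """Exact basis of {a : M a = 0} by Gauss-Jordan elimination (M given by rows)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        if r == len(m):
            break
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivot_cols.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivot_cols):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row, p in zip(m, pivot_cols):
            vec[p] = -row[free]
        basis.append(vec)
    return basis


def gale_transform(points):
    """Rows of D1: one (v - 4)-vector per vertex (equations 11 to 14)."""
    # The constraints are the columns of D0: three coordinate vectors and all ones.
    constraints = [[p[k] for p in points] for k in range(3)]
    constraints.append([1] * len(points))
    basis = null_space(constraints)  # these vectors are the columns of D1
    return [tuple(col[j] for col in basis) for j in range(len(points))]


def gale_diagram_1d(points):
    """Gale diagram of a five-vertex polyhedron: the sign of each 1-D transform."""
    transform = gale_transform(points)
    if any(len(x) != 1 for x in transform):
        raise ValueError("expected five vertices spanning three dimensions")
    return [(x[0] > 0) - (x[0] < 0) for x in transform]


# Assumed coordinates: three equatorial vertices, then two axial vertices.
tbp_points = [(1, 0, 0), (0, 1, 0), (-1, -1, 0), (0, 0, 1), (0, 0, -1)]
# Assumed coordinates: four basal vertices in cyclic order, then the apex.
sq_points = [(1, 1, 0), (1, -1, 0), (-1, -1, 0), (-1, 1, 0), (0, 0, 1)]

tbp_diagram = gale_diagram_1d(tbp_points)
sq_diagram = gale_diagram_1d(sq_points)
```

With these inputs the three equatorial vertices of the trigonal bipyramid coincide at one end of the line and the two axial vertices at the other, while the apex of the square pyramid falls on the central point and the two pairs of diagonally opposite basal vertices occupy the two ends.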
The following properties of Gale diagrams corresponding to three-dimensional
polyhedra are of interest since they impose important restrictions on configurations of
points which can be Gale diagrams:
(1) Any (v-5)-dimensional plane passing through the central point of the Gale diagram
bisects the space of the Gale diagram into two halfspaces. Each such halfspace must
contain at least two vertices (or one vertex of multiplicity 2) of the Gale diagram not
including any vertices actually in the bisecting plane or hyperplane. Such a halfspace is
called an open halfspace. Violation of this condition corresponds to a polyhedron with the
impossible property of at least one pair of vertices not connected by an edge which is
closer in three-dimensional space than another pair of vertices which is connected by an
edge.
(2) The set of vertices of a polyhedron not forming a given face or edge of the
polyhedron is called a coface of the polyhedron. The regular octahedron is unusual since
all of its faces are also cofaces corresponding to other faces. The interior of a figure
formed by connecting the vertices of a Gale diagram corresponding to a coface must
contain the central point.
(3) The central point is a vertex of a Gale diagram if and only if the corresponding
polyhedron is a pyramid. The central vertex of such a Gale diagram corresponds to the
apex of a pyramid which is the coface corresponding to the base of the pyramid.
[Figure 3 panels. Five vertices: square pyramid (2--1--2) and trigonal bipyramid (2--3). Six vertices: trigonal prism; pentagonal pyramid; C2 six-vertex polyhedron with two quadrilateral faces; polyhedron with 11 edges, six triangular faces, and one quadrilateral face; octahedron; bicapped tetrahedron.]

Figure 3: Standard Gale diagrams for all polyhedra having five and six vertices. Balanced
diameters are indicated by bold lines.
Nonplanar isomerizations of five- and six-vertex polyhedra correspond to allowed
vertex motions in the corresponding Gale diagrams in Figure 3. In this context an allowed
vertex motion of a Gale diagram is the motion of one or more vertices which converts the
Gale diagram of a polyhedron into that of another polyhedron with the same number of
vertices without ever passing through an impossible Gale diagram such as one with an
open halfspace containing only one vertex of unit multiplicity. Since two polyhedra are
combinatorially equivalent if and only if their Gale diagrams are isomorphic, such allowed
vertex motions of Gale diagrams are faithful representations of all possible non-planar
polyhedral isomerizations.
The application of Gale diagrams to the study of isomerizations of five-vertex
polyhedra is nearly trivial but provides a useful illustration of this method. The only
possible five-vertex polyhedra are the square pyramid and trigonal bipyramid. Their Gale
diagrams (Figure 3) are the only two possible one-dimensional five-vertex Gale diagrams
which have the required two vertices in each open halfspace (i.e., $m_1 \ge 2$ and
$m_{-1} \ge 2$). The only allowed vertex motion in a Gale diagram of a trigonal bipyramid
involves motion of one point from the vertex of multiplicity 3 through the center point to
the vertex originally of multiplicity 2 as follows:

abc---de → ab---c---de → ab---cde

Trigonal bipyramid → Square pyramid → Trigonal bipyramid

This process interchanges the vertices of multiplicities 2 and 3 and leads to an equivalent
Gale diagram corresponding to an isomeric trigonal bipyramid. The motion through the
center point of the Gale diagram corresponds to the generation of a square pyramid
intermediate in the non-planar degenerate isomerization of a trigonal bipyramid. This, of
course, is the Berry pseudorotation process40,46 which is the prototypical dsd process.
The choice of three points to move away from the vertex of multiplicity 3 in the Gale
diagram of a trigonal bipyramid corresponds to the presence of three degenerate edges in a
trigonal bipyramid. This analysis of the Gale diagrams of the two possible five-vertex
polyhedra shows clearly that the only possible nonplanar isomerizations of five-vertex
polyhedra can be represented as successive dsd processes corresponding to successive
Berry pseudorotations.
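
The five-vertex argument can be replayed as a small enumeration. The sketch below encodes a one-dimensional Gale diagram by its multiplicities at −1, 0, and +1 and applies the rule that each open half-line must hold at least two vertices; the function names and the vertex labels a to e are illustrative assumptions.

```python
def multiplicities(positions):
    """(m_-1, m_0, m_1) for a dict mapping vertex label -> -1, 0, or +1."""
    values = list(positions.values())
    return (values.count(-1), values.count(0), values.count(1))


def is_allowed(m_minus, m_zero, m_plus):
    """Half-space rule on a line: each open half-line holds at least two vertices."""
    return m_minus >= 2 and m_plus >= 2


# All five-vertex candidates (m_-1, m_0, m_1) that satisfy the rule.
allowed_diagrams = [(m, z, 5 - m - z)
                    for m in range(6) for z in range(6 - m)
                    if is_allowed(m, z, 5 - m - z)]

# Berry pseudorotation: vertex c moves from the triple point through the center.
start = {"a": -1, "b": -1, "c": -1, "d": 1, "e": 1}
path = [dict(start, c=position) for position in (-1, 0, 1)]
path_ok = all(is_allowed(*multiplicities(stage)) for stage in path)

# Moving d away from the double point instead leaves only e in that half-line.
blocked = dict(start, d=0)
```

The enumeration yields only (2, 0, 3), (2, 1, 2), and (3, 0, 2), that is, the trigonal bipyramid (in its two mirror-related drawings) and the square pyramid, and the only motion that never violates the rule is the one starting from the vertex of multiplicity 3.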
The Gale diagrams of six-vertex polyhedra (Figure 3) can be visualized most clearly
if all of the diameters containing vertices are drawn. Some Gale diagrams of six vertex
polyhedra have diameters with vertices of unit multiplicity at each end. Such diameters
may be called balanced diameters and are indicated by bold lines in Figure 3. The two
vertices of a balanced diameter in the Gale diagram of a six-vertex polyhedron form an edge
which is a coface corresponding to a quadrilateral face. Gale diagrams drawn to maximize
the multiplicities of the vertices and the numbers of balanced diameters consistent with the
polyhedral topology are called standard Gale diagrams. The Gale diagrams depicted in
Figure 3 are the standard Gale diagrams for the six-vertex polyhedra in question. The
number of balanced diameters in a standard Gale diagram of a six-vertex polyhedron is
equal to the number of quadrilateral faces of the polyhedron. The pentagonal pyramid is
the only six-vertex polyhedron for which the center of the circle is a vertex of the
corresponding standard Gale diagram.
The standard Gale diagrams of the trigonal prism and octahedron illustrate another
interesting feature of Gale diagrams, namely the ability to draw Gale diagrams so that all
symmetry elements of the corresponding polyhedron are preserved. The C3 symmetry
elements of both the trigonal prism and octahedron are readily apparent in their standard
Gale diagrams as axes passing through the center perpendicular to the plane of the circle (Figure 3).
In the case of the trigonal prism, the three C2 axes of its D3h point group correspond to the
three balanced diameters of the corresponding standard Gale diagram. In the case of the
octahedron, which has the Oh point group, the reflection planes σh correspond to
permuting the two vertices of an octahedron forming a vertex of multiplicity two in the
corresponding standard Gale diagram while keeping the other vertices fixed. The C2 and
C4 rotation axes of the octahedron pass through the center and a vertex of multiplicity two
in the corresponding standard Gale diagram and permute the other four vertices forming the
two other standard Gale diagram vertices of multiplicity two in various ways.
Polyhedral isomerizations in six-vertex polyhedra may be described by allowed
motions of the vertices of their Gale diagrams along the circumference of the unit circle or
through the circle center in the case of polyhedral isomerizations involving a pentagonal
pyramid intermediate. However, vertex motions are not allowed if at any time they
generate one or more forbidden diameters containing three or more vertices. Using these
techniques all non-planar degenerate isomerizations of six-vertex polyhedra can be
decomposed into sequences of eight fundamental processes, namely two processes through
pentagonal pyramid intermediates, five processes which are variations of single
diamond-square-diamond processes, and the triple dsd degenerate isomerization of an octahedron
through a trigonal prism intermediate on which the Bailar61 and Ray and Dutt62 twists of
M(bidentate)3 complexes are based. The Gale diagram for the last process can be depicted
as follows:

Octahedron → Trigonal prism → Octahedron


Note that in the first (diamond-square) stage of this triple dsd process leading from the
octahedron to the trigonal prism, one vertex from each of the three vertex pairs (i.e., ad,
be, and cf) in the Gale diagram of the octahedron must move in the same direction in a
concerted manner preserving the C3 axis in order to avoid violating the "half-space rule."
The standard Gale diagram of the trigonal prism is reached when three balanced diameters
are formed. Similarly, in the second (square-diamond) stage leading from the trigonal
prism to an isomeric octahedron these three vertices continue to move in a concerted
manner so as to preserve the C3 axis.

5. Macroscopic Models: Topological Representations

Macroscopic models depict the relationship between different permutational
isomers. Such models can make use of topological representations, which are reaction
graphs63 describing the relationships between the different permutational isomers of a
given polyhedron. In such a reaction graph the vertices correspond to isomers and the
edges correspond to isomerization steps. The number of vertices corresponds to the isomer
count I = n!/|R| (see equation 8). The degree of a vertex corresponds to the number of new
permutational isomers generated from the isomer represented by the vertex in a single step;
this is called the connectivity, δ, of the vertex. Topological representations can be
conveniently classified by the number of vertices in the polyhedra participating in the
rearrangements.

5.1 FOUR-VERTEX POLYHEDRA

The only combinatorially distinct four-vertex polyhedron is the regular tetrahedron
(Figure 1) so that non-planar isomerizations of tetrahedra are not possible. However, a
tetrahedron can be converted to its mirror image (enantiomer) through a square planar
intermediate. The isomer count for the tetrahedron, Itet, is 4!/|T| = 24/12 = 2 and the
isomer count for the square, Isq, is 4!/|D4| = 24/8 = 3. A topological representation of this
process is a K2,3 bipartite graph, which is derived from the trigonal bipyramid by deletion
of the three equatorial-equatorial edges as depicted below:

[Diagram: the K2,3 graph drawn as a trigonal bipyramid lacking its three equatorial-equatorial edges.]
The two axial vertices (labeled Td) correspond to the two tetrahedral isomers and the three
equatorial vertices (labeled D4h) correspond to the three square planar isomers. The
connectivities of the tetrahedral (δtet) and square planar (δsq) isomers are 3 and 2,
respectively, in accord with the degrees of the corresponding vertices of the K2,3 graph.
Thus Itet δtet = Isq δsq = 6; this is an example of the closure condition Ia δa = Ib δb required
for a topological representation with vertices representing more than one type of
polyhedron.
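
The closure condition for this smallest case can be confirmed by building the K2,3 graph explicitly. In the sketch below the vertex labels are hypothetical placeholders for the two tetrahedral and three square planar isomers; the isomer counts are taken from the text.

```python
from math import factorial

tetrahedra = [f"Td-{i}" for i in range(factorial(4) // 12)]  # 4!/|T| = 2
squares = [f"D4h-{i}" for i in range(factorial(4) // 8)]     # 4!/|D4| = 3

# K2,3: every tetrahedral isomer is linked to every square planar isomer.
edges = [(t, s) for t in tetrahedra for s in squares]


def connectivity(vertex):
    """Degree of a vertex in the reaction graph."""
    return sum(vertex in edge for edge in edges)


closure_tet = len(tetrahedra) * connectivity(tetrahedra[0])
closure_sq = len(squares) * connectivity(squares[0])
```

Both products equal the number of edges of the graph, which is the reason the closure condition must hold for any bipartite topological representation.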

Figure 4: (a) Top: the 20-vertex Desargues-Levy graph as a topological representation of
the dsd isomerizations of the 20 trigonal bipyramid isomers; (b) Bottom: the ten-vertex
Petersen's graph as a topological representation of the dsd isomerizations of the 10 trigonal
bipyramid enantiomer pairs.
5.2 FIVE-VERTEX POLYHEDRA

The two combinatorially distinct five-vertex polyhedra are the trigonal bipyramid
and the square pyramid. The conversion of a trigonal bipyramid into an isomeric trigonal
bipyramid through a dsd process involving a square pyramid intermediate has been
discussed above. Some interesting graphs (Figure 4) are found in the topological
representations for this process. The trigonal bipyramid has an isomer count I = 5!/|D3| =
120/6 = 20 corresponding to 10 enantiomeric pairs. A given trigonal bipyramid isomer can
be described by the labels of its two axial positions (i.e., the single pair of vertices not
connected by an edge) with a bar used to distinguish enantiomers. In a single degenerate
dsd isomerization of a trigonal bipyramid through a square pyramid intermediate, both axial
vertices of the original trigonal bipyramid become equatorial vertices in the new trigonal
bipyramid leading to a connectivity of three for dsd isomerizations of trigonal bipyramids.
The corresponding topological representation thus is a 20 vertex graph in which each vertex
has degree 3. However, additional properties of dsd isomerizations of trigonal bipyramids
exclude the regular (Ih) dodecahedron as a topological representation unless double group
form is used to produce pseudohexagonal faces. A graph suitable for the topological
representation of dsd isomerizations of trigonal bipyramids is the Desargues-Levy graph,
depicted in Figure 4 (top).
Less complicated but still useful topological representations can be obtained by
using each vertex of the graph to represent a set of isomers provided that each vertex
represents sets of the same size and interrelationship and each isomer is included in exactly
one set. A simple example is the use of Petersen's graph (Figure 4 bottom) as a
topological representation of isomerizations of the 10 trigonal bipyramid enantiomer pairs
(E = 5!/|D3h| = 120/12 = 10) by dsd processes. The use of Petersen's graph for this
purpose relates to its being the odd graph O3; an odd graph Ok is defined as follows:64 its
vertices correspond to subsets of cardinality k − 1 of a set S of cardinality 2k − 1 and two
vertices are adjacent if and only if the corresponding subsets are disjoint.
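
The odd-graph definition translates directly into code. In this sketch, which uses illustrative names, each vertex of O3 is a two-element subset of five ligand labels and can be read as the axial pair of one trigonal bipyramid enantiomer pair; a dsd step sends both axial ligands to equatorial sites, so the new axial pair is disjoint from the old one.

```python
from itertools import combinations


def odd_graph(k):
    """O_k: vertices are (k-1)-subsets of a (2k-1)-set; edges join disjoint subsets."""
    vertices = [frozenset(s) for s in combinations(range(1, 2 * k), k - 1)]
    edges = [(u, w) for u, w in combinations(vertices, 2) if not (u & w)]
    return vertices, edges


vertices, edges = odd_graph(3)
degree = {v: sum(v in e for e in edges) for v in vertices}

# Three mutually disjoint pairs would need six labels, so O3 has no triangles.
triangles = [t for t in combinations(vertices, 3)
             if all(not (u & w) for u, w in combinations(t, 2))]
```

The result has 10 vertices, each of degree 3, and 15 edges, matching the connectivity of three found for dsd isomerizations of trigonal bipyramids.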

5.3 SIX-VERTEX POLYHEDRA

In six-vertex systems the process of interest is the degenerate triple dsd
isomerization of the octahedron through a trigonal prismatic intermediate, which is the
underlying topology of both the Bailar61 and Ray and Dutt62 twists for octahedral
M(bidentate)3 chelates. The isomer counts are Ioct = 6!/|O| = 720/24 = 30 for the
octahedron and Itp = 6!/|D3| = 720/6 = 120 for the trigonal prism. A pentagonal (Ih)
dodecahedron in double group form can serve for the topological representation for this
process. A face of such a dodecahedron can be depicted as follows:

[Diagram: one pentagonal face, with a triangle marking the midpoint of each of its five edges, ten line segments joining these midpoints in pairs, and a diamond marking the midpoint of each segment.]
The midpoints of the 30 edges of the dodecahedron (designated by triangles) are the 30
octahedron isomers. Line segments across a pentagonal face connecting these edge
midpoints correspond to triple dsd isomerization processes; the midpoints of these lines
(designated by diamonds) correspond to the 120 trigonal prismatic isomers with 10
such isomers being located in each of the 12 faces of the pentagonal dodecahedron. The
ten lines on a face representing isomerization processes form a K5 graph. This system is
closed since the connectivities of the octahedron (δoct) and trigonal prism (δtp) are 8 and 2,
respectively, leading to the closure relationship Ioct δoct = Itp δtp = 240.
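
The bookkeeping for this dodecahedral representation can be cross-checked with a few lines of arithmetic. The constant and variable names are illustrative; the counts of 12 pentagonal faces and 30 edges are those of the regular dodecahedron.

```python
from math import comb, factorial

FACES, EDGES, EDGES_PER_FACE = 12, 30, 5  # regular dodecahedron

octahedra = factorial(6) // 24  # 6!/|O| = 30, one isomer per dodecahedron edge
prisms = factorial(6) // 6      # 6!/|D3| = 120

# Each face carries a K5 on its five edge midpoints; every K5 edge is one prism.
prisms_from_faces = FACES * comb(EDGES_PER_FACE, 2)

# Each edge midpoint lies on two faces and meets the four other midpoints of each.
octahedron_connectivity = 2 * (EDGES_PER_FACE - 1)
prism_connectivity = 2  # a prism lies between the two octahedra it interconverts
```

Both routes give 120 trigonal prisms, and the products 30 × 8 and 120 × 2 agree at 240.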

5.4 POLYHEDRA WITH MORE THAN SIX VERTICES

Development of topological representations for systems having more than six
vertices is complicated by intractably large isomer counts. Thus the isomer count of the
seven-vertex polyhedron with the largest number of symmetry elements, namely the
pentagonal bipyramid, is 7!/|D5| = 5040/10 = 504. Similarly the isomer counts of the cube,
hexagonal bipyramid, square antiprism, and bisdisphenoid are 40320/24 = 1680, 40320/12
= 3360, 40320/8 = 5040, and 40320/4 = 10080, respectively. Graphs corresponding to
topological representations involving such large numbers of polyhedral isomers are clearly
unwieldy and unmanageable. However, the problem of representing permutational
isomerizations in seven- and eight-vertex polyhedra can be simplified if subgroups of the
symmetric groups Sn (n = 7, 8) can be found which contain all of the symmetries of all of
the polyhedra of interest. This is not possible for the seven-vertex system since there is no
subgroup of S7 that contains both the five-fold symmetry of the pentagonal bipyramid and
the three-fold symmetry of the capped octahedron. The situation with the eight-vertex
system is more favorable since the wreath product group65,66,67 S4[S2] of order 384
contains all of the symmetries of the cube, hexagonal bipyramid, square antiprism, and
bisdisphenoid,68 which are all of the eight-vertex polyhedra of actual or potential chemical
POLYHEDRAL DYNAMICS 131

interest. The major effect of reducing the symmetry by a factor of 105 ( 3 x 5 x 7) in going
from Sg to S4[S2] is the deletion of five-fold and seven-fold symmetry elements. Such
symmetry elements are not of interest in this context since none of the 257 eight-vertex
polyhedra has five-fold symmetry elements31.3 2 and the only eight-vertex polyhedron
having a seven-fold symmetry element is the heptagonal pyramid, which is not of interest
in this particular chemical context. Restricted isomer counts 1* = 384/IRI based on
subgroups of the wreath product group S4[S2] rather than the symmetric group Sg are the
more manageable numbers 16,32,48, and 96 for the cube, hexagonal bipyramid, square
antiprism, and bisdisphenoid, respectively.
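The full and hyperoctahedrally restricted isomer counts quoted above follow from two divisions each. The sketch below tabulates them; the dictionary and constant names are illustrative, and the rotation group orders (24, 12, 8, 4) are those implied by the divisions in the text.

```python
from math import factorial

# Rotation group orders |R| implied by the divisions in the text.
ROTATION_ORDER = {
    "cube": 24,
    "hexagonal bipyramid": 12,
    "square antiprism": 8,
    "bisdisphenoid": 4,
}

S8_ORDER = factorial(8)           # order of the symmetric group S8
WREATH_ORDER = 2 ** 4 * factorial(4)  # order of the wreath product S4[S2]

# Unrestricted counts I = 8!/|R| and restricted counts I* = 384/|R|.
FULL_COUNTS = {name: S8_ORDER // r for name, r in ROTATION_ORDER.items()}
RESTRICTED_COUNTS = {name: WREATH_ORDER // r for name, r in ROTATION_ORDER.items()}

# Index of S4[S2] in S8, i.e. the reduction factor mentioned in the text.
REDUCTION_FACTOR = S8_ORDER // WREATH_ORDER
```

Every restricted count is the unrestricted count divided by the same reduction factor, so the relative proportions of the four polyhedra are unchanged by the restriction.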
The concept of restricting vertex permutations in eight-vertex systems to the wreath
product group S4[S2] rather than the fully symmetric S8 group can be restated in graph-
theoretical terms using the hyperoctahedral graph H4.64 Therefore such a restriction of
permutations from S8 to S4[S2] can be called a hyperoctahedral restriction. The
hyperoctahedral graphs underlying this restriction are designated as Hn and have 2n
vertices and 2n(n - 1) edges with every vertex connected to all except one of the remaining
vertices so that each vertex of Hn has degree 2(n - 1). The name "hyperoctahedral" comes
from the fact that an Hn graph is the 1-skeleton of the analogue of the octahedron (called the
"cross-polytope") in n-dimensional space.1 The hyperoctahedral graphs H2 and H3 thus
correspond to the square and octahedron, respectively. The S4[S2] wreath product group
is the automorphism ("symmetry") group of the hyperoctahedral graph H4 just as the S8
symmetric group is the automorphism group of the complete graph K8.
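The stated properties of Hn are easy to verify by brute force for small n. The following is a minimal sketch, assuming the vertices are numbered 0 to 2n - 1 so that vertex v and vertex v + n form the single non-adjacent pair; this numbering and the function names are illustrative and not part of the original treatment.

```python
from itertools import combinations, permutations


def hyperoctahedral_edges(n):
    """Edges of H_n on vertices 0..2n-1; v and v + n are the only non-adjacent pair."""
    return [(u, v) for u, v in combinations(range(2 * n), 2) if v - u != n]


def degrees(n):
    """Degree of every vertex of H_n."""
    deg = [0] * (2 * n)
    for u, v in hyperoctahedral_edges(n):
        deg[u] += 1
        deg[v] += 1
    return deg


def automorphism_count(n):
    """Count vertex permutations that map every edge of H_n onto an edge."""
    edges = hyperoctahedral_edges(n)
    edge_set = {frozenset(e) for e in edges}
    return sum(
        all(frozenset((p[u], p[v])) in edge_set for u, v in edges)
        for p in permutations(range(2 * n))
    )
```

For n = 4 the exhaustive search runs over all 8! = 40320 vertex permutations, which is still fast; for larger n a smarter method would be needed.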
Using these ideas, topological representations for isomerizations of eight-vertex
polyhedra are depicted in Figures 5 and 6. Vertex and edge midpoints in these
representations correspond to the E* = 384/|G| hyperoctahedrally restricted enantiomer
pairs (E* = 8, 16, 24, and 48 for the cube, hexagonal bipyramid, square antiprism, and
bisdisphenoid, respectively), except that, because of the hyperoctahedral reduction in
symmetry, the number of points for the square antiprism must be doubled.69 Figure 5 is a K4,4
bipartite graph in which the 8 cube enantiomer pairs are located in the centers of the
hexagons and the 16 hexagonal bipyramid enantiomer pairs are located at the edge
midpoints. Since both the cube and hexagonal bipyramid are forbidden polyhedra (i.e.,
cannot be formed using only s, p and d orbitals),33 this portion of the topological
representation for hyperoctahedrally restricted eight-vertex systems is not accessible if only
s, p and d orbitals are available for chemical bonding.
The detailed structure of a hexagon wheel corresponding to a given pair of cube
enantiomers is depicted in Figure 6. The vertices of the hexagon correspond to the square
antiprisms that can be generated from the cube in the center by twisting opposite pairs of
faces. The midpoints of the hexagon edges correspond to bisdisphenoid enantiomer pairs.
Traversing the circumference of a given hexagon corresponds to a sequence of double dsd
processes interconverting the bisdisphenoids located at the midpoints of the two joined
hexagonal edges meeting at a vertex through the square antiprism intermediate represented
by the vertex joining the edges. Since both the bisdisphenoid and square antiprism can be
formed using only s, p, and d orbitals, the circumference of the hexagon is accessible in
ML8 systems in which the central atom M has the usual sp3d5 nine-orbital manifold. Thus
in the usual situation not involving f orbitals, isomerizations are restricted to the
circumference of a given hexagon in Figure 5 and cannot occur by moving from one
hexagon to another.

Figure 5: The K4,4 bipartite graph for the hyperoctahedrally restricted isomerizations of
eight-vertex polyhedra indicating points corresponding to one each of the cube, square
antiprism, hexagonal bipyramid, and bisdisphenoid isomers.

Figure 6: The detailed structure of a hexagonal wheel corresponding to a given pair of cube
enantiomers. Spokes labeled B correspond to cube-square antiprism interconversions
whereas edges labeled C correspond to interconversions from one square antiprism isomer
to another through a bisdisphenoid intermediate. The double dsd isomerization of a
bisdisphenoid to another through a square antiprism intermediate corresponds to movement
from the center of one edge representing the initial bisdisphenoid through a vertex
representing a square antiprism intermediate to the center of an adjacent edge representing
the final bisdisphenoid.

6. Literature References

(1) B. Grünbaum, Convex Polytopes, Interscience Publishers, New York, 1967.
(2) X. Liu, D. J. Klein, T. G. Schmalz, and W. A. Seitz, J. Comput. Chem., 12,
1252 (1991).
(3) F. A. Cotton, Accts. Chem. Res., 1, 257 (1968).
(4) E. L. Muetterties, J. Am. Chem. Soc., 90, 5097 (1968).
(5) E. L. Muetterties, J. Am. Chem. Soc., 91, 1636 (1969).
(6) E. L. Muetterties, J. Am. Chem. Soc., 91, 4115 (1969).
(7) E. L. Muetterties and A. T. Storr, J. Am. Chem. Soc., 91, 3098 (1969).
(8) M. Gielen and J. Nasielski, Bull. Soc. Chim. Belges, 78, 339 (1969).
(9) M. Gielen and J. Nasielski, Bull. Soc. Chim. Belges, 78, 351 (1969).
(10) M. Gielen, C. Depasse-Delit, and J. Nasielski, Bull. Soc. Chim. Belges, 78, 357
(1969).
(11) M. Gielen and C. Depasse-Delit, Theor. Chim. Acta, 14, 212 (1969).
(12) M. Gielen, G. Mayence, and J. Topart, J. Organometal. Chem., 18, 1 (1969).
(13) M. Gielen, M. de Clercq, and J. Nasielski, J. Organometal. Chem., 18, 217
(1969).
(14) M. Gielen and N. Vanlautem, Bull. Soc. Chim. Belges, 79, 679 (1970).
(15) M. Gielen, Bull. Soc. Chim. Belges, 80, 9 (1971).
(16) J. I. Musher, J. Am. Chem. Soc., 94, 5662 (1972).
(17) J. I. Musher, Inorg. Chem., 11, 2335 (1972).
(18) W. G. Klemperer, J. Chem. Phys., 56, 5478 (1972).
(19) W. G. Klemperer, J. Am. Chem. Soc., 94, 6940 (1972).
(20) W. G. Klemperer, J. Am. Chem. Soc., 94, 8360 (1972).
(21) J. Brocas, Top. Curr. Chem., 32, 43 (1972).
(22) J. Brocas, in Advances in Dynamic Stereochemistry, M. Gielen, ed., Freund
Publishing Co., Tel Aviv, 1985, Volume 1, pp. 43-88.
(23) R. B. King, in Advances in Dynamic Stereochemistry, M. Gielen, ed., Freund
Publishing Co., Tel Aviv, 1988, Volume 2, pp. 1-36.
(24) R. B. King, Applications of Graph Theory and Topology in Inorganic Cluster and
Coordination Chemistry, CRC Press, Boca Raton, Florida, 1992.
(25) R. B. King, J. Am. Chem. Soc., 91, 7211 (1969).
(26) B. F. G. Johnson, J. Cook, D. Ellis, S. Hefford, A. Whittaker, Polyhedron, 8,
2221 (1990).
(27) V. Schlegel, Nova Acta Leop. Carol., 44, 343-459 (1883).
(28) F. Harary and E. M. Palmer, Graphical Enumeration, Academic Press, New York,
1973, p. 224.
(29) W. T. Tutte, J. Combin. Theory Ser. B, 28, 105 (1980).
(30) A. J. W. Duijvestijn and P. J. Federico, Math. Comput., 37, 523 (1981).
(31) P. J. Federico, Geom. Ded., 3, 469 (1975).
(32) D. Britton and J. D. Dunitz, Acta Cryst., A29, 362 (1973).
(33) R. B. King, Theor. Chim. Acta, 64, 453 (1984).
(34) F. J. Budden, The Fascination of Groups, Cambridge University Press, London,
1972.
(35) C. D. H. Chisholm, Group Theoretical Techniques in Quantum Chemistry,
Academic Press, New York, 1976.
(36) F. A. Cotton, Chemical Applications of Group Theory, Wiley, New York, 1971.
(37) J. A. Pople, J. Am. Chem. Soc., 102, 4615 (1980).
(38) S. Fujita, Symmetry and Combinatorial Enumeration in Chemistry, Springer-
Verlag, Berlin, 1991.
(39) W. N. Lipscomb, Science, 153, 373 (1966).
(40) R. S. Berry, J. Chem. Phys., 32, 933 (1960).
(41) R. B. King, Inorg. Chim. Acta, 49, 237 (1981).
(42) R. B. King, Theor. Chim. Acta, 64, 439 (1984).
(43) D. Gale, in Linear Inequalities and Related Systems, H. W. Kuhn and A. W.
Tucker, eds., Princeton, 1956, pp. 255-263.
(44) R. B. King, Inorg. Chem., 24, 1716 (1985).
(45) R. B. King, Inorg. Chem., 25, 506 (1986).
(46) R. R. Holmes, Accts. Chem. Res., 5, 296 (1972).
(47) P. C. Lauterbur and F. Ramirez, J. Am. Chem. Soc., 90, 6722 (1968).
(48) D. L. Kepert, Prog. Inorg. Chem., 25, 41 (1979).
(49) R. B. King and D. H. Rouvray, J. Am. Chem. Soc., 99, 7834 (1977).
(50) R. B. King, Inorg. Chim. Acta, 49, 237 (1981).
(51) B. M. Gimarc and J. J. Ott, Inorg. Chem., 25, 83 (1986).
(52) J. J. Ott, C. A. Brown, and B. M. Gimarc, Inorg. Chem., 28, 4269 (1989).
(53) B. M. Gimarc and J. J. Ott, Inorg. Chem., 25, 2708 (1986).
(54) B. M. Gimarc and J. J. Ott, Main Group Metal Chem., 12, 77 (1989).
(55) L. J. Guggenberger and E. L. Muetterties, J. Am. Chem. Soc., 98, 7221 (1976).
(56) D. J. Wales and A. J. Stone, Inorg. Chem., 26, 3845 (1987).
(57) D. M. P. Mingos and R. L. Johnston, Polyhedron, 7, 2437 (1988).
(58) R. B. King, J. Mol. Struct. THEOCHEM, 185, 15 (1989).
(59) P. McMullen and G. C. Shephard, Convex Polytopes and the Upper Bound
Conjecture, Cambridge University Press, Cambridge, England, 1971.
(60) P. McMullen and G. C. Shephard, Mathematika, 15, 223 (1968).
(61) J. C. Bailar, Jr., J. Inorg. Nucl. Chem., 8, 165 (1985).
(62) P. Ray and N. K. Dutt, J. Indian Chem. Soc., 20, 81 (1943).
(63) B. M. Gimarc and J. J. Ott, in Graph Theory and Topology in Chemistry, R. B.
King and D. H. Rouvray, eds., Elsevier, Amsterdam, 1987, pp. 285-301.
(64) N. L. Biggs, Algebraic Graph Theory, Cambridge University Press, London,
1974.
(65) G. Pólya, Acta Math., 68, 143 (1937).
(66) N. Debruijn, in Applied Combinatorial Mathematics, E. F. Beckenbach, ed., Wiley,
New York, 1964.
(67) J. G. Nourse and K. Mislow, J. Am. Chem. Soc., 97, 4571 (1975).
(68) R. B. King, Inorg. Chem., 20, 363 (1981).
(69) R. B. King, Theor. Chim. Acta, 59, 25 (1981).
REACTION GRAPHS

ALEXANDRU T. BALABAN

Polytechnic University, Department of Organic Chemistry,


Splaiul Independentei 313, 77206 Bucharest, Roumania,
and Texas A & M University, Galveston, TX 77553-1675, USA.

Contents
1. Introduction
2. Reaction graphs of rearrangements via carbocations
2.1. ETHYL CARBENIUM IONS
2.2. AUTOMERIZATIONS OF HOMOVALENIUM CATIONS
2.3. PARTLY DEGENERATE REARRANGEMENTS VIA CARBOCATIONS
2.4. MECHANISMS OF REARRANGEMENTS LEADING TO DIAMOND
HYDROCARBONS AND DERIVATIVES
2.4.1. Adamantane
2.4.2. Diamantane
2.4.3. Tetracyclotridecanes
2.4.4. Tricycloundecanes
2.4.5. Tetracycloundecanes
2.4.6. Spiro[adamantane-2,1'-cyclobutane]
3. Automerization of bullvalene, other valence isomers of
annulenes, and azabullvalene
4. Rotation in molecular propellers
5. Reaction graphs for rearrangements of metallic complexes
5.1. PENTACOORDINATE COMPLEXES
5.1.1. Trigonal-bipyramidal complexes
5.1.2. Tetragonal-pyramidal complexes
5.2. TETRACOORDINATE COMPLEXES
5.3. HEXACOORDINATE COMPLEXES
5.3.1. Octahedral complexes
5.3.2. Axially distorted octahedra
5.3.3. Trigonal prismatic complexes
5.4. OCTACOORDINATE COMPLEXES
5.4.1. Square antiprisms
5.5. OTHER INORGANIC COMPLEXES
6. Xenon hexafluoride
7. Heptaphosphide trianion
8. Kinetic graphs, synthon graphs, and graph transforms
9. Conclusions
D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 137-180.
© 1994 Kluwer Academic Publishers.

1. Introduction

Whereas molecular (constitutional) formulas have been modeled by graphs
for more than a century, since Cayley counted alkane isomers by means of
4-trees, modeling of chemical reactions and of chemical reactivity by means of
graphs is of more recent date.
Unlike molecular (constitutional) graphs in which points symbolize atoms
and lines represent covalent bonds, in reaction graphs1 the points (vertices)
symbolize chemical species (molecules or reaction intermediates such as
rearranging carbenium ions), and the lines (edges) represent elementary
reaction steps. They have also been occasionally called coset graphs by Collins,2
topological representations by Gielen, Nasielski, Brocas and coworkers3,4 or
isomerization/rearrangement graphs by Schleyer, Farcasiu, Wipke, Osawa and
their coworkers.5,6
This chapter will review graph-theoretical applications connected with
reaction graphs and will briefly mention in Section 8 kinetic graphs and
synthon graphs. Sections 2 - 4 deal with organic chemical applications, and
Sections 5 - 7 with applications in inorganic chemistry.

2. Reaction graphs of rearrangements via carbocations

2.1. ETHYL CARBENIUM IONS

The first reaction graphs were published in 1966 for representing all
possible interconversions of carbenium ions.1 Thus, a pentasubstituted ethyl
cation can undergo an elementary reaction step via three different pathways
involving a 1,2-shift of any of the three substituents in β-position relative to
the positive charge (Whitmore's rule), as shown in Fig. 1. Each of the three new
carbenium ions again may react via three different pathways (one of which
reverts the preceding 1,2-shift). The process continues leading to a total of 20
different carbenium ions if the two carbon atoms of the ethyl group are
distinguishable (e. g. by isotopic labeling), or to 10 different carbenium ions if
they are not.
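The counts of 20 and 10 carbenium ions can be reproduced by generating the ions through repeated 1,2-shifts. The sketch below assumes a simple encoding that is not part of the original text: a state is a pair (index of the cationic carbon, set of the two substituents on it), and the names `shifts` and `reachable` are illustrative.

```python
from collections import deque

SUBSTITUENTS = frozenset(range(1, 6))


def shifts(state):
    """All ions reachable from `state` by a single 1,2-shift.

    state = (cationic_carbon, doublet): cationic_carbon is 0 or 1, and
    doublet is the frozenset of the two substituents on the cationic centre.
    """
    carbon, doublet = state
    triplet = SUBSTITUENTS - doublet
    # The migrating group x leaves the sp3 carbon, which becomes the new
    # cationic centre and keeps its two remaining substituents.
    return [(1 - carbon, triplet - {x}) for x in sorted(triplet)]


def reachable(start):
    """Breadth-first closure of `start` under 1,2-shifts."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in shifts(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


LABELED_IONS = reachable((0, frozenset({4, 5})))
# Without a label on the carbon atoms only the doublet distinguishes the ions.
UNLABELED_IONS = {doublet for _, doublet in LABELED_IONS}
```

In this encoding two ions are connected exactly when their doublets are disjoint and the cationic carbon has changed, which is the adjacency rule used for the reaction graph in the text.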
Fig. 1. A portion of the reaction graph indicating the three possibilities for
the rearrangement of an ethyl carbenium ion with five different
substituents denoted by 1 - 5. The abbreviated notation for each
carbocation is indicated in brackets. On the arrows one can see the
substituent undergoing the 1,2-shift. All interconversions are reversible.

If each cation is represented by a vertex, the resulting graph is regular of


degree three (cubic graph) since three lines (edges) meet at each vertex (point).
A regular graph is defined as a graph having all its points of the same degree.
Fig. 2. The Desargues-Levi graph with the 20 vertices symbolizing
rearranging ethyl carbenium ions. The full notation lists all five
substituents (it can be abbreviated by discarding the triplet of
substituents, and replacing the asterisk by a period).

The notation of Fig. 2 indicates for the 20-vertex reaction graph, called
the Desargues-Levi graph, two groups of digits corresponding to the substituents
bonded to the carbon atoms of the ethyl cation : the group of three digits
corresponds to the substituents attached to the sp3-hybridized carbon atom, and
the group of the two digits corresponds to the substituents attached to the
cationic center (sp2-hybridized carbon atom). The two groups are separated by
an asterisk or a period. The order of the substituents within each group is
irrelevant (conventionally, the increasing order of digits was chosen), but the
order between the doublet and triplet of digits is essential for distinguishing
the two ethyl carbon atoms.
For abbreviating the notation, the group of three digits may be omitted
because these digits can be deduced when the remaining two digits are known ;
a period either after or before the group of two digits distinguishes the order
between the doublet and triplet of digits. We shall use both the full and
abbreviated notation : the abbreviated notation in Fig. 1, and the full notation in
Fig. 2.
The largest graph-theoretical distance between two vertices (i. e. the
diameter) of the Desargues-Levi graph is 5 : for any vertex i, there is a unique
vertex at distance 5. This is the "antipode" of vertex i, and it differs only in the
order between the two groups of digits. Thus, the full notation system for two
antipodal vertices is 12.345 and 345.12 ; the abbreviated notation 12. and .12 ;
it will be seen in the section on trigonal-bipyramidal complexes that another
abbreviated notation uses a bar above the group of two digits but there is no
direct equivalence with the present notation.
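The antipode property can be checked directly by breadth-first search. In this sketch each vertex is encoded as (side, doublet), where side 0 or 1 stands for the period written after or before the doublet; this encoding and the function names are assumptions made for illustration. Adjacent vertices have disjoint doublets and opposite sides, as implied by the 1,2-shift.

```python
from collections import deque
from itertools import combinations

SUBSTITUENTS = frozenset(range(1, 6))
VERTICES = [
    (side, frozenset(pair))
    for side in (0, 1)
    for pair in combinations(sorted(SUBSTITUENTS), 2)
]


def neighbours(vertex):
    """Vertices reached by one 1,2-shift: disjoint doublet, opposite side."""
    side, doublet = vertex
    rest = sorted(SUBSTITUENTS - doublet)
    return [(1 - side, frozenset(pair)) for pair in combinations(rest, 2)]


def distances_from(start):
    """Graph-theoretical distance from `start` to every vertex (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in neighbours(v):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def farthest(vertex):
    """Return (eccentricity, list of vertices at that maximal distance)."""
    dist = distances_from(vertex)
    ecc = max(dist.values())
    return ecc, [w for w, d in dist.items() if d == ecc]


FARTHEST = {v: farthest(v) for v in VERTICES}
DIAMETER = max(ecc for ecc, _ in FARTHEST.values())
```

For each vertex, `FARTHEST` gives its eccentricity and the vertices at that distance; a single entry with the same doublet on the opposite side is the antipode described in the text.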

When there is no isotopic label, antipodes become indistinguishable, and
the Desargues-Levi graph reduces to a 10-vertex graph called the Petersen
graph.7 In its vertex notation the order between the two groups of digits is no
longer relevant, and conventionally one always starts with the doublet and
follows with the triplet. The Petersen graph may be displayed in several
isomorphic representations. The simplest is shown with the full notation in Fig.
3, which shows how the lower half of the Desargues-Levi graph folds up over
the corresponding antipodes when this graph reduces to the Petersen graph.

Fig. 3. The Petersen graph with 10 vertices, conserving the notation of
Fig. 2, but disregarding the order between doublet and triplet of digits.

Other representations of the Desargues-Levi graph are shown in Fig. 4
which includes a representation with a Hamiltonian circuit, i. e. a circuit
encompassing all graph vertices.

Fig. 4. Alternative representations of the Desargues-Levi graph with full
and abbreviated notation, respectively.

For the Petersen graph, no Hamiltonian representation is possible, but one
may represent it inside a 9-gon, an 8-gon, a 6-gon or a 5-gon (Fig. 5). This graph
has a very high symmetry, and is known as the 5-cage, because it is the
smallest cubic graph whose girth (smallest circuit) is five.8,9 The order of its
symmetry group is 120, and it was demonstrated by Tutte10 that it is 3-regular
(3-unitransitive), i. e. that any path of length 3 (but not of length 4) may be
mapped on any other path of the same length (a path is defined as a sequence of
adjacent edges such that no edge occurs twice, and that the first and last
vertices are distinct ; the length of the path is the number of its edges).
Alternatively, the Petersen graph can be considered as the odd graph O3 ; an odd
graph Ok is defined as follows :11 its vertices correspond to subsets of
cardinality k-1 of a set of cardinality 2k-1, and two vertices are adjacent if
and only if the corresponding subsets are disjoint. The subsets are the
abbreviated notation.
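The odd-graph definition translates directly into a construction, and the cubic character, the girth and the absence of a Hamiltonian circuit can then be checked for O3. The following is a sketch with illustrative function names; the exhaustive circuit search is only practical for graphs as small as this one.

```python
from collections import deque
from itertools import combinations


def odd_graph(k):
    """Adjacency lists of O_k: (k-1)-subsets of {1..2k-1}, adjacent iff disjoint."""
    vertices = [frozenset(c) for c in combinations(range(1, 2 * k), k - 1)]
    return {v: [w for w in vertices if not v & w] for v in vertices}


def girth(adj):
    """Length of the smallest circuit, found by BFS from every vertex."""
    best = None
    for root in adj:
        dist = {root: 0}
        parent = {root: None}
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    parent[w] = v
                    queue.append(w)
                elif parent[v] != w:
                    # A non-tree edge closes a circuit through the root's BFS tree.
                    length = dist[v] + dist[w] + 1
                    if best is None or length < best:
                        best = length
    return best


def has_hamiltonian_circuit(adj):
    """Exhaustive backtracking search for a circuit through all vertices."""
    start = next(iter(adj))
    n = len(adj)

    def extend(v, visited):
        if len(visited) == n:
            return start in adj[v]
        return any(extend(w, visited | {w}) for w in adj[v] if w not in visited)

    return extend(start, frozenset([start]))


PETERSEN = odd_graph(3)
```

The vertex labels produced by `odd_graph(3)` are the doublets of the abbreviated notation, so the adjacency lists can be compared directly with Fig. 3.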
Fig. 5. Alternative representations of the Petersen graph ; a Hamiltonian
line numbered from 1 to 10 is indicated.

2.2. AUTOMERIZATIONS OF HOMOVALENIUM CATIONS

Since the terminology may not be clear, definitions will be given for a
few terms that will be used in this section. Degenerate rearrangements or
automerizations12 are those chemical processes in which covalent bonds are
broken and formed, leading to new connectivities, but the structures of the
reactants and products remain unchanged. Examples are Cope rearrangements
(eq. 1), or isotopic rearrangements such as those discovered by R. M. Roberts et
al.13 when n-propylbenzene was treated with aluminum chloride (eq. 2), or the
automerization of phenanthrene-1-13C under the influence of AlCl3 (eq. 3).14

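[Equations (1) - (3), structural schemes not reproduced : (1) a Cope
rearrangement with numbered carbon atoms ; (2) the AlCl3-induced isotopic
rearrangement of labeled n-propylbenzene, Ph-CH2-CH2-CH3, the label being
marked by an asterisk ; (3) the AlCl3-induced automerization of labeled
phenanthrene.]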
Automerizations can be detected by isotopic labeling (in order to see
newly formed bonds), or when such reactions occur fast enough, by dynamic NMR
spectroscopy : the low number of peaks in the NMR spectra of bullvalene or
semibullvalene at room temperature is due to rapid reversible Cope
rearrangements, as will be discussed in Section 3.
Table 1 summarizes the types of chemical processes that can occur, and
the corresponding changes in chemical or stereochemical formulas.15

Table 1. Modification of various types of isomerizations by noting modified or
conserved properties (conformational changes involving rotations around single
bonds are ignored)

Name of the              Constitutional   Stereochemical formula
molecular                formula
transformation
                         Topology         Distance between    Tridimensional
                         of atoms         non-bonded atoms    orientation
                                                              of atoms

Automerization           Conserved        Conserved           Conserved
Enantiomerization        Conserved        Conserved           Modified
Diastereomerization      Conserved        Modified            Modified
Constitutional           Modified         Modified            Modified
(topological)
isomerizations

In the preceding section, we have examined isomerizations proceeding by
1,2-shifts in carbenium ions. These processes explain satisfactorily reactions
such as (2), and in all likelihood also (3).
We shall now discuss other processes which occur via 1,2-shifts in
carbenium ions, and we shall start with homovalenium cations.16 These are
derived from homovalenes by hydride ion abstraction. Homovalenes are valence
isomers of annulenes17 plus one extra CH2 group, i. e. they have formulas
(CH)2n(CH2). Alternatively, homovalenium cations result by adding a CH+ group
to a valence isomer of annulene (CH)2n.
An interesting reaction graph is that of the classical homotetrahedryl
cation (CH)5+ with a tricyclo[1.1.0.0(2,4)]pentanic structure.16 Its automerization
by 1,2-shifts can lead, for each cation, to four differently connected cations
which are structurally identical to the initial cation, therefore the reaction
graph will be regular of degree four (Fig. 6). The notation for each structure
consists of three digits, namely : the numbering of the CH group bearing the
positive charge, followed by a comma and two digits in increasing order
indicating the numbering of CH groups which are not adjacent to the positively
charged CH group. Figure 6 represents the four possibilities for the
rearrangement of structure denoted by 1,34 : on each arrow one can see the bond
involved in the 1,2-shift.
Fig. 6. A portion of the reaction graph for the automerization of the
classical homotetrahedryl cation.

The complete graph was shown to possess C(5,1) x C(4,2) = 30 vertices,
where C(5,1) denotes combinations of five digits taken singly (the digit before the
comma), and C(4,2) indicates pairwise combinations of the remaining four digits
after the comma.
The reaction graph has Hamiltonian circuits, girth 4, and diameter 4, each
point (a, bc) having one "antipode" at distance four with notation (a, de), where
a-e denote the five digits from 1 to 5. The graph is vertex- and edge-transitive.
A representation with four-fold symmetry is shown in Fig. 7.16
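The vertex count and the pairing of each vertex with its antipode follow from the notation alone and can be enumerated as below. The sketch only generates the labels; the edges of the reaction graph depend on the 1,2-shift rule shown in Fig. 6, which is not encoded here. The function names are illustrative.

```python
from itertools import combinations

DIGITS = (1, 2, 3, 4, 5)


def vertex_labels():
    """All labels (a, (b, c)): charged CH group a, then two other digits, b < c."""
    return [
        (a, pair)
        for a in DIGITS
        for pair in combinations([d for d in DIGITS if d != a], 2)
    ]


def antipode(label):
    """Antipode (a, de) of the vertex (a, bc): same a, complementary pair."""
    a, (b, c) = label
    d, e = (x for x in DIGITS if x not in (a, b, c))
    return (a, (d, e))


LABELS = vertex_labels()
```

For example, the structure 1,34 of Fig. 6 is paired with the label 1,25.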

Fig. 7. Reaction graph for the homotetrahedryl automerization.

A representation with five-fold symmetry was found later.18 It served as
an exercise in finding a canonical numbering of the 30 vertices such that when
the adjacency matrix was read sequentially it produced the smallest binary
number. The final canonical numbering obtained by Randic is presented in Fig. 8.
Randic also proved that the order of the symmetry group for the graph of the
homotetrahedryl automerization is 120, just as for the Petersen graph. The 120
permutations for these two graphs belong to the symmetric group S5.
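The idea of a canonical numbering can be illustrated on a very small graph by exhaustive search. The sketch below assumes that the adjacency matrix is read row by row over the full matrix; the exact reading convention of ref. 18 is not stated in the text, and exhaustive search over all numberings is not feasible for the 30-vertex graph itself. The function names are illustrative.

```python
from itertools import permutations


def adjacency_code(edges, order):
    """Binary string obtained by reading the adjacency matrix row by row.

    `order` lists the vertices in the sequence in which they are numbered.
    """
    index = {v: i for i, v in enumerate(order)}
    n = len(order)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1
    return "".join(str(bit) for row in matrix for bit in row)


def canonical_numbering(vertices, edges):
    """Numbering whose code is smallest; codes have equal length, so
    comparing them as strings is the same as comparing binary numbers."""
    return min(permutations(vertices), key=lambda order: adjacency_code(edges, order))


# Toy example: the three-vertex path a - b - c.
PATH_VERTICES = ("a", "b", "c")
PATH_EDGES = [("a", "b"), ("b", "c")]
PATH_CANONICAL = canonical_numbering(PATH_VERTICES, PATH_EDGES)
PATH_CODE = adjacency_code(PATH_EDGES, PATH_CANONICAL)
```

Because the smallest code does not depend on how the vertices were initially named, two graphs with the same smallest code are isomorphic, which is what makes such a numbering canonical.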
Fig. 8. Same as in Fig. 7, but with a Hamiltonian circuit. The canonical
numbering of vertices leads to the smallest binary number when the
adjacency matrix is read sequentially.

Interestingly, the most stable structure of (CH)5+ was calculated to be
non-classical, with a tetragonal-pyramidal geometry.19,20
Other homovalenium cations also undergo automerizations, but their
reaction graphs have too many vertices to be represented. Thus, the homocubyl
cation21 also leads to a regular graph of degree four like the preceding one, but
it has 45,630 vertices.16

2.3. PARTLY DEGENERATE REARRANGEMENTS VIA CARBOCATIONS

We shall analyze in detail one case of homovalenium cations which
undergo partly degenerate rearrangements, and discuss summarily a few others.
Cyclopentadienyl cations (CH)5+ having a classical structure 5 may
become converted to a non-classical (tetragonal pyramidal) structure ; the
latter turns out to be the most stable one, as shown19,20 by theoretical
calculations cited above. We shall discuss here the reaction graph for the
classical structure.16
A non-delocalized cyclopentadienyl cation can undergo two different
1,2-shifts converting it into bicyclo[2.1.0]pent-2-ene-5-yl cations ; each of
these can lead by a non-degenerate rearrangement via 1,2-shifts to a
cyclopentadienyl structure and by two other 1,2-shifts to automerization (Fig.
9). The notation for each structure consists of two equivalent groups of digits
from the set 1 to 5 indicating the numberings of CH groups : the first digit
indicates the location of the positive charge CH+ ; it is followed by a comma and
then a group of three ordered digits indicating a chain of three CH groups which
in turn are non-adjacent, non-adjacent, and adjacent to the first CH+ group.
Adjacency rules for any such classical cations are indicated in Fig. 10 in one-
to-one correspondence with Fig. 9.
Fig. 9. Interconversions of classical bicyclo[2.1.0]pent-2-ene-5-yl and
cyclopentadienyl cations.

Fig. 10. Notation for the bicyclic and monocyclic structures presented in
Fig. 9 ; letters a - e symbolize digits 1 - 5.

The complete reaction graph (Fig. 11) has 60 vertices of degree 2 (white
points) and 60 vertices of degree 4 (black points). The degenerate processes
involving only vertices of degree 4 are represented by ten 6-membered
circuits. They are connected pairwise to each other by four paths having a
vertex of degree 2 between each pair of vertices with degree 4. The whole graph
is reminiscent of the Petersen graph, with each vertex being replaced by a
6-circuit. The simplified notation includes only the digit preceding the comma.

Fig. 11. Reaction graph for the degenerate and non-degenerate
rearrangements of cations from Figs. 9 and 10 using simplified notation.

The homobenzprismyl cations (CH)7+ undergo partially degenerate
rearrangements interconverting them. Fig. 12 represents16 the interconversions
of one such cation A with the charge localized at the CH group numbered 1. Two
automerizations and two isomerizations are possible ; the latter lead to two
different homobenzprismyl cations B, B' which can undergo only four
isomerizations. On each arrow in Fig. 12 is indicated the bond involved in the
1,2-shift. Doring22 also considered some of these interconversions.

Fig. 12. A portion of the reaction graph corresponding to rearrangements


of the homobenzprismyl cation via 1,2-shifts.

The total reaction graph is regular of degree 4 ; it has 2520 vertices
corresponding to structure A and 1760 vertices corresponding to structure B.
The resulting graph with 4280 vertices is too complicated to be analyzed in
detail ; one can show that it possesses 12-membered circuits representing
degenerate rearrangements, similar to the hexagons in the preceding case.
The saturated 2-bicyclo[2.2.1]heptyl (norbornyl) cation can undergo three
types of rearrangements : (i) Wagner-Meerwein-type (WM), involving 1,2-shifts
of C-C bonds ; (ii) 6,2-endo-hydride shift ; and (iii) 3,2-exo-hydride shift.23
Each of these proceeds with its own reaction rate. The fastest reaction is the
Wagner-Meerwein shift (even at -154° this reaction leads to equivalence of
1H-NMR peaks) ; between -120° and -60° one can freeze the endo-6,2-hydride
shift which has an activation energy of 6 kcal/mol ; the slowest process is the
3,2-hydride shift with activation energy 12 kcal/mol (both endo and exo), but
this is fast enough at room temperature to lead to a single 1H-NMR peak for all
hydrogens in C7H11+ (Fig. 13).
Fig. 13. The 2-bicyclo[2.2.1]heptyl cation with the three possible classical
1,2-shifts of C-C and C-H bonds, and with notation described in the text.

The reaction graph has 7! = 5040 vertices ; it has girth 4 and is a cubic,
vertex-transitive and edge-intransitive graph. A portion of this graph is shown
in Fig. 14,17 with the corresponding notation, as shown in Fig. 13 : the label of
the one-carbon bridge is followed by the two bridgehead labels starting with
the one that is adjacent to the charged CH+ group. On a lower line, the
remaining four numberings follow in the order : the charged CH group, its CH2
neighbor, the CH2 group once removed from CH+, and lastly its CH2 neighbor.

Fig. 14. A portion of the reaction graph for the rearrangement of the
norbornyl cation via classical 1,2-shifts of C-C and C-H bonds.

In the reaction graph, a 4-membered ring is surrounded by two 20-gons


and two 4-membered rings ; a hexagon is surrounded by three 4-membered rings
and three 20-gons ; and a 20-gon is surrounded by ten 4-membered rings and ten
hexagons.
One should also bear in mind that the C7H11+ cation is chiral.23 A reduced
graph with 55 vertices results if chirality and transposition isomers are
superimposed.2 This reaction graph, for which a special computer program was
devised, proved successful in determining likely pathways for the rearrangement
and in designing isotopic labeling experiments in order to firmly establish
which of the several pathways is (or are) followed as the reactant rearranges
to product(s). The program uses the Schreier system of coset generators for
finding the reaction graph corresponding to the given permutation group of the
molecular graph. The shortest path of the reaction graph can then account for
the most likely pathway from one vertex to another, i. e. from one isomer or
isotopomer to another one.
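The shortest-path step can be sketched with a plain breadth-first search. This is not the coset-generator program of ref. 2 : it only illustrates how a shortest pathway is read off once a reaction graph is available. As a stand-in for the 5040-vertex norbornyl graph, the example uses the 10-vertex graph of unlabeled ethyl carbenium ions from Section 2.1, with each ion encoded by its doublet ; the function names are illustrative.

```python
from collections import deque
from itertools import combinations


def shortest_pathway(start, goal, neighbours):
    """Shortest sequence of vertices from start to goal, or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in neighbours(v):
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return None


def ethyl_cation_neighbours(doublet):
    """Unlabeled ethyl carbenium ions: a 1,2-shift leads to a disjoint doublet."""
    rest = sorted(frozenset(range(1, 6)) - doublet)
    return [frozenset(pair) for pair in combinations(rest, 2)]


PATHWAY = shortest_pathway(frozenset({1, 2}), frozenset({1, 3}), ethyl_cation_neighbours)
```

Each edge of the returned pathway is one elementary 1,2-shift, so the number of edges is the minimal number of rearrangement steps between the two ions.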
The same program was employed for the rearrangement of
4-homoadamantyl (4-tricyclo[4.3.1.1(3,8)]undecyl) cation, and for other tricyclic
cations such as brexyl and longifolyl.2,23

The 9-barbaralyl cation was determined by Schleyer et al.24 to show total
degeneracy in its 1H-NMR spectrum at -135°, and by Ahlberg et al.25 to present
coalescence in the 13C-NMR spectrum at -135°. The structure is represented by
classical formulas, involving tricyclononatrienic and tetracyclononadienic
cations. The mechanism of the rearrangement does not involve Cope reactions,
but rather divinyl-cyclopropyl-carbinyl cationic rearrangements, leading to a
graph whose smallest circuit is six-membered ; the tetracyclic cations lead to
vertices of degree 2 (Fig. 15). If the tricyclic cations were also considered,
they would lead to vertices of degree 4 owing to vinylic delocalization of
charge.

Fig. 15. Reaction graph for the rearrangement of the barbaralyl cation.

2.4. MECHANISMS OF REARRANGEMENTS LEADING TO DIAMOND HYDROCARBONS AND DERIVATIVES

In each of the following subsections, the numbering of formulas starts
from 1.

2.4.1. Adamantane

Probably the most spectacular application of reaction graphs in organic
chemistry was connected with the mechanism of the Lewis acid-catalyzed
isomerization leading to the preparation of adamantane, diamantane,26,27 and
diamond hydrocarbons.28 Schleyer discovered these rearrangements,29,30 which
start from readily accessible isomeric saturated polycyclic systems having the
same number of rings (or cyclomatic number μ = q - p + 1, where q and p are the
numbers of edges and vertices, respectively, in the molecular graph of the
hydrocarbon). Thus, the tetrahydrogenated dimer of cyclopentadiene (2)
rearranges to adamantane (1) in the presence of aluminum chloride. The driving
force of this thermodynamically controlled reaction is the relief of strain
caused by eclipsed conformations; diamond hydrocarbons have the lowest strain
among all their isomers.
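As a small worked check of the cyclomatic number, the sketch below evaluates μ = q - p + 1 for the carbon skeleton of adamantane. The edge list uses the usual tricyclo[3.3.1.1^{3,7}]decane numbering (bridgeheads 1, 3, 5, 7), which is an assumption of this example and is not the numbering of the formulas in the figures.

```python
def cyclomatic_number(num_vertices, edges):
    """mu = q - p + 1 for a connected graph with p vertices and q edges."""
    return len(edges) - num_vertices + 1


# Carbon skeleton of adamantane (hydrogen-depleted molecular graph).
# Assumed numbering: bridgehead CH groups 1, 3, 5, 7; CH2 groups 2, 4, 6, 8, 9, 10.
adamantane_edges = [
    (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 1),
    (1, 9), (9, 5), (3, 10), (10, 7),
]

mu = cyclomatic_number(10, adamantane_edges)
print(mu)  # 3, i.e. a tricyclic system
```

Any C10H16 isomer with the same value μ = 3 has the same number of rings, which is why such isomers can serve as starting materials for the rearrangement.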
REACTION GRAPHS 149

It was shown in 1968 by Whitlock and Siefken31 that the conversion
of 2 into 1 may involve at least 2897 pathways; they constructed a reaction
graph showing plausible intermediates. Fig. 16 selects only a few of these
pathways, based on the fact that all tricyclodecanes studied experimentally (3,
4, 5 and 6) did rearrange to 1. Schleyer and coworkers32 calculated heats of
formation by means of molecular mechanics, and the corresponding values are
included in Fig. 16. They detected at -10° two intermediates in the aluminum
bromide-catalyzed isomerization of exo-8 (under much milder conditions than
from 2), namely 14 and 3. The lines in Fig. 16 represent the most favorable
pathways leading from each isomeric tricyclodecane C10H16 to adamantane 1,
according to Schleyer's group. The whole process then involves the reaction
sequence 2 -> 7 -> exo-8 -> 14 -> 3 -> 1, in agreement with Whitlock and
Siefken.31 Only the first step is endothermic; therefore more drastic
conditions are needed when starting from 2.

Fig. 16. Reaction graph for the formation of adamantane from various
precursors including the endo/exo tetrahydrogenated dimers of
cyclopentadiene (2). Negative numbers indicate heats of formation ;
thermodynamic stability increases approximately from bottom to top.

Strong Brønsted acids may also be used as catalysts, as shown by
Schleyer's group. In order to test the intermediacy of certain structures, Ganter
employed deuterated alcohols with BF3 and ionic hydrogenation reagents such
as Et3SiH.31 The same reaction graph as that shown above, with the same 19
structures but with different numberings, was shown in Ganter's review as Fig.
17, under the title "Adamantaneland".31


Fig. 17. The same graph as in Fig. 16, but without attempting to order
structures according to their stability.31

2.4.2. Diamantane

The isomerization of pentacyclotetradecanes with Lewis acids to
diamantane 1 is much more intricate than the reaction discussed in the
preceding subsection because, instead of considering 16 isomers as in Figures
16 and 17, one has to deal with > 40,000 isomers. The readily accessible
tetrahydro-Binor-S (2 or 3) isomerizes to 1 in about 70 % yield, and under mild
conditions one can detect several intermediates. Along with diamantane, a
disproportionation product 4 with a quaternary carbon atom is formed in minor
yield. When one starts from other isomers (7-10) that can be obtained from the
dimerization of norbornene followed by hydrogenation, diamantane becomes the
minor product and 4 is the major product (Fig. 18).5

Fig. 18. Formation of diamantane 1 by Lewis-acid catalysis from various
precursors, including the two isomers 2 and 3 of tetrahydro-Binor-S.

By means of a selective reaction graph which generates at each step the
most stable subsequent products (according to empirical force-field
calculations), and by excluding the more strained isomers with quaternary
carbons, it was possible to arrive at the partial graph indicated in Fig. 19,
which shows the conversion of 2 or 3 into the identified intermediate 6.

Fig. 19. [Caption truncated in the source; only the fragment "... into 6. Numbers under ..." survives.]

From there on, it was necessary to include systems with quaternary
carbon atoms. The subsequent steps also involved calculations in which
carbocations were taken into account; a critical parameter in this case was the
angle between the axis of the vacant orbital and the C-R bond, where R is the
migrating group: rearrangements should be most facile when the angle is 0°. Fig.
20 presents the subsequent intermediate structures with full lines, leading
finally to 1. A global mechanism including neutral and cationic species is shown
in Fig. 21. It should be mentioned that the reaction graph which helped in
solving this mechanistic problem had over 400 vertices corresponding to
possible neutral intermediates (i. e., excluding carbenium ions). This means that
by calculating at each step the relative energies of possible products, and by
selecting the product with the lowest energy, it was possible to reduce the
number of vertices in the reaction graph by two orders of magnitude.
Fig. 20. Reaction graph for the rearrangement of 6 to diamantane 1 (full
lines). Broken lines indicate less probable pathways.
Fig. 21. Reaction mechanism for the formation of diamantane 1 from
tetrahydro-Binor S (2 or 3).

Isomerizations of other isomeric starting materials (7-10 in Fig. 18)
were also calculated similarly; strain energies in these cases are higher, and
1,2-shifts eventually lead to the structures examined above.

2.4.3. Tetracyclotridecanes

When activation barriers are prohibitively high, such isomerizations to
the most stable systems with the same number of rings do not take place. Thus,
the presence of a methyl group prohibits the reaction from reaching the
corresponding C13H20 "stabilomer" devoid of such a group.
Schleyer's group systematically investigated homologous series starting
with isomers of adamantane C10H16, namely the tri- and tetracyclic C11H16,
C12H18, C13H20, and C14H22 saturated series. Here only the third of the above
four series will be briefly mentioned; the next Sections will discuss several
C11 systems.
On treating compound 3 at room temperature with AlBr3 or with its
"sludge" with tert-butyl bromide, the main product was the C13H20 stabilomer,
1,2-trimethylene-adamantane 18 (96%); a second, minor (4%) product, which
equilibrates with the preceding one, is 2,4-trimethylene-adamantane, 19, as
shown in Fig. 22. The mechanism of this isomerization was established by the
procedure outlined above.6

Fig. 22. Reaction graph for the rearrangement of 3 to 18 and 19.

2.4.4. Tricycloundecanes

The rearrangement of tricycloundecanes C11H18, particularly the readily
available isomers 30 and 31, to the stabilomer of this series,
1-methyladamantane, is described by a reaction graph in which 69 structures
and 251 interconversions are symbolized by points and lines, respectively. If no
alkyl groups were present, the number of all tricyclic isomers would be
434, and if methyl groups were allowed (but no other alkyl), this number
would rise to 2889. However, by excluding unlikely strained structures with
3- or 4-membered rings, the manageable number of 69 structures results.
Fig. 23 presents the still smaller main portion of the reaction graph33
ending at homoadamantane 18, which is known to rearrange under comparable
conditions to 1-methyladamantane. Average calculated heats of formation are
indicated along with each formula. The most likely pathway from 30 and 31
involves the following intermediates: 60 -> 49 -> 47 -> 63 -> 57 ->
40 -> 45 -> 18; this pathway is shown with thick lines. The thick circles
indicate intermediates that have been identified, and the thick broken circles
are for compounds confirmed not to be intermediates. Bond alignment (i. e.
dihedral angle) was again an important factor.

Fig. 23. Reaction graph for the rearrangement of tricycloundecanes 30 and
31 to homoadamantane 18.

2.4.5. Tetracycloundecanes

Among all 2486 possible tetracyclic systems which have the molecular
formula C11H16, 2,4-ethanonoradamantane and 2,8-ethanonoradamantane are the
most stable according to empirical force-field calculations; indeed, they appear
as the final products (in 97 : 3 ratio) of AlBr3-catalyzed isomerizations
starting from several available tetracycloundecanes. A reaction graph34
interconnecting 15 isomers was helpful for identifying several intermediates.

2.4.6. Spiro[adamantane-2,1'-cyclobutane]

Recently, it was shown by Farcasiu et al.35 that
spiro[adamantane-2,1'-cyclobutane] 5 rearranges in the presence of AlBr3 to a
mixture of 2,4-trimethyleneadamantane 6 and 1,2-trimethylene-adamantane 7 (which
were denoted by 19 and 18, respectively, in the previous subsection). By means
of the same procedure, namely stepwise evaluation of the strain energy for each
successive step, a simplified reaction graph for this rearrangement was
obtained (Fig. 24); it starts with the cation 12 derived from 5 by hydride
abstraction. The unusual feature of this graph is that the formation of 6 does
not proceed via 1,2-shifts, but via a direct 1,3-shift (i. e. the transannular
2,4-shift of the polymethylene bridge).
Fig. 24. Reaction graph for interconversions of the carbocation 12 derived
from 5. Small numbers in brackets indicate calculated heats of formation.
Numbers along arrows indicate dihedral angles for the bond alignment (the
lower these angles, the more easily the 1,2-shifts take place).

3. Automerization of bullvalene, other valence isomers of annulenes, and azabullvalene

Doering and Roth's36 brainchild bullvalene was synthesized by Schröder
and coworkers37 soon after the idea of bullvalene was published. This valence
isomer of [10]annulene has three 1,5-pentadienic substructures and thus each
structure can undergo three automerization interconversions by Cope
rearrangements (Fig. 25). Eventually, all ten CH groups become equivalent. These
rearrangements occur so rapidly that 1H-NMR spectra of bullvalene at 100 °C
show a single CH peak; at lower temperatures (-25 °C), one can see the four
CH peaks with intensity ratios 1 : 3 : 3 : 3.
The reaction graph for the bullvalene rearrangement ("Monster
graph") has 10!/3 = 1,209,600 vertices, or half of this number, depending on the
model.38,39 It has girth 12.
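The vertex count quoted for the bullvalene graph is simple to verify numerically. The short sketch below only reproduces the arithmetic of the formula 10!/3; reading the divisor as the order of the threefold rotation axis of bullvalene is an interpretation added here, and the variable names are illustrative.

```python
import math

labelings = math.factorial(10)   # ways to place 10 labeled CH groups on 10 sites
symmetry_order = 3               # divisor from the text's formula (assumed: C3 axis)

vertices = labelings // symmetry_order
half = vertices // 2             # the alternative count mentioned in the text

print(vertices, half)            # 1209600 604800
```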


Fig. 25. The three possible Cope rearrangements of bullvalene 1:
substructure 3,2,1,7,6,5 leads to 2; 3,2,1,8,9,10 leads to 3; and
5,6,7,8,9,10 leads to 4.

Fig. 26 presents a portion of the Monster graph showing one of the
smallest circuits with 12 vertices. Another related huge reaction graph will be
found in Section 7 on heptaphosphide trianion.

Fig. 26. A portion of the reaction graph for the automerization of
bullvalene showing one of the 12-membered circuits (girth).

Azabullvalene (Fig. 27) has a much simpler reaction graph for its
automerization owing to the higher stability of structures with sp2-hybridized
nitrogen; structures in which the nitrogen atom is not adjacent to a
double bond are therefore energetically forbidden.


Fig. 27. Azabullvalene with its notation, and one rearrangement.

The resulting graph has 28 vertices: 14 of degree 1, and 14 of degree 3.
In Fig. 27, nitrogen is always numbered 1 (therefore digit 1 always appears in
the 2nd or 3rd lines of the notation), and 0 stands for 10. It can be seen that the
smallest circuit has 14 vertices.17 The graph makes it easy to understand that
at lower temperatures the only observable isomerization is that presented in
Fig. 27 along with the corresponding notation; at more elevated temperatures
(70-200°), all 28 isomers rearrange rapidly on the NMR time scale, leading to
coalescence of the NMR peaks, as indicated in Fig. 28.

Fig. 28. Reaction graph for the automerization of azabullvalene.

Cyclooctatetraene (COT) can rapidly undergo a ring inversion and a bond
shift (the latter is an automerization) and also a valence isomerization to
bicyclo[4.2.0]octatriene and to a tetracyclic isomer, as shown in Fig. 29. The
scrambling of carbon atoms requires higher temperatures. As a result, each COT
molecule can become scrambled into four other COT molecules (Fig. 30);
therefore the reaction graph describing only this process17 is regular of degree
4. It has 8 vertices and its edges are labeled with the double bond that
maintains its integrity during the corresponding automerization (Fig. 31).


Fig. 29. Valence isomerization of COT into bi- and tetracyclic systems.

Fig. 30. Portion of the reaction graph for the automerization of COT

Fig. 31. Reaction graph for the automerization of cyclooctatetraene.

If bond-shift reactions are also considered, the resulting graph is more
complicated and is regular of degree 8.

4. Rotation in molecular propellers

Brief mention should be made of molecules whose steric barriers to
internal rotation are so high that individual rotamers may become stable enough
to be isolated, so that their interconversions may be observed directly.
Although no bonds are broken or formed, the interconversions of such
rotamers can be described by reaction graphs if we incorporate
"interconversions of distinct substances" into "reactions", by relaxing
correspondingly the definition of reactions. Because the central atom of an
ortho-substituted triphenylborane with three different substituents is planar,
there exist eight diastereomeric pairs of enantiomers represented in Fig. 32
with their interconversion graph, when each of these interconversions involves
rotation of one ring. If the helicity is reversed, another cube graph is obtained.
An analogous triarylmethane instead of the triarylborane gives rise to 16
diastereomeric pairs of enantiomers. Mislow and coworkers examined these
systems in detail, theoretically and experimentally.40 Rotation barriers around
20 kcal/mol were determined by dynamic NMR spectroscopy. For a substituted
di-α-naphthyl-phenyl-methane, stable rotamers could be isolated; barriers of
about 30 kcal/mol were determined by following the kinetics of the
interconversion.

Fig. 32. Reaction graph for the rotation of aryl rings in triarylboranes
with different substitution patterns in each phenyl ring.

Stereochemical correspondences were established between these
interconversions and those (to be discussed in section 5.3.1) of nonrigid
metallic trischelates: the three-ring flip corresponds to the Bailar twist, and
the two-ring flip to the Ray-Dutt twist.

5. Reaction graphs for rearrangements of metallic complexes

5.1. PENTACOORDINATE COMPLEXES

5.1.1. Trigonal-bipyramidal complexes.

A development of reaction graphs applied to inorganic and organometallic
chemistry, paralleling the previously described1 organic-chemical applications,
started soon afterwards with a paper by Lauterbur and Ramirez.41 These authors
discussed the interconversions of trigonal-bipyramidal (TBP) compounds with a
central pentacoordinated phosphorus atom. The five ligands are of two types:
two axial or apical ones which are collinear on two sides of the phosphorus
atom, and three equatorial ones lying in a plane orthogonal to the trigonal axis
and forming angles of 120° between them (Fig. 33, with notation similar to that
of Fig. 1).
Fig. 33. Portion of the reaction graph for the pseudorotation of a
phosphorane with five different substituents.

If all five ligands are different, then there are $C_5^2 = 5!/(2!\,3!) = 10$ pairs
of possible TBP enantiomers. The 20 stereoisomers are designated by the
numberings of the apical ligands; thus isomer 12 has ligands 1 and 2 as apical,
and $\overline{12}$ is its mirror image, as shown in Fig. 34.


Fig. 34. The 20 configurations of a phosphorane with five different
substituents 1 - 5 and their notation according to Muetterties or Gielen.

These 20 structures can be interconverted by Berry pseudorotation (Fig.
35),42 or by turnstile rotation.43-45 The resulting reaction graph in both cases is
the Desargues-Levi graph. If stereoisomerism is disregarded, there remain 10
constitutions, and the resulting graph is the Petersen graph.46 It was Mislow in
1969-1970 who pointed out the equivalence between the two independent
approaches of organic and inorganic reaction graphs.47-49
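The two graphs named above can be built and checked in a few lines. The sketch below rests on assumptions that are not spelled out in the text: each constitution is identified by its pair of apical ligands, and one pseudorotation step with an equatorial pivot turns the other two equatorial ligands into the new apical pair, so that adjacent vertices are disjoint pairs (the standard construction of the Petersen graph). The 20-vertex graph is then obtained as the bipartite double cover, a standard description of the Desargues graph; the +1/-1 labels are bookkeeping only and do not reproduce the bar convention of the Muetterties notation.

```python
from collections import deque
from itertools import combinations


def girth(adj):
    """Length of the shortest circuit, by breadth-first search from every vertex."""
    best = float("inf")
    for root in adj:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:          # non-tree edge closes a circuit
                    best = min(best, dist[u] + dist[v] + 1)
    return best


# Constitutions: the 10 possible apical pairs chosen from ligands 1-5.
pairs = list(combinations(range(1, 6), 2))

# Assumed step rule: the new apical pair is disjoint from the old one.
petersen = {p: [q for q in pairs if not set(p) & set(q)] for p in pairs}

# Bipartite double cover: every step flips an auxiliary +1/-1 label.
desargues = {
    (p, sign): [(q, -sign) for q in petersen[p]]
    for p in pairs
    for sign in (+1, -1)
}

print(len(petersen), girth(petersen))    # 10 vertices, girth 5
print(len(desargues), girth(desargues))  # 20 vertices, girth 6
```

Both graphs come out regular of degree 3, matching the three possible pivots available to each structure.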

Fig. 35. The mechanism for the Berry pseudorotation; it can be seen that
transition states between TBP configurations are tetragonal pyramids.

Several different representations for the Desargues-Levi and Petersen
graphs have been published in the context of TBP rearrangements.3,50 Before
presenting them, one must discuss notation for TBP complexes.
The notation for stereoisomeric TBP complexes advocated by Muetterties51-53
and adopted by most chemists indicates the numbers of the apical ligands
in increasing numerical order; if on looking from the apical position with the
smaller symbol to the three equatorial ligands taken in increasing numerical
order these appear in a clockwise sense of rotation, then the notation is final;
otherwise (for a counterclockwise sense of rotation), a bar is placed over the
two digits. Fig. 34 presents these 20 stereoisomers of a TBP complex with
ligands 1 - 5, and Fig. 36 shows several representations of the Desargues-Levi
graph for Berry or turnstile processes.
The Desargues-Levi graph is bipartite, i. e. its vertices can be divided into
two sets such that vertices in one set are adjacent (i. e. connected) only to
vertices in the other set. The necessary and sufficient condition for a graph to
be bipartite is to have no odd-membered circuits, and the Desargues-Levi graph
fulfills this condition. Representations of the Petersen graph are identical to
those of Fig. 5.
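The bipartite criterion can be tested mechanically by trying to color the vertices with two colors so that adjacent vertices always differ; the attempt fails exactly when an odd-membered circuit is present. The following minimal sketch uses plain cycles as hypothetical test graphs rather than the Desargues-Levi graph itself; the function names are illustrative.

```python
from collections import deque


def two_coloring(adj):
    """Return a vertex -> {0, 1} coloring, or None if the graph is not bipartite."""
    color = {}
    for start in adj:                      # handle every connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:  # odd circuit detected
                    return None
    return color


def cycle(n):
    """Adjacency lists of a circuit with n vertices (illustrative test graph)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


print(two_coloring(cycle(6)) is not None)  # True: even circuit, bipartite
print(two_coloring(cycle(5)) is not None)  # False: odd circuit, not bipartite
```

The two color classes play the role of the barred and unbarred vertex sets in a notation that reflects the bipartite structure.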
Modes of rearrangement for such complexes were enumerated by Musher54,55
using ideas previously discussed by Muetterties,51-53 as well as by Gielen,
Nasielski and Brocas.56-60 The Berry and turnstile rotation modes were denoted
by Musher as mode M1(3); the subscript 1 indicates the number of ligands
conserving their position (one pivot ligand makes these processes
monoligostatic in Cram's terminology);61,62 the number in round brackets
indicates how many isomers may be formed in one step from a given structure,
i. e. the degree of the regular reaction graph for the topological representation
of the rearrangement.

Fig. 36 presents the reaction graph for the M1(3) mode of rearrangement
with the Muetterties notation; the resulting Desargues-Levi graph is
illustrated in several isomorphic representations.

Fig. 36. Several representations of the Desargues-Levi graph for TBP
structures with the Muetterties notation.

It is apparent that there is no simple rule associated with the Muetterties
notation which reflects the bipartite nature of this graph. It was shown by
Balaban,63 however, that a minor change in the Muetterties notation could
achieve this goal: on reversing the assignment of the two enantiomers
whenever the difference between the two apical symbols is an even number, one
obtains a notation in which the vertices of the Desargues-Levi graph with a
barred notation are adjacent only to vertices whose notation has no bar, and
vice-versa. This actually means reversing the assignment (bar versus no bar) for
the three following pairs of vertices: 13 and $\overline{13}$; 24 and
$\overline{24}$; 35 and $\overline{35}$. Among all rearrangement modes, M1(3)
appears to be the most likely in the absence of steric constraints, and of its
two possible mechanisms, the Berry pseudorotation seems to involve a smaller
activation energy than the turnstile rotation.
Other rearrangement modes lead to other reaction graphs. Thus Musher's
M5(3) mode is represented by two disjoint Petersen graphs, one for each
enantiomeric series.
Musher's M2(6) process leads to a reaction graph which is bipartite,
regular of degree six, and which has girth four.63 In Fig. 37 it is presented using
Balaban's modified notation.
Fig. 37. Reaction graph for rearrangement mode M2(6).

For Musher's M4(6) mode one obtains two disjoint regular graphs of degree
six and girth three for the two enantiomeric series, as shown in Fig. 38 for one
of these. 63


Fig. 38. One of the two identical subgraphs representing the
rearrangement mode M4(6).

It is interesting to emphasize that the reaction graphs for modes M1(3)
and M2(6) are complementary, as are the reaction graphs for modes M5(3) and
M4(6):63 this means that adjacent vertices in one graph
become non-adjacent in its complement, and vice-versa. In other words, in the
adjacency matrices, entries 1 are replaced by 0, and vice-versa.
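The complement relation can be checked on one Petersen component. In the sketch below, the Petersen graph is built from disjoint pairs of the ligands 1 - 5 (a standard construction, assumed here rather than taken from the text), and its complement is formed by exchanging adjacency and non-adjacency between distinct vertices. The result is regular of degree six and contains triangles, in line with the degree and girth quoted above for one enantiomeric series of mode M4(6).

```python
from itertools import combinations

pairs = list(combinations(range(1, 6), 2))

# Petersen graph: two pairs are adjacent when they share no ligand (assumed model).
petersen = {p: {q for q in pairs if not set(p) & set(q)} for p in pairs}

# Complement: distinct vertices are adjacent exactly when they were not before.
complement = {
    p: {q for q in pairs if q != p and q not in petersen[p]} for p in pairs
}


def has_triangle(adj):
    """True if some three vertices are mutually adjacent (girth 3)."""
    return any(w in adj[u] for u in adj for v in adj[u] for w in adj[v] if w != u)


print(sorted({len(nb) for nb in petersen.values()}))    # [3]
print(sorted({len(nb) for nb in complement.values()}))  # [6]
print(has_triangle(petersen), has_triangle(complement)) # False True
```

For a regular graph of degree d on n vertices the complement always has degree n - 1 - d, here 10 - 1 - 3 = 6.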

5.1.2. Tetragonal-pyramidal complexes

A different stereochemistry of pentacoordinated complexes is present in
the tetragonal-pyramidal (TP) complexes, which are less frequently encountered
than trigonal-bipyramidal ones: when a Mo or W atom is coordinated to an
η5-cyclopentadienyl ring and to four unidentate ligands, the ring is at the apex of
a TP structure, as shown by X-ray crystallography of (C5H5)Mo(CO)3Et. If two
pairs of unidentate ligands are present, two cis-trans isomers result, which
were shown to be stable at room temperature, but rearrange on heating, as
shown by the NMR spectra of (C5H5)Mo(CO)2(PPh3)I.65
One should emphasize that transition states for rearrangements of TP
complexes have TBP structures, and vice-versa, the Berry pseudorotation
mechanism for rearrangements of TBP complexes leads to a TP transition state.
A notation for TP complexes (Fig. 39) can consist of three symbols:65 the
first one indicates the apex ligand followed by a comma; the subsequent two
symbols indicate the last two ligands of the square base in the sequence
obtained when one looks from the apex and starts with the lowest digit in the
anticlockwise sense of rotation. Thus, Fig. 39 indicates the notation for two
enantiomeric TPs. If enantiomerism is ignored, then the last digit can be
dropped, and the constitutional notation consists of only two symbols
separated by a comma. There are 30 isomers when all five ligands are different.
1,2345        1,2543        Full notation
1,45          1,43          Stereochemical notation
1,4           1,4           Constitutional notation

Fig. 39. Two enantiomeric TP complexes with three types of notation
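The count of 30 TP isomers, and of 15 when enantiomerism is ignored, can be reproduced by brute-force enumeration. The sketch below assumes that two arrangements are the same stereoisomer when they differ only by a rotation of the square base about the apical axis, and the same constitution when a reflection of the base is also allowed; the canonical forms used as keys are an implementation convenience, not the notation of Fig. 39.

```python
from itertools import permutations


def rotations(base):
    """The four cyclic rotations of the base ligands, read around the square."""
    return [base[i:] + base[:i] for i in range(4)]


stereoisomers, constitutions = set(), set()
for apex, *base in permutations(range(1, 6)):
    base = tuple(base)
    # Canonical form under rotations only (enantiomers stay distinct).
    chiral_form = min(rotations(base))
    # Canonical form under rotations and reflections (enantiomers merge).
    achiral_form = min(rotations(base) + rotations(base[::-1]))
    stereoisomers.add((apex, chiral_form))
    constitutions.add((apex, achiral_form))

print(len(stereoisomers), len(constitutions))  # 30 15
```

The same totals follow from 5!/4 = 30 and 5!/8 = 15, since the base admits four rotations and eight rotations plus reflections.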

Ruch and Hässelbarth,66 using double cosets, showed the various modes of
rearrangement for TP complexes. For the standard complex of Fig. 39 there are
seven possible modes. Mode 1 is trivial (identity permutation, no
rearrangement). Mode 2 is also trivial since it involves permuting two opposite
ligands at the base of the TP, i. e. racemization. The reaction graph is a forest
of 15 disconnected edges linking together pairwise the enantiomers. Mode 3
leads to five disconnected regular subgraphs of degree 4 with six vertices each,
since the apex is unchanged. One of these is shown in Fig. 40.
Fig. 40. Subgraph of the disconnected reaction graph for rearrangement
mode 3 of TP complexes.

Mode 4 also leads to a reaction graph which is regular of degree four and
girth five, but now it is a connected graph, which is represented in Fig. 41 with
five-fold symmetry, and in Fig. 42 with a Hamiltonian circuit with three-fold
symmetry. Randić et al.67 found a canonical numbering for this reaction graph
which leads to the smallest binary number when reading sequentially its
adjacency matrix. They also determined that the order of the symmetry group is
240.

Fig. 41. Reaction graph for rearrangement mode 4.65

Fig. 42. Reaction graph with Hamiltonian circuit and canonical numbering
for mode 4.65

If enantiomerism is neglected, a regular graph of degree four with 15
vertices and girth three is obtained for this reaction mode 4 (Figures 43 and
44).


Fig. 43. Reaction graph for rearrangement modes 4 and 5 of TP complexes
neglecting enantiomerism.65

Fig. 44. Reaction graph with Hamiltonian circuit for rearrangement modes
4 and 5 of TP complexes neglecting enantiomerism.

Rearrangement mode 5 leads to a 30-vertex graph of degree 4 and girth 3
shown in Fig. 45 in three isomorphic representations: (i) with tenfold
symmetry; (ii) with a Hamiltonian circuit and D6 sixfold symmetry; and (iii)
with a Hamiltonian circuit and lower (C6) sixfold symmetry, but with
diametrically opposite enantiomeric structures.

Fig. 45. Reaction graph for rearrangement mode 5 of TP complexes shown
in three isomorphic representations.

Rearrangement modes 6 and 7 lead to reaction graphs which are regular of
degree 8. For each of these, representations with Hamiltonian circuits or with
high symmetry were found (Figures 46 and 47).

Fig. 46. Reaction graph for rearrangement mode 6 of TP complexes.


Fig. 47. Reaction graph for rearrangement mode 7 of TP complexes.

In both cases, when enantiomerism is disregarded, one obtains one and the
same 15-vertex reaction graph which is regular of degree 8, as shown in Fig. 48.

Fig. 48. Two isomorphic representations of the reaction graph for
rearrangement modes 6 and 7 of TP complexes ignoring enantiomerism.

5.2. TETRACOORDINATE COMPLEXES

The racemization of a tetrahedral organic compound or of a tetrahedral
inorganic complex with four different ligands attached to the central metal
atom (disregarding any transition states or intermediates, such as carbenium
ions or carbanions in the former case) has a very simple topological
representation. Since there are $4!/|T| = 24/12 = 2$ stereoisomers
(enantiomers), the topological representation is just one edge with its two
endpoints symbolizing the two enantiomers.
If (as is plausible for non-bond-rupture mechanisms in inorganic
complexes) the transition state in such isomerizations is a square-planar
complex devoid of chirality, then there can be $4!/|D_4| = 24/8 = 3$ such
square-planar isomeric transition states, in which ligand 1 can have ligands 2,
3, or 4 in trans. The reaction graph for such a process involving also the
possible transition states is presented in Fig. 49.68

Fig. 49. Reaction graph for the rearrangement of tetrahedral complexes via
square planar transition states.
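Both counts above can be confirmed by enumerating ligand arrangements and grouping those related by symmetry. The sketch below assumes that the rotations of the tetrahedron act on its four sites as the 12 even permutations, and that the symmetry of the square acts on its four corners as the 8 rotations and reflections of a cycle; the site numbering and function names are illustrative.

```python
from itertools import permutations


def is_even(perm):
    """True if the permutation has an even number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def orbit_representatives(group):
    """One labeling of sites 0-3 by ligands 1-4 from each symmetry class."""
    seen, representatives = set(), []
    for labeling in permutations(range(1, 5)):
        if labeling in seen:
            continue
        representatives.append(labeling)
        for g in group:  # mark every symmetry-equivalent labeling as seen
            seen.add(tuple(labeling[g[i]] for i in range(4)))
    return representatives


# Assumed site symmetries: 12 proper rotations of the tetrahedron,
# 8 rotations and reflections of the square (corners numbered around the ring).
tetrahedral_rotations = [p for p in permutations(range(4)) if is_even(p)]
square_symmetries = (
    [tuple((i + k) % 4 for i in range(4)) for k in range(4)]
    + [tuple((k - i) % 4 for i in range(4)) for k in range(4)]
)

tetrahedral = orbit_representatives(tetrahedral_rotations)
square_planar = orbit_representatives(square_symmetries)

# Ligand trans to ligand 1 in each square-planar arrangement.
trans_to_1 = sorted(lab[(lab.index(1) + 2) % 4] for lab in square_planar)

print(len(tetrahedral), len(square_planar), trans_to_1)  # 2 3 [2, 3, 4]
```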

5.3. HEXACOORDINATE COMPLEXES

5.3.1. Octahedral complexes

The most common inorganic complexes of transition metals have
octahedral structure, with the central metal atom connected to six ligands;
their stereochemistry was extensively investigated beginning with Werner in the
last years of the preceding century.
When all six ligands are different, there exist 6!/4! = 30 isomers, i. e. 15
pairs of enantiomers, just as in the case of TP complexes with five different
ligands. Each of these 30 stereoisomers can be specified uniquely by the
following four notation rules resulting in a 2-digit symbol:
1) Number the ligands (substituents) with digits 1 through 6.
2) Select from the eight triangular faces of the octahedron the face giving
the lowest number when reading the ligands clockwise.
3) The other three ligands are indicated afterwards, turning in the same
direction, and beginning with the ligand which completes a triangle with the
first two cited ligands.

4) Indicate the last two digits in each sequence of six digits; for each
pair of enantiomeric structures, compare their numbers, then take the lower
number for one enantiomer and the same number with a bar above it for the
other enantiomer.
The resulting notation may be restated as follows: the first digit
indicates the ligand trans (opposite) to ligand 1, whereas the second digit
indicates the ligand trans to substituent 2, unless this substituent 2 is trans
to ligand 1, in which case the second digit indicates the ligand trans to
substituent 3. Thus, digit 1 never appears in the notation.
These rules, due to Muetterties,51-53 Gielen and their coworkers,6,56-60
lead to the following 15 two-digit symbols: 24, 25, 26; 34, 35, 36;
43, 45, 46; 53, 54, 56; 63, 64, 65. When bars are added, one obtains the 30
stereoisomers depicted in Fig. 50. The distinction between the two hydride
ligands denoted by H5 and H6 (in addition to the four ligands denoted by L1
through L4) leads to dividing the 30 isomers into 6 trans and 24 cis
stereoisomers. Enantiomers are depicted one above the other in Fig. 50.

L. L. L,
~.
~ •. _L. ~~ •• L) ~ ../ l 2
:?t<~
H.... H,_ •• Ht ..•
H." I ~LJ I •
H"" ..... L
o H," , . . . . t,
L, L, L, L,
E 16 n 36

L, L, L, 1,
H,._ /L,! H.;;~<L. H._._~ ..... L~ H,. __ ~ .... ",l.
H,
"";M~L
I ' H, r L, Hs" ,--L, H,....... ' ...... L,
L, L, L, L,
2S 26 36
!. . .
35 L. L, L,
H. __ . ' , ... L, H, •.. Lt H•.. _~ ... _ L,
J___ .,
L, L, L, .... M_
·L ___ I
~,
L. H. l/"",-H, L/,'" , ........ H,
H.... L, ~_/L2
H,',--L,
HI ... H,._.~_ ...... L, H· .. L.
H,'" ,---L, H•.,.....' ...... L, H," , ....... L. L,
2' 34
L.
'3
L. L. L. L,
4S 6)

.,
46 SJ L, L, L,
H···_L· . . HI._.~ __ .. LJ _~_ ... _LI
H, ...
L,
~", ...... L2
L,
H._._~ __ . L, H.... !___
L,
L. H. ___ ~ ../LI
L." .,,-H,
·L,
L."" . . . H.
Ht ••
L,""'" , . . . . . H,
H, ..... i -L,
I
H,;,-L1 H," , ....... l\ H.',-l, 2i
L,
li
L.
41
L. L. L, L,
(b) Trails
45 46 6J 53
t,
l·.
L, L.
J. . .
L.
H.· __ ·L\ H••.• '. __ L, _~---LI
HI •• H._,. L,
1--" H/'... , . . . . . L, ","'-L
H ,M ....... L
H."" • I ' I I
L. L. L. L;

.,
6i 54 65 56

L, L. L.
H,_. I .. L
';M:""' L
J H. __ I __ L, H•• _.' ..... Ll H,_ f .. l.

I ~M'L
I '
.... M_ L
I ;';M:'::" L

.. ". I '
HI I H, Hs I

L. L. L. L.
54 is 56
(.) 01

Fig. 50. Structures and notation of hexacoordinated complexes with six
different ligands 1 - 6. If two of these are of another type than the
remaining four, cis - trans isomers result.
170 A. T. BALABAN

Two isomerization processes lead to the same result: the Bailar trigonal
twist 69 rotates one triangular face of the octahedron relative to the opposite
one; the Ray-Dutt rhombic twist 70 involves rotation of four ligands on two faces
of the octahedron sharing one edge. Such isomerizations convert 56 into 25, 26,
34, 35, 43, 45, 63, and 64 (the resulting reaction graph is regular of degree 8
and has girth 4). This graph is represented with Hamiltonian circuits in Fig. 51 with
three-fold symmetry; one can find isomorphic representations such that
enantiomers appear in antipodal (diametrically opposite) positions.
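
The vertex count and degree of this graph can be reproduced by brute force. The sketch below rests on two assumptions that are not spelled out in the text: the octahedral positions are numbered 0 to 5 with trans pairs (0,1), (2,3), (4,5), and a completed trigonal twist is encoded as a net 120-degree rotation of one triangular face, i.e. a cyclic shift of the three ligands on that face. All names are illustrative.

```python
from itertools import permutations

# g[p] is the position to which a rotation g carries position p.
C4_Z = (2, 3, 1, 0, 4, 5)  # quarter turn about the 4-5 axis: 0->2->1->3->0
C4_X = (0, 1, 4, 5, 3, 2)  # quarter turn about the 0-1 axis: 2->4->3->5->2


def compose(g, h):
    """Permutation obtained by applying h first, then g."""
    return tuple(g[h[p]] for p in range(6))


def generate_group(generators):
    """Close a set of generators under composition."""
    identity = tuple(range(6))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            new = compose(s, g)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group


ROTATIONS = generate_group([C4_Z, C4_X])  # proper rotations of the octahedron


def move(arrangement, g):
    """Carry the ligand at each position p to position g[p]."""
    out = [None] * 6
    for p, ligand in enumerate(arrangement):
        out[g[p]] = ligand
    return tuple(out)


def canonical(arrangement):
    """Representative of a stereoisomer: its smallest rotated copy."""
    return min(move(arrangement, g) for g in ROTATIONS)


# The eight triangular faces: one position from each trans pair.
FACES = [(x, y, z) for x in (0, 1) for y in (2, 3) for z in (4, 5)]


def twist_neighbors(arrangement):
    """Isomers reached by cycling the three ligands of one face."""
    result = set()
    for a, b, c in FACES:
        for cyc in ((a, b, c), (a, c, b)):
            g = list(range(6))
            g[cyc[0]], g[cyc[1]], g[cyc[2]] = cyc[1], cyc[2], cyc[0]
            result.add(canonical(move(arrangement, tuple(g))))
    return result


isomers = sorted({canonical(a) for a in permutations(range(1, 7))})
graph = {v: twist_neighbors(v) for v in isomers}
degrees = {len(nbrs) for nbrs in graph.values()}
print(len(ROTATIONS), len(graph), degrees)
```

Twisting a face in one sense is equivalent, up to an overall rotation, to twisting the opposite face in the other sense, which is why the sixteen face operations collapse to eight neighbors per vertex.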


Fig. 51. Reaction graph with Muetterties notation for the trigonal (Bailar)
or rhombic (Ray-Dutt) twists.

On reversing the assignments and the notation for 24, 35 and 53 relative
to the Muetterties-Gielen conventions, as was advocated by Balaban,71 one can
obtain for this rearrangement a reaction graph (Fig. 52) which has six-fold
symmetry, a Hamiltonian circuit with diametrically opposed enantiomers, and
two sets of vertices for the bipartite graph which are distinguished by the
presence or absence of a bar.

Fig. 52. Representation of the Bailar or Ray-Dutt twists using Balaban's
notation.

By ignoring enantiomerism, one obtains a regular graph of degree 8 with
15 vertices and girth 3, depicted in Fig. 53.

Fig. 53. Reaction graph for the trigonal or rhombic twists ignoring
enantiomerism (the Muetterties and Balaban notations become equivalent).

A less plausible isomerization pathway is the digonal twist (a
tetraligostatic process). Two cis ligands are interchanged, leading to loss of
chirality. The resulting reaction graph (disregarding stereoisomerism) has 15
vertices, is regular of degree 6, and has girth 3. It is represented in Fig. 54.
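
When enantiomers are not distinguished, each isomer is fixed by its three trans pairs, and both 15-vertex graphs can be built directly. This is a minimal sketch under that assumption, with illustrative names: a digonal twist exchanges partners between two trans pairs, while the eight isomers that share no trans pair with a given one are those reached by the trigonal or rhombic twist (compare the list given for 56).

```python
from collections import deque


def perfect_matchings(items):
    """Yield every way to split `items` into unordered trans pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        for tail in perfect_matchings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


vertices = [frozenset(frozenset(p) for p in m)
            for m in perfect_matchings([1, 2, 3, 4, 5, 6])]


def digonal_neighbors(v):
    """Swap two cis ligands: two trans pairs exchange partners, one is kept."""
    result = set()
    pairs = list(v)
    for i in range(3):
        for j in range(i + 1, 3):
            (a, b), (c, d) = sorted(pairs[i]), sorted(pairs[j])
            rest = v - {pairs[i], pairs[j]}
            result.add(rest | {frozenset((a, c)), frozenset((b, d))})
            result.add(rest | {frozenset((a, d)), frozenset((b, c))})
    return result


def girth(graph):
    """Length of the shortest cycle, by breadth-first search from every vertex."""
    best = float("inf")
    for root in graph:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in graph[u]:
                if w not in dist:
                    dist[w], parent[w] = dist[u] + 1, u
                    queue.append(w)
                elif parent[u] != w:
                    best = min(best, dist[u] + dist[w] + 1)
    return best


digonal = {v: digonal_neighbors(v) for v in vertices}
# Isomers sharing no trans pair with v (the trigonal or rhombic twist products).
trigonal = {v: {w for w in vertices if not (v & w)} for v in vertices}

for name, graph in (("digonal", digonal), ("trigonal", trigonal)):
    print(name, len(graph), {len(n) for n in graph.values()}, girth(graph))
```

In this picture the two graphs are complementary: every pair of distinct isomers is linked by exactly one of the two processes.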

Fig. 54. Reaction graph for the digonal twist.71

An alternative representation with five-fold symmetry, due to Jones and
Lloyd,72 is shown in Fig. 55. The symmetry group of the graph is S6 and the
order of this group is 720. Randic and Davis found a canonical numbering
(displayed in Fig. 56) resulting in the smallest binary number when the
adjacency matrix of this graph is read sequentially.73

Fig. 55. An isomorphic representation of the graph from Fig. 54 with
canonical numbering and five-fold symmetry.

Fig. 56. Canonical numbering for the graph from Fig. 54.

If enantiomerism is not ignored, a 30-vertex reaction graph is obtained:
it is regular of degree 12, has girth 3, and is displayed in Fig. 57 with six-fold
symmetry and a Hamiltonian circuit.

Fig. 57. Reaction graph for digonal twists with Balaban's notation.

5.3.2. Axially distorted octahedra

Because of the Jahn-Teller effect (related to Peierls distortion), d^1, d^9,
high-spin d^4, and low-spin d^7 metal complexes of Cu(II), Fe(III), V(IV), Cr(II),
Co(II) and Mn(III) adopt a structure with two axially (diametrically) situated
ligands at a greater distance from the central metal atom than the remaining four
equatorial ligands, as shown in Fig. 58. The notation for each isomer lists the
two axial ligands starting with the ligand having the lower number; a third
digit is then added, corresponding to the ligand which is trans to the equatorial
ligand of the smallest numerical value. In Fig. 58, the central isomer has
notation 164.

Fig. 58. Notation for axially distorted octahedra and their
interconversions.

By permuting pairwise one axial ligand and an equatorial ligand (digonal
twist), each structure can be converted into eight other structures. The
isomerization of such axially distorted octahedra gives rise to a reaction graph
of degree 8 with 45 vertices, representing the 45 possible isomers when the six
ligands are different. Fig. 59 presents the reaction graph with a Hamiltonian
circuit and the above notation; this graph is vertex transitive and can also be
shown in a more symmetrical form with a fivefold rotation axis.74
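
The counts of 45 vertices and degree 8 follow from a direct enumeration. The sketch below assumes that an isomer is characterized by its axial pair together with the two trans pairs in the equatorial plane (so that enantiomers are not distinguished, consistent with the three-digit notation); the helper names are illustrative.

```python
from itertools import combinations

LIGANDS = (1, 2, 3, 4, 5, 6)


def isomers():
    """Yield (axial pair, equatorial trans pairs) for every isomer."""
    for axial in combinations(LIGANDS, 2):
        eq = [x for x in LIGANDS if x not in axial]
        low = eq[0]
        for partner in eq[1:]:
            others = [x for x in eq[1:] if x != partner]
            yield (frozenset(axial),
                   frozenset({frozenset((low, partner)), frozenset(others)}))


def label(iso):
    """Axial ligands in increasing order, then the ligand trans to the
    lowest-numbered equatorial ligand."""
    axial, eq_pairs = iso
    low = min(x for pair in eq_pairs for x in pair)
    (partner,) = next(pair for pair in eq_pairs if low in pair) - {low}
    return "".join(str(x) for x in sorted(axial)) + str(partner)


def neighbors(iso):
    """Exchange one axial ligand with one equatorial ligand."""
    axial, eq_pairs = iso
    result = set()
    for ax in axial:
        for pair in eq_pairs:
            for eq in pair:
                new_axial = (axial - {ax}) | {eq}
                new_pair = (pair - {eq}) | {ax}
                result.add((new_axial, (eq_pairs - {pair}) | {new_pair}))
    return result


graph = {v: neighbors(v) for v in isomers()}
labels = sorted(label(v) for v in graph)
degrees = {len(nbrs) for nbrs in graph.values()}
print(len(graph), degrees, labels[:5])
```

The eight neighbors of a vertex are necessarily distinct, because each exchange produces a different axial pair.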


Fig. 59. Reaction graph for isomerization of axially distorted octahedra.

5.3.3. Trigonal prismatic complexes

Trigonal prismatic complexes with 6 different ligands give rise to
6!/(|D3|) = 720/6 = 120 stereoisomers. The resulting reaction graph is too
complicated to be presented here. Even when enantiomerism is ignored and
the reaction graph is left with only 60 vertices, the picture is fairly
intricate.75
The transition states for the Bailar or Ray-Dutt twists have trigonal
prismatic structure, but they are not usually taken into consideration in the
reaction graphs.

5.4. OCTACOORDINATE COMPLEXES

Among the octacoordinate complexes, those with cubic and square
antiprismatic structures are interesting; we shall discuss only the latter
ones.

5.4.1. Square antiprisms

There are 257 polyhedra with eight vertices; 14 of these are deltahedra,
having only triangular faces. The square antiprism coordinate complexes with
two square and eight triangular faces can give rise to 14 modes of
rearrangement, consisting of seven pairs of enantiomeric modes. Three reaction
graphs for four of them are shown in Figures 60 and 61.76
Fig. 60. Reaction graphs for modes 9 and 10: they are regular of degree 8,
and have 12 and 24 vertices, respectively.

Fig. 61. Reaction graph for modes 3 and 7. It is regular of degree 4 and has
24 vertices.

5.5. OTHER INORGANIC COMPLEXES

The pentagonal bipyramid with 7 different ligands gives rise to 7!/(|D5|) =
5040/10 = 504 stereoisomers. The hexagonal bipyramid, the square antiprism,
and the bisdisphenoid with eight different ligands give rise, respectively, to
40320/12 = 3360, 40320/8 = 5040, and 40320/4 = 10080 stereoisomers
grouped in pairs of enantiomers. Their interconversion graphs are too
complicated for a detailed analysis, but King77,78 devised a method for studying
such isomerizations using Gale diagrams. If we denote by P a d-dimensional
polytope (which corresponds to a polyhedron when going from a 3D to a
multidimensional space) or its graph with p vertices, a Gale transformation
leads to a Gale diagram of P consisting of p points in (p - d - 1)-dimensional
space in one-to-one correspondence with the vertices of P. From Gale diagrams
it is possible to determine all the combinatorial properties of P. If p is not
much larger than d, namely if p ≤ 2d, then the dimension of the Gale diagram is
smaller than that of the original polytope P. For more details, the original
papers of King should be consulted. So far, organic-chemical applications of
Gale diagrams have not been published.
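
The isomer counts quoted in this chapter all follow the same pattern: the number of ligand permutations divided by the order of the proper rotation group of the polyhedron. The following sketch tabulates them; the group orders are the ones implied by the divisors in the text, and the function names are illustrative. It also checks the dimension statement for Gale diagrams.

```python
from math import factorial

# (number of ligands, order of the proper rotation group), as implied by the text.
POLYHEDRA = {
    "octahedron": (6, 24),
    "trigonal prism": (6, 6),
    "pentagonal bipyramid": (7, 10),
    "hexagonal bipyramid": (8, 12),
    "square antiprism": (8, 8),
    "bisdisphenoid": (8, 4),
}


def stereoisomer_count(n_ligands, rotation_order):
    """Stereoisomers for n different ligands, enantiomers counted separately."""
    return factorial(n_ligands) // rotation_order


def gale_dimension(p, d):
    """Dimension of the Gale diagram of a d-dimensional polytope with p vertices."""
    return p - d - 1


counts = {name: stereoisomer_count(n, order) for name, (n, order) in POLYHEDRA.items()}
for name, count in counts.items():
    print(f"{name:22s} {count}")

# For polyhedra (d = 3) the Gale diagram is lower-dimensional exactly when p <= 2d.
print([p for p in range(4, 12) if gale_dimension(p, 3) < 3])
```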

6. Xenon hexafluoride

A different distortion from that described in Section 5.3.2 exists in the
geometry of XeF6: the presence of a lone electron pair on the central atom
enlarges one of the eight triangular faces of the octahedron, giving rise to a
geometry with C3v point-group symmetry. The electron pair jumps from one
face to an adjacent one. There are 52 modes of rearrangement which can be
grouped into 26 pairs of enantiomeric modes; some of these give rise to
reaction graphs with 120 or 240 points which are too large to be represented
conveniently. A smaller one will be discussed in the following.79 Self-inverse
(SI) graphs are non-directed graphs (like all graphs discussed so far), but
non-self-inverse (NSI) graphs are directed graphs (digraphs). An example of the
latter type is given in Fig. 62. For the notation, the three atoms
on the large octahedron triangle are listed clockwise starting with the smallest
digit; one then adds a period followed by the labels of the small opposite face
in clockwise order, starting from the ligand of the small face which is
between the first two ligands on the large face. In Fig. 62, a shorter notation
with letters 79 is employed.

Fig. 62. Reaction digraph for mode U12; it has 40 vertices.

7. Heptaphosphide trianion

The P7^3- trianion has a structure with trigonal symmetry; both the
structure and the automerization into three other structures indicate strong
similarity to bullvalene. The reaction graph for this rearrangement consists
of 7!/3 = 1680 vertices. This cubic graph is far too large to be conveniently
presented. It was determined by Randic et al.38 that it has girth 10 and that
the order of its symmetry group is 2 x 7!.

8. Kinetic graphs, synthon graphs, and graph transforms

The three title graphs are similar in principle to reaction graphs in that
all these types of graphs have vertices symbolizing chemical species
undergoing reactions. However, vertices of the same graph do not represent
isomeric species as in the case of reaction graphs; in kinetic graphs, vertices
symbolize intermediates, and edges symbolize their interconversions, while in
synthon graphs and graph transforms, vertices symbolize successive steps in
building up a molecule. Another feature differentiating such graphs from
reaction graphs is that usually the latter graphs are simple, non-directed
graphs, whereas kinetic and synthon graphs are directed graphs (digraphs), i.e.
their edges have a direction (they are, therefore, properly called arcs).
With the aid of kinetic graphs one can derive the kinetic equations,
analyze, and solve them. Further information on kinetic graphs may be obtained
from reviews by Bonchev and Temkin,80-83 and by Yatsimirskii.84
A few more words on synthon graphs are necessary. This term was
introduced by Hendrickson,85-87 but Corey and Wipke 88-90 were the first to
devise programs for computer-assisted organic synthesis. This field has
expanded rapidly, and a rich literature exists.91-98 Synthon graphs indicate how
the synthons are assembled, while a related type of graph (optimal planning
graph) shows the order in which synthons are introduced in order to build up the
target molecule. It is well known that for the same numbers of steps and yields
per step, a convergent synthesis (branched graph) gives a higher overall yield
than a sequential synthesis (linear non-branched graph). Fig. 63 presents two
examples for obtaining a steroid from four synthons; the order in which the
four synthons 1 - 4 are assembled is shown in the bottom row; the number of
carbon atoms in each synthon is written to the left of the synthon notation
(1 - 4) on the bottom row where the target molecule is denoted by a white
point. Assuming yields of 90 % for each step, the "synthetic tree" on the left has
a global yield of 0.9^4 = 66 % while the one on the right has an overall yield of
0.9^6 = 53 %.

[Fig. 63 drawings omitted: rows labeled target molecule, synthon graphs, and optimal planning graphs.]

Fig. 63. Synthon graphs (second row) for the same steroid target molecule
dissected differently.
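
The two overall yields quoted for the synthetic trees of Fig. 63 can be reproduced in a few lines. This is a minimal sketch, assuming a uniform yield per step; the exponents 4 and 6 are taken from the text, and the function name is illustrative.

```python
def overall_yield(step_yield, n_steps):
    """Overall yield when every one of n_steps proceeds with the same yield."""
    return step_yield ** n_steps


for n_steps in (4, 6):
    print(n_steps, f"{overall_yield(0.9, n_steps):.0%}")
```

The drop from 66 % to 53 % comes entirely from the two additional factors of 0.9.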

Finally, M. Johnson's graph transforms must be mentioned in the context of
modeling chemical reaction pathways. Digraphs whose vertices are themselves
graphs are called metadigraphs; the arcs of such metadigraphs are the
graph-theoretic counterparts of chemical reactions.99 By means of this approach,
Johnson was able to model biochemical (metabolic) reactions for predicting
drug metabolites.100

9. Conclusions

This is the first review dedicated exclusively to reaction graphs. Nearly
all reaction graphs are non-planar, i.e. they cannot be drawn on a plane without
crossing lines.
This brief description of reaction graphs has shown some of their
applications in organic and inorganic chemistry. The most significant
applications in organic chemistry were in the area of cationic rearrangements
leading to diamond hydrocarbons and their related compounds, elucidating the
mechanisms and indicating likely intermediates. In inorganic chemistry, the
most significant applications have been in the area of rearrangements of
complexes with various geometries, where again reaction graphs contributed to
highlighting paths for rearrangements.
For earlier brief reviews on chemical applications of graph theory which
included data on reaction graphs, references 101-103 may be consulted.

Acknowledgements. The support of Drs. D. J. Klein, W. A. Seitz and T. G.
Schmalz from the Texas A & M University, and fruitful discussions with Dr. D.
Bonchev are gratefully acknowledged. For permission to reproduce figures,
thanks are addressed to the American Chemical Society (Fig. 15, 18-24, 32), to
McGraw-Hill, Inc. (Fig. 34, 35), and to John Wiley and Sons, Inc. (Fig. 8, 50, 55,
56).

References
1. A. T. Balaban, D. Farcasiu, and R. Banica, Rev. Roum. Chim., 1966, 11, 1205.
2. C. J. Collins, C. K. Johnson, and V. F. Raaen, J. Amer. Chem. Soc., 1974, 96,
2524.
3. M. Gielen, in Chemical Applications of Graph Theory (A. T. Balaban, ed.),
Academic Press, London, 1976, p. 261.
4. J. Brocas, M. Gielen, and R. Willem, The Permutational Approach to Dynamic
Stereochemistry, McGraw-Hill, New York, 1983.
5. T. M. Gund, P. v. R. Schleyer, P. H. Gund, and W. T. Wipke, J. Amer. Chem. Soc.,
1975, 97, 743.
6. E. Osawa, Y. Tahara, A. Togashi, T. Iizuka, N. Tanaka, T. Kan, D. Farcasiu,
G. J. Kent, E. M. Engler, and P. v. R. Schleyer, J. Org. Chem., 1982, 47, 1923.
7. J. Petersen, Acta Math., 1891, 15, 193.
8. F. Harary, Graph Theory, Addison-Wesley, Reading, MA, 1969.
9. P. K. Wong, J. Graph Theory, 1982, 6, 1.
10. W. T. Tutte, Connectivity in Graphs, Univ. of Toronto Press, 1966, p. 74.
11. N. L. Biggs, Algebraic Graph Theory, Cambridge University Press, London,
1974.
12. A. T. Balaban and D. Farcasiu, J. Amer. Chem. Soc., 1967, 89, 1958.
13. R. M. Roberts and S. G. Brandenberger, J. Amer. Chem. Soc., 1957, 79, 5484;
R. M. Roberts and A. A. Khalaf, Friedel-Crafts Alkylation Chemistry,
A Century of Discovery, Marcel Dekker, New York, 1984, chapter 8, p. 701.
14. A. T. Balaban, M. D. Gheorghiu, A. Schketanz, and A. Necula, J. Amer. Chem.
Soc. 1989, 111, 734 ; A. T. Balaban, Pure Appl. Chem., 1993, 65, 1.
15. A. Barabas and A. T. Balaban, Rev. Roum. Chim., 1974, 19, 1927.
16. A. T. Balaban, Rev. Roum. Chim., 1977, 22, 243.
17. A. T. Balaban, M. Banciu, and V. Ciorba, Annulenes, Benzo-, Hetero-, Homo-
Derivatives and Their Valence Isomers, CRC Press, Boca Raton, Florida,
1986, volume 3, chapter 10, p. 179.
18. M. Randic, Int. J. Quantum Chem.: Quantum Chemistry Symp., 1980, 14, 557.
19. W. Stohrer and R. Hoffmann, J. Amer. Chem. Soc., 1966, 88, 374.
20. H. Hogeveen and P. W. Kwant, Acc. Chem. Res., 1975, 8, 413; S. Masamune,
M. Sakai and M. Ona, J. Amer. Chem. Soc., 1972, 94, 8955; H. Kollmar,
H. O. Smith and P. v. R. Schleyer, J. Amer. Chem. Soc., 1973, 95, 5834;
M. J. S. Dewar and R. C. Haddon, J. Amer. Chem. Soc., 1973, 95, 5836.
21. R. E. Leone and P. v. R. Schleyer, Angew. Chem. Internat. Ed. Engl., 1970, 9, 860.
22. U. Doring, Math. Chem. (MATCH), 1975, 1, 151.
22. W. v. E. Doering and W. R. Roth, Tetrahedron, 1963, 19, 715.
23. C. K. Johnson and C. J. Collins, J. Amer. Chem. Soc., 1974, 96, 2514.
24. J. C. Barborak and P. v. R. Schleyer, J. Amer. Chem. Soc., 1970, 92, 3184.
25. P. Ahlberg, C. Engdahl and G. Honsall, J. Amer. Chem. Soc., 1981, 103, 1583.
26. M. A. McKervey, Chem. Soc. Revs., 1974, 3, 479; R. C. Fort, Jr., Adamantane,
Chemistry of Diamond Molecules, Marcel Dekker, New York, 1976.
27. C. Ganter, in Carbocyclic Cage Compounds (E. Osawa and O. Yonemitsu, eds.),
VCH Publishers, New York, 1992, p. 293.
28. A. T. Balaban and P. v. R. Schleyer, Tetrahedron, 1978, 34, 3599.
29. P. v. R. Schleyer, in Cage Hydrocarbons (G. A. Olah, ed.), Wiley, New
York, 1990, chapter 1, p. 1.
30. P. v. R. Schleyer, J. Amer. Chem. Soc., 1957, 79, 3292.
31. H. W. Whitlock, Jr., and M. W. Siefken, J. Amer. Chem. Soc., 1968, 90, 4929.
32. E. M. Engler, M. Farcasiu, A. Sevin, J. M. Cense, and P. v. R. Schleyer, J. Amer.
Chem. Soc., 1973, 95, 5769.
33. E. Osawa, K. Aigami, N. Takaishi, Y. Inamoto, Y. Fujikura, P. v. R. Schleyer,
E. M. Engler, and M. Farcasiu, J. Amer. Chem. Soc., 1977, 99, 5361.
34. S. A. Godleski, P. v. R. Schleyer, E. Osawa, Y. Inamoto, and Y. Fujikura, J. Org.
Chem., 1976, 41, 2596.
35. D. Farcasiu, E. Seppo, M. Kizirian, D. B. Ledlie and A. Sevin, J. Amer. Chem.
Soc., 1989, 111, 8466.
36. W. v. E. Doering and W. R. Roth, Tetrahedron, 1963, 19, 715.
37. J. F. M. Oth, K. Mullen, J. M. Gilles, and G. Schroder, Helv. Chim. Acta,
1974, 57, 1415.
38. M. Randic, D. O. Oakland, and J. D. Klein, J. Comput. Chem., 1986, 7, 35.
39. M. H. Klin, S. S. Tratch, and N. S. Zefirov, J. Math. Chem., 1991, 7, 135.
40. K. Mislow, Acc. Chem. Res., 1976, 9, 26.
41. P. C. Lauterbur and F. Ramirez, J. Amer. Chem. Soc., 1968, 90, 6722.
42. R. S. Berry, J. Chem. Phys., 1960, 32, 933.
43. I. Ugi, G. Gokel, P. Gillespie, H. Klusacek, and D. Marquarding, Angew. Chem.,
Internat. Ed. Engl., 1970, 9, 703.
44. I. Ugi, D. Marquarding, H. Klusacek, P. Gillespie, and F. Ramirez, Acc. Chem.
Res., 1971, 4, 288.
45. F. Ramirez and I. Ugi, Adv. Phys. Org. Chem., 1971, 9, 25.
46. J. D. Dunitz and V. Prelog, Angew. Chem. Internat. Ed. Engl., 1968, 7, 725.
47. K. Mislow, Acc. Chem. Res., 1970, 3, 321.
48. G. Zon and K. Mislow, Top. Curr. Chem., 1971, 19, 61.
49. K. E. DeBruin and K. Mislow, J. Amer. Chem. Soc., 1969, 91, 7393.
50. R. Luckenbach, Dynamic Stereochemistry of Phosphorus and Related
Elements, Georg Thieme Verlag, Stuttgart, 1973, p. 214.
51. E. L. Muetterties, J. Amer. Chem. Soc., 1968, 90, 5097; 1969, 91, 1115;
1969, 91, 1636.
52. E. L. Muetterties and C. M. Wright, Quart. Rev., 1967, 21, 179.
53. E. L. Muetterties, Acc. Chem. Res., 1970, 3, 266.
54. J. I. Musher, J. Chem. Educ., 1974, 51, 94.
55. J. I. Musher, J. Amer. Chem. Soc., 1972, 94, 5662.
56. M. Gielen and J. Nasielski, Bull. Soc. Chim. Belges, 1969, 78, 339.
57. J. Nasielski, Pure Appl. Chem., 1972, 30, 449.
58. J. Brocas, Top. Curr. Chem., 1972, 32, 43.
59. M. Gielen, Acc. Chem. Res., 1973, 6, 198.
60. M. Gielen, Bull. Soc. Chim. Belges, 1969, 78, 351.
61. D. C. Garwood and D. J. Cram, J. Amer. Chem. Soc., 1970, 92, 1575.
62. D. J. Cram and J. M. Cram, Fortschr. Chem. Forsch., 1972, 31, 1.
63. A. T. Balaban, Rev. Roum. Chim., 1973, 18, 855.
64. L. M. Jackman and F. A. Cotton, in Dynamic Nuclear Magnetic Resonance
Spectroscopy, Academic Press, New York, 1975, p. 494.
65. A. T. Balaban, Rev. Roum. Chim., 1978, 23, 733.
66. E. Ruch and W. Hasselbarth, Theor. Chim. Acta, 1973, 29, 259.
67. M. Randic, M. Katovic, and N. Trinajstic, in Proceedings of an International
Symposium, Paris, 1-7 July, 1982, Studies in Physical and Theoretical
Chemistry, Vol. 23 (J. Maruani and J. Serre, eds.), Elsevier Scientific
Publishing Company, 1983, p. 399.
68. R. B. King, J. Math. Chem., 1991, 7, 51.
69. J. C. Bailar, Jr., J. Inorg. Nucl. Chem., 1958, 8, 165.
70. P. Ray and N. K. Dutt, J. Indian Chem. Soc., 1943, 20, 81.
71. A. T. Balaban, Rev. Roum. Chim., 1973, 18, 841.
72. G. A. Jones and E. K. Lloyd, in Chemical Applications of Topology and Graph
Theory (R. B. King, ed.), Studies in Physical and Theoretical Chemistry, vol.
28, Elsevier, Amsterdam, 1983.
73. M. Randic and M. I. Davis, Int. J. Quantum Chem., 1984, 26, 69.
74. M. Randic, J. D. Klein, V. Katovic, D. O. Oakland, W. A. Seitz, and
A. T. Balaban, in Graph Theory and Topology in Chemistry (R. B. King and
D. H. Rouvray, eds.), Studies in Physical and Theoretical Chemistry, Vol. 51,
Elsevier, Amsterdam, 1987, p. 266.
75. R. B. King, J. Mol. Struct. (Theochem), 1989, 185, 15.
76. J. Brocas, J. Math. Chem., (in press).
77. R. B. King, Theor. Chim. Acta, 1987, 64, 439.
78. R. B. King, Applications of Graph Theory and Topology in Inorganic Cluster
and Coordination Compounds, CRC Press, Boca Raton, Florida, 1993, p. 193.
79. A. T. Balaban and J. Brocas, J. Mol. Struct. (Theochem), 1989, 185, 139.
80. D. Bonchev, D. Kamenski and O. N. Temkin, J. Math. Chem., 1987, 1, 345.
81. D. Bonchev, O. N. Temkin, and D. Kamenski, J. Comput. Chem., 1982, 3, 95.
82. O. N. Temkin and D. Bonchev, J. Chem. Educ., 1992, 69, 544.
83. D. Bonchev, in Chemical Graph Theory, Reactivity and Kinetics (D. Bonchev
and D. H. Rouvray, eds.), Abacus Press-Gordon and Breach, Philadelphia, 1992,
p. 1.
84. K. B. Yatsimirskii, Internat. Chem. Eng., 1975, 5, 7; Z. Chem., 1973, 13, 201.
85. J. B. Hendrickson, Top. Curr. Chem., 1976, 62, 49.
86. J. B. Hendrickson, J. Amer. Chem. Soc., 1977, 99, 5439 and previous parts in
the series.
87. J. B. Hendrickson and E. Braun-Keller, J. Comput. Chem., 1980, 1, 323.
88. E. J. Corey and W. T. Wipke, Science, 1969, 166, 178.
89. E. J. Corey, Quart. Revs., 1971, 25, 455.
90. E. J. Corey, A. P. Johnson, and A. K. Long, J. Org. Chem., 1980, 45, 2051.
91. A. T. Balaban, Math. Chem. (MATCH), 1980, 8, 159 and further references
therein.
92. M. Bersohn and A. Esack, Chem. Rev., 1976, 76, 269.
93. R. Barone and M. Chanon, in Computer Aids to Chemistry (G. Vernin and
M. Chanon eds.), Wiley, New York, 1986, p. 19.
94. J. Dugundji and I. Ugi, Top. Curr. Chem., 1973, 39, 19.
95. N. S. Zefirov, Acc. Chem. Res., 1987, 20, 237.
96. V. Kvasnicka and J. Pospichal, J. Math. Chem., 1989, 3, 161.
97. J. Koka, M. Kratochvil, V. Kvasnicka, L. Matyska, and J. Pospichal, A Synthon
Model of Organic Chemistry and Synthesis Design, Lecture Notes in
Chemistry, Vol. 151, Springer, Berlin, 1989.
98. V. Kvasnicka, Coll. Czech. Chem. Comm., 1984, 49, 1090 and previous papers
in the series.
99. M. Johnson, in Graph Theory, Combinatorics, and Applications (Y. Alavi, G.
Chartrand, O. R. Ollerman, and A. J. Schwenk, eds.), Wiley, New York, 1991, p. 725.
100. M. Johnson, in Graph Theory and its Application to Algorithm and Computer
Science (Y. Alavi, G. Chartrand, L. Lesniak and C. Wall, eds.), Wiley,
New York, 1985, p. 457.
101. A. T. Balaban, J. Chem. Inf. Comput. Sci., 1985, 25, 334.
102. J. Dugundji, P. Gillespie, D. Marquarding, I. Ugi and F. Ramirez, in Chemical
Applications of Graph Theory (A. T. Balaban, ed.), Academic Press, London,
1976, p. 107.
103. D. H. Rouvray and A. T. Balaban, in Applications of Graph Theory
(R. J. Wilson and L. W. Beineke, eds.), Academic Press, London, 1979, p. 177.
DISCRETE REPRESENTATIONS OF
THREE-DIMENSIONAL MOLECULAR BODIES AND
THEIR SHAPE CHANGES IN CHEMICAL REACTIONS

PAUL G. MEZEY
Mathematical Chemistry Research Unit
Department of Chemistry and
Department of Mathematics and Statistics
University of Saskatchewan
Saskatoon, Canada, S7N 0W0

1. Introduction and review of basic topological concepts of molecular
shape representation
The simple, intuitive concepts of a formal molecular body and molecular surface are
very useful for the interpretation of molecular size and shape properties, various
molecular interactions, and in particular, chemical reactions, when using approximate
models. Evidently, molecules, as three-dimensional objects, do occupy some space. By
analogy with macroscopic objects, it is natural to associate with molecules a formal
molecular body and a formal molecular surface. Within models that take into account
the space requirements of molecules, a molecular surface is regarded as a formal
molecular boundary that separates the three-dimensional (3D) space into two parts. The
part of space enclosed by the surface is regarded as a molecular body that is assumed to
represent the entire molecule.

However, the above simple picture is very misleading if taken at face value. Molecules
are quantum mechanical objects. They have neither a finite body defined in precise
geometrical terms nor a finite boundary surface that encloses the entire electron density
of the molecule. The electronic density function changes rapidly with distance within a
certain range, but this change is continuous; there is no abrupt change analogous to the
boundary of a macroscopic object like a potato. The cloud-like, fuzzy electronic density
of a molecule is rather different from a macroscopic body, and no finite distance can be
specified beyond which the electronic density of the molecule is precisely equal to zero.
Even within a semiclassical model, the electronic density decreases in a continuous
manner as the distance from the nuclei increases, and no precise distance can be given
where the molecule ends. No molecular surface exists in the classical, macroscopic
sense. The peripheral regions of a molecule are described by a continuous, 3D
electronic charge density function that approaches zero value at large distances from the
nuclei of the molecule.

D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 181-208.
© 1994 Kluwer Academic Publishers.

In a less rigorous sense, however, the concepts of a formal molecular body and
molecular surface are very useful within both classical and approximate quantum
chemical models of molecules. A boundary surface of a formal molecular body, or
several such surfaces at various electron density thresholds can be defined by requiring
only that these surfaces enclose an essential part of the molecule. Depending on the
chemical problem, there are many possible choices for what is to be regarded as the
essential part of the molecule; some of the possible choices, as well as many of the
topological shape analysis techniques are described in ref. [1]. Usually, some electron
density threshold is chosen, and all points where the electronic density is equal to or
greater than this threshold are regarded as belonging to the "essential part" of the
molecule. This approach can be formulated in terms of contour surfaces of electronic
charge densities, called molecular isodensity contours (MIDCO's).

Alternative choices can also be considered as formal molecular surfaces, for example,
contours of the molecular electrostatic potential (molecular electrostatic potential
contours, MEPCO's), the contours of specified molecular orbitals, such as frontier
orbitals, or simple molecular Van der Waals surfaces generated by fused atomic spheres
(VDWS's), solvent accessible surfaces, and various other surfaces surrounding some or
all of the nuclei of a molecule. For each choice, the part of the 3D space that is
enclosed by the surface can be regarded as an approximate molecular body; for
example, a formal molecular body B(a) can be taken as the part of three-dimensional
space enclosed by a MIDCO G(a) of a density threshold a.
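
In symbols, the two objects just introduced can be written compactly. The notation below is a sketch under stated assumptions: the symbol ρ(r) for the electronic density at point r is introduced here for illustration and is not used in the text, and the inequality follows the earlier statement that points with density equal to or greater than the threshold belong to the essential part of the molecule.

```latex
G(a) = \{\, \mathbf{r} \in \mathbb{R}^3 : \rho(\mathbf{r}) = a \,\}, \qquad
B(a) = \{\, \mathbf{r} \in \mathbb{R}^3 : \rho(\mathbf{r}) \ge a \,\}
```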

Molecules of similar functional groups, similar moieties, and molecules having similar
reactions usually have similar molecular isodensity contours, MIDCO's, and similar
molecular electrostatic potential contours, MEPCO's. The similarities are often
restricted to local regions; nevertheless, they are recognizable. The fact that similarities
in reactivity are often associated with local similarities in the MIDCO and MEPCO
surfaces of the reacting molecules indicates the special importance of electronic and
electrostatic interactions during the initial stages of chemical reactions. Biomolecules
and drug molecules of similar biochemical effects also show similarities in their
MEPCO's and MIDCO's. Since the shapes of MEPCO's often suggest mechanistic
explanations of how a given ligand interacts with a receptor site of an enzyme, the
study of the three-dimensional shapes of MEPCO's is an important task in rational
drug design.

There are several important differences between MIDCO's and MEPCO's; one of the
most prominent differences is in the range for the numerical values of the contour
thresholds. Whereas the threshold parameters a of electronic charge density contours,
MIDCO's G(a) can take only non-negative values, the threshold parameter of a
molecular electrostatic potential contour MEPCO G(a) can take both positive and
negative values.

The conventional approach of direct visual inspection of shapes is one of the simplest
methods for the assessment of shape similarity. Most molecular modeling computer
programs have the capacity to display pictorial shape information on the computer
screen; these models can then be used for a similarity assessment by inspection. Using
advanced quantum chemistry, molecular mechanics, and other computational chemistry
methods and computer graphics techniques, the three-dimensional images of various
molecular models, contour surfaces, or macromolecular representations such as a
pattern of a folded protein chain or a knotted structure of a DNA fragment can be
displayed on the computer screen. These images can be rotated, aligned with other
models, or superimposed on models of other molecules, and simple visual comparisons
can be used to judge molecular similarity. Such visual comparisons are much enhanced
by the chemical knowledge and expectation of the observer, who can incorporate in the
assessment the known or assumed relative importance of various shape features seen
on the computer screen.

However, such visual comparisons have some disadvantages: they are often subjective,
seldom reproducible, and rely on the visual memory of the observer. It is hard to
remember and compare images seen hours ago with one currently on display. Two
different observers are likely to judge the similarity of a sequence of molecular models
differently: when the task is to order a set of molecules according to their degree of
similarity to a target molecule, two different observers are likely to come up with
different orders.
These potentially serious disadvantages of visual similarity evaluation methods can be
circumvented by nonvisual, algorithmic similarity analysis, using automated similarity
assessment by computer. It is possible to use nonvisual computational techniques for
evaluating the degree of similarity by reproducible, algorithmic methods. By such
computer-based algorithmic methods, molecular similarity can be assessed and
evaluated numerically.

Algorithmic similarity analysis, the determination, evaluation, and comparison of
shapes are important problems in many fields of natural sciences. If the full wealth of
the detailed geometrical information of a complicated object is considered, then many of
the relevant aspects of shape are not easily representable numerically. However,
topological methods can considerably simplify the problem by focusing on the essential
features and by describing them in terms of topological invariants. Many topological
shape analysis methods follow some of the fundamental steps of visual inspection
methods; however, the algorithmic computational methods of similarity assessment
based on topological techniques are systematic, mathematically precise, objective, and
reproducible.

Some techniques of shape analysis are based on generalizations of the concept of
convexity [1]. A molecular body B(a) is a convex subset of the three-dimensional
space if, for any two points r1 and r2 of B(a), all points r of the straight line
segment between r1 and r2 also fall within the body B(a). This is a global
condition for ordinary convexity of a formal molecular body B(a). Most globally
convex molecular bodies B(a) represent chemically rather uninteresting electron
distributions; globally convex B(a) bodies are either those of single atoms or those
enclosed by the large, low electron density MIDCO's of molecules. By contrast, most
of the nonconvex formal molecular bodies and the associated MIDCO's of more
intricate shape features are of more chemical interest. For a detailed study of their
shapes the methods of local convexity analysis and their various generalizations are
important mathematical, algorithmic tools. Some of these generalizations will be
reviewed below, with special emphasis on the topological techniques adapted to treat
fuzzy electronic charge clouds.
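
The segment condition for convexity lends itself to a simple numerical probe. The sketch below is only an illustration under loudly stated assumptions: the "density" is a toy sum of two unit Gaussians at hypothetical centres, not a real electron density, and sampling a segment can prove nonconvexity (one sampled point outside the body suffices) but cannot prove convexity.

```python
import math

# Hypothetical two-centre model; positions and widths are arbitrary choices.
CENTRES = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]


def density(r):
    """Toy density: one unit Gaussian per centre."""
    return sum(
        math.exp(-sum((x - c) ** 2 for x, c in zip(r, centre)))
        for centre in CENTRES
    )


def in_body(r, a):
    """Membership in B(a): density equal to or greater than the threshold a."""
    return density(r) >= a


def segment_inside(r1, r2, a, samples=200):
    """True if every sampled point of the segment r1-r2 lies in B(a)."""
    for k in range(samples + 1):
        t = k / samples
        r = tuple((1 - t) * p + t * q for p, q in zip(r1, r2))
        if not in_body(r, a):
            return False
    return True


r1, r2 = CENTRES
for a in (0.9, 0.5):
    print(a, in_body(r1, a), in_body(r2, a), segment_inside(r1, r2, a))
```

At the higher threshold both centres belong to the body while the midpoint between them does not, so that body is nonconvex; at the lower threshold the same segment passes the test, in line with the remark that large, low-density contours tend to enclose convex bodies.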

A single surface or a single, formal molecular body cannot provide a detailed enough
description of the shape of the actual, fuzzy electron distribution or the entire, 3D
molecular electrostatic potential of any given molecule. Individual surfaces, that is,
individual geometrical models are insufficient for the description of the shape of
molecules, even if one takes a static model and ignores vibrational motions and
conformational flexibility. The validity of simple geometrical models is even more
restricted if the conformational flexibility and the more general, dynamic molecular
properties of molecules are considered.

To overcome this problem, one may consider a whole continuum of a family of
molecular surfaces, which can collectively represent the shape of the molecule. Within
this family one may search for the common and most essential features. Using this
approach, the shape analysis problem of fuzzy bodies of electron distributions of
molecules can be formulated as the shape analysis problem of a family of formal
molecular surfaces.

A practical implementation of this idea is suggested by topology, suitable for the
detailed analysis of the dominant shape features. These methods require the tools of
three-dimensional topology. Topology can help in several ways: in the precise
formulation of the molecular shape problem, and also in the actual design of
computational methods for the evaluation of shape descriptors and numerical similarity
measures. The homology and homotopy groups of algebraic topology provide efficient
techniques for the shape analysis of individual surfaces. These groups also can serve as
tools suitable to extract the dominant, common features from an entire family of such
surfaces [1].

A general topological approach to molecular shape analysis involves two distinct
stages. The primary objects of the first stage are the geometrical models of individual molecular surfaces
and the molecular bodies enclosed by them. In the second stage, these objects are used
to derive the topological models describing the common topological properties of
families of contour surfaces or the associated bodies. These topological properties are
defined, for example, by intervals of possible curvature values or a range of electron
density contour values.

In fact, most topological shape analysis methods for molecules are based on the two-step
process of Geometrical Classification and Topological Characterization [1].

The related principle of Geometrical Similarity as Topological Equivalence, referred to
as the GSTE principle, is a common, fundamental aspect of most of these shape
analysis techniques [1]. Geometrical conditions are used to define ranges of geometrical
objects (e.g., families of points along a MIDCO where the contour surface satisfies
some local geometrical condition, for example, some curvature condition), leading to a
geometrical classification of the surface points into domains. This step is followed by a
topological characterization of the various interrelations among these domains. This
topological characterization is stable for most small geometrical variations that are
regarded as incidental. In fact, topology is used to extract the essential information,
affected only by greater geometry changes.

In order to apply the above principles, the characterization of the shapes of molecular
contour surfaces, such as MIDCO's and MEPCO's, requires the subdivision of the
surface into domains fulfilling some local shape criteria. One may consider two types of
criteria: relative, and absolute shape criteria. These lead to a relative shape domain
or an absolute shape domain subdivision of the molecular contour surface,
respectively.

If one compares two or several surfaces to one another by some direct technique, then
relative shape conditions are used. In one implementation, a pair of contour surfaces of
two molecules or contour surfaces of two different physical properties for the same
molecule are superimposed, generating interpenetration patterns on these surfaces. The
maximum connected subsets of these patterns can be taken as the relative shape
domains on each surface, and the interrelations among these relative shape domains in
the patterns can be used as criteria for local shape characterization. This shape analysis
belongs to the family of relative methods, describing the shape of one molecule relative
to another or one physical property of a molecule relative to another property of the
same molecule. When a topological analysis and subsequent characterization are applied
to these relative shape domains, then a direct comparison is obtained between the two
molecular surfaces.

It is not always necessary to compute the entire contour surface for both molecules or
for both physical properties; for a simple study, it is sufficient to map the
interpenetration pattern on one of these surfaces. For example, if the technique of
interpenetrating contour surfaces is applied for a relative shape domain subdivision of a
pair of MIDCO and MEPCO surfaces of the same molecule, then the MIDCO
surface can be subdivided into domains using the contour value of the MEPCO as
criterion. This procedure is equivalent to generating the interpenetration pattern on the
MIDCO surface [1].

If one compares a molecular contour surface to some standard surface, such as a plane,
or a sphere, or an ellipsoid, or any other closed surface selected as standard, then an
absolute shape characterization is obtained. In the simplest case, a contour surface is
compared to a plane. This plane can be moved along the contour as a tangent plane, and
the local curvature properties of the molecular surface can be compared to the plane.
Each point of the contour surface can be characterized by the local relation between the
tangent plane and the body enclosed by the surface: the tangent plane may fall on the
outside, on the inside, or it may cut into the given molecular body within any small
neighborhood of the surface point. This characterization leads to a subdivision of the
molecular contour surface into locally convex, locally concave, and locally saddle-type
shape domains, usually denoted by the symbols D2, D0, and D1, respectively. These
shape domains are absolute and lead to an absolute shape characterization in the above
sense, since the local curvatures are compared to the plane, selected as standard. If a
different standard, such as a tangent sphere or a tangent ellipsoid is selected then a
similar technique can be used for generating absolute shape domains on the contour
surface.
When a topological analysis and subsequent characterization are applied to these
absolute shape domains, then an absolute shape characterization of the molecular
surface is obtained.
Some of the topological methods of shape analysis of molecular contour surfaces,
designed to take advantage of such relative and absolute shape domain subdivisions of
the contours according to some physical or geometrical conditions, are described in
detail in ref. [1].

The justification of the use of shape domains of contour surfaces for facilitating a
topological shape analysis of molecules follows from a simple observation. Most
commonly, molecular contour surfaces defined in terms of various physical properties,
such as the MIDCO or MEPCO surfaces, are topologically rather simple, even trivial
objects. Most MIDCO's in the chemically interesting density ranges are topologically
equivalent to a sphere, or in more special cases to a doughnut or to a few "fused"
doughnuts. A direct topological characterization of such (topologically simple) surfaces
does not provide much useful information about their chemically interesting shape
features, and is of little use in a similarity analysis.

To overcome this problem, one may use various geometrical or physical conditions,
denoted in general by μ, and define various domains on a contour surface G(a), such
as the shape domains Dμ of various curvature types. These domains subdivide the
contour surface G(a). One may formally cut out from G(a) all domains of some
specified properties [for example, all the locally convex domains ("bumps") of the
contour surface G(a)]. In this process, a new, topologically more interesting object, a
truncated contour surface G(a,μ), is obtained that inherits some shape information of
the original surface in a topologically easily accessible way.

The new, truncated surface G(a,μ) is no longer topologically equivalent to the original
contour surface G(a). In spite of this, the surface G(a,μ) carries information on the
shape of the original surface G(a), where shape is understood within the context of the
curvature condition or physical property used to define domains on the surface. In this
scheme, a geometrical or physical shape condition is used to convert the original
molecular surface G(a) into a new, topologically different object G(a,μ). The
topological analysis of the truncated surface G(a,μ) corresponds to a shape analysis
of the original surface G(a). The topological invariants of the truncated surface G(a,μ)
contain information on the pattern and topological interrelations of various shape
domains on the original contour surface G(a). In fact, the geometrical curvature
properties of the surface, considered as a manifestation of the shape of the object, are
characterized by topological tools.

The GSTE principle (Geometrical Similarity as Topological Equivalence) applies to
the analysis of these shape domains and their interrelations. Most of the related shape
analysis methods have the following common features:

(i) all the possible (in principle, infinitely many) geometrical patterns and
arrangements are classified by some combination of geometrical and
topological criteria, and

(ii) the resulting classes are characterized by topological means.

Consider a chemical reaction or a conformational change of a molecule. The associated
changes in the nuclear geometry of the molecule are likely to alter the size, the location,
and perhaps the very existence of some of the shape domains (for example, the local
curvature domains) on a MIDCO surface. Nevertheless, for most small changes of the
nuclear arrangement, for example, for a small progress along a formal reaction
coordinate, the existence and mutual neighbor relations of these shape domains remain
invariant. There is some range of geometrical arrangements along a segment of the
reaction path where a common topological pattern of shape domains can be found on
the corresponding MIDCO surfaces. These infinitely many geometrical arrangements
within this range, all having the same topological pattern, are regarded to belong to a
single class. For such a class of infinitely many different geometrical arrangements of
the nuclei, the classification can be based on a geometrical condition of certain bounds
of the local curvature of the surface. In addition, one can use the topological condition
of having a certain pattern of neighbor relations of the various curvature domains of the
MIDCO surface.

Such a combination of geometrical and topological conditions is in fact a shape
condition. The entire class of infinitely many nuclear arrangements satisfying this shape
condition on the associated MIDCO's is characterized topologically by having a
common pattern of shape domains. Within this class of nuclear arrangements the
topological properties of the actual geometrical patterns remain invariant. All nuclear
arrangements of the molecule having the same topological patterns among the MIDCO
shape domains can be characterized by the same topological invariants as their formal
shape descriptors. An abstract topological object can be associated with the entire class
of geometrical arrangements, and the characterization of the class is given by the
topological properties of the abstract object.

This method follows the framework of the GSTE principle: the initial geometrical
classification by curvature properties leads to an eventual topological characterization. If
the above topological characterization gives the same results for two different nuclear
arrangements along the reaction path, that is, if the two patterns are topologically
equivalent, then the two arrangements are similar in a geometrical sense.

One can take the same approach when comparing two different molecules: if their shape
domain characterizations give equivalent topological results, then the two molecules are
similar in a geometrical sense. The geometrical similarities of the nuclear arrangements
within a range of the nuclear configurations along the reaction path are manifested in a
topological equivalence of the shape domain patterns of the corresponding sequence of
their MIDCO surfaces. Indeed, geometrical similarity corresponds to a topological
equivalence [1].

One topological technique for a concise representation of the interrelations among the
shape domains is the Shape Group Method (SGM) [1], developed for the analysis of
molecular shapes. Assume that a family of shape domains is determined on a molecular
contour surface G(a). A truncated contour surface G(a,μ) is obtained from G(a) by
excising a selected subfamily of shape domains, for example, all Dμ shape domains
for a specified index μ. The shape groups of the contour surface G(a), with respect
to the given family μ of shape domains, are the homology groups of the truncated
contour surface G(a,μ).
For example, we may assume that the shape domains are defined in terms of local
convexity, leading to locally concave, saddle type and convex domains. If we select the
locally convex domains (that is, the domains of index μ = 2), then the shape groups of
G(a) are the homology groups [1] of the truncated isodensity contour surface G(a,2),
obtained from the molecular contour surface G(a) by cutting out all Dμ domains of
index μ = 2.
Homology groups of algebraic topology are topological invariants, expressing an
important aspect of the topological structure of bodies and surfaces. In general, the
shape groups of an object are the homology groups of a truncated object derived from
the original object by eliminating parts fulfilling some physical, or geometrical or some
other topological criteria [1]. This family of shape groups of the example, obtained by
eliminating all locally convex domains of G(a), has been used for the shape analysis of
many molecules [1].
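
The effect of truncation on the homology of a surface can be illustrated on a small triangulated model. The sketch below is not the implementation of ref. [1]; it assumes that the contour surface is given as a list of triangles, uses an octahedron as a hypothetical stand-in for a sphere-like MIDCO, treats individual faces as the excised domains, and computes only the ranks (Betti numbers) of the homology groups, using the Euler characteristic of an orientable surface. The function names are illustrative.

```python
from itertools import combinations, product

def betti_numbers(triangles):
    """Betti numbers (b0, b1, b2) of an orientable triangulated surface."""
    vertices = {v for t in triangles for v in t}
    edge_count = {}
    for t in triangles:
        for e in combinations(sorted(t), 2):
            edge_count[e] = edge_count.get(e, 0) + 1

    # Union-find over vertices to count connected components.
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for a, b in edge_count:
        parent[find(a)] = find(b)

    b0 = len({find(v) for v in vertices})
    # A component is closed if none of its edges lies on a boundary
    # (a boundary edge belongs to exactly one triangle).
    open_components = {find(a) for (a, _), n in edge_count.items() if n == 1}
    b2 = b0 - len(open_components)
    chi = len(vertices) - len(edge_count) + len(triangles)  # Euler characteristic
    b1 = b0 + b2 - chi
    return b0, b1, b2

def octahedron():
    """Eight triangles forming a topological sphere (hypothetical model surface)."""
    return [("x" + sx, "y" + sy, "z" + sz) for sx, sy, sz in product("+-", repeat=3)]

def truncate(triangles, excised):
    """Remove the listed triangles, mimicking the excision of shape domains."""
    return [t for t in triangles if t not in excised]

sphere = octahedron()
one_hole = truncate(sphere, [("x+", "y+", "z+")])
two_holes = truncate(sphere, [("x+", "y+", "z+"), ("x-", "y-", "z-")])
```

In this model the closed surface, the surface with one excised domain, and the surface with two disjoint excised domains have different Betti numbers, which is the kind of information the shape groups encode for truncated contour surfaces.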

Following the spirit of the GSTE principle, the above shape group methods combine
the advantages of geometry and topology. The shape domains and the subsequent
truncation of the MIDCO's are defined in terms of a geometrical classification of
points of the surfaces, using local curvature properties, whereas the truncated surfaces
are characterized topologically by the shape groups.
The shape groups are not determined by the point symmetry groups of the nuclear
framework. The shape groups provide a symmetry-independent characterization of
molecular shape.

The method of molecular contour surface characterization in terms of curvature
properties can be refined by testing the local curvatures of a contour surface G(a)
against a tangent sphere, or an oriented tangent ellipsoid T [1]. The latter method
takes into account orientation with respect to external directions, hence it is suitable for
molecular shape analysis in external electromagnetic fields or with respect to a direction
defined by another molecule. We assume that the relative orientation of the reference
ellipsoid T and contour surface G(a) is fixed. The ellipsoid T of axes of fixed
orientation can be brought into tangential contact with G(a), at any point r of the
surface G(a), by applying a suitable translation of T. At each point r, the G(a)
surface is regarded locally as a function of "elevation" over the tangent sphere or
tangent ellipsoid T. The second derivatives of this function of elevation define a
Hessian matrix H_T(r), and points r of G(a) are classified into domains
according to the relative local convexity properties of their neighborhoods on G(a),
relative to the tangent sphere, or according to the oriented relative local convexity
properties, relative to the tangent ellipsoid T. Points with zero, one and two negative
eigenvalues belong to domains D0(T), D1(T), and D2(T), which are concave, of the
saddle-type, and convex, respectively, relative to the tangent sphere or the oriented
tangent ellipsoid T.
The test ellipsoid T may be oriented so as to represent an external electromagnetic field.
Alternative choices for orientation include the main direction of a cavity of an enzyme
molecule, the axes of a polarizability ellipsoid of a molecule, an alignment on the
surface of a catalyst, or some other internal or external constraint [1].
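
The classification of surface points by the number of negative eigenvalues of the local Hessian, as described above, can be sketched in a few lines of Python. This is a minimal illustration, assuming that the three independent elements of the symmetric 2 x 2 Hessian of the elevation function are already available at the point of interest; the function names and the tolerance `tol` are assumptions, and points with a vanishing eigenvalue (domain boundaries) are simply counted as non-negative here.

```python
import math

def negative_eigenvalue_count(h11, h12, h22, tol=1e-12):
    """Number of negative eigenvalues of the symmetric matrix [[h11, h12], [h12, h22]]."""
    mean = 0.5 * (h11 + h22)
    spread = math.hypot(0.5 * (h11 - h22), h12)
    eigenvalues = (mean - spread, mean + spread)  # closed form for a 2 x 2 symmetric matrix
    return sum(1 for lam in eigenvalues if lam < -tol)

def curvature_domain(h11, h12, h22):
    """Domain label: D0 (concave), D1 (saddle-type), or D2 (convex)."""
    return "D" + str(negative_eigenvalue_count(h11, h12, h22))
```

For example, `curvature_domain(-1.0, 0.0, -2.0)` gives the convex label `"D2"`, while `curvature_domain(1.0, 0.0, -1.0)` gives the saddle-type label `"D1"`.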

If one replaces the ellipsoid T by any other differentiable surface, for example, by a
contour surface of another molecule [1], then a further generalization of the concept of
convexity is obtained. The resulting shape domains on the MIDCO G(a) can be used
for a direct shape comparison and a direct similarity test for these molecules.

The above shape analysis methods all rely on a classification of surface domains based
on local curvature properties. Curvature can be regarded as the second derivative of a
function describing the surface, hence all curvature-based methods of shape analysis
require that the surface be twice differentiable. However, not all models of formal
molecular surfaces fulfill this criterion: some surfaces, such as fused-sphere van der
Waals surfaces, do not even possess first derivatives at the seams of interpenetrating
spheres.

A similar problem arises with simple models of solvent accessible surfaces, obtained by
"rolling" a sphere (representing the solvent molecule) along a formal molecular surface
of the solute. Even if the latter surface is differentiable, the solvent accessible surface,
taken as the surface generated by motion of the center of the rolling sphere, is not
necessarily differentiable [1].

Several methods of shape analysis have been proposed for such surfaces [1]. One of
these methods is applicable for surfaces that are not everywhere differentiable as well as
for the shape characterization of dot representations of molecular surfaces. The latter
models are not only nondifferentiable, but are not even continuous. This technique, the
method of T-hulls [1], is based on a generalization of the concept of convex hull. The
convex hull of a set A is usually defined as the smallest convex set that contains A.
However, an alternative, equivalent definition lends itself to a useful generalization.
Take all possible half spaces (a half space is one side of the three-dimensional space
divided by a plane) that contain the set A, and define the convex hull of A as the
intersection of all these half spaces. A generalization is obtained if one replaces the half
spaces by some other object T. Consider a three-dimensional body T. By definition,
the T-hull of a point set A is the intersection of all rotated and translated versions of
T which contain A.
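
The T-hull definition above can be made concrete with a simplified two-dimensional analogue. The sketch below is an assumption-laden illustration, not the algorithm of ref. [1]: the body T is taken to be a disk of given radius (so rotations are irrelevant and only translations matter), and the infinite family of translated disks is replaced by disks centered on a finite grid with spacing `step`. Because fewer disks are intersected, the result slightly overestimates the true T-hull.

```python
import math

def t_hull_contains(points, p, radius, step=0.1, tol=1e-9):
    """Approximate test of whether p lies in the T-hull of `points`, T being a disk.

    p is rejected if some grid-centered disk of the given radius contains
    every point of the set but does not contain p.
    """
    xs = [q[0] for q in points]
    ys = [q[1] for q in points]
    # Any disk containing the set has its center within `radius` of each point,
    # so a bounding box padded by `radius` covers all relevant centers.
    i_min = math.floor((min(xs) - radius) / step)
    i_max = math.ceil((max(xs) + radius) / step)
    j_min = math.floor((min(ys) - radius) / step)
    j_max = math.ceil((max(ys) + radius) / step)
    for i in range(i_min, i_max + 1):
        for j in range(j_min, j_max + 1):
            center = (i * step, j * step)
            if all(math.dist(center, q) <= radius + tol for q in points):
                if math.dist(center, p) > radius + tol:
                    return False
    # If no disk of this radius contains the set, the intersection is taken
    # over an empty family and every point is accepted.
    return True

# Hypothetical point set A: the two end points of a segment of length 2.
A = [(-1.0, 0.0), (1.0, 0.0)]
```

With the small disk (`radius=1.0`) the only admissible disk is centered at the origin, so the T-hull is the whole unit disk and contains the point (0, 0.9). With a much larger disk (`radius=5.0`) the T-hull shrinks towards the ordinary convex hull, the segment itself, and the same point is excluded.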

The T-hull method can be applied for shape comparisons of molecules using a
common reference shape, chosen as that of the body T. In another application, the
shapes of two molecules, T and A can be compared directly. One of the two
molecular bodies can be chosen as T and the T-hull of the other molecular body A
serves as a tool for a direct shape comparison [1]. The smaller the deviation between
A and the T-hull of A (as measured, for example, by volume differences), the more
similar are T and A.

Another important class of methods used for molecular similarity analysis is based on
the resolution based similarity measures (RBSM) [1]. The principle of these methods
can be illustrated by a simple example. Consider three objects, A, B, and C, which
appear indistinguishable at a great distance to an observer. For example, for a distant
observer all three objects may appear as mere points, hence the objects cannot be
distinguished. At some closer distance, one of the objects, say object C is already
distinguishable from objects A and B, but the latter two may still appear
indistinguishable. At a close distance, A and B are also distinguishable.

Alternatively, one may view a low resolution photograph of these three objects; if the
resolution is very low, the three objects may appear indistinguishable. One may view a
somewhat better resolved photograph, where object C is distinguishable from the
other two, while A and B are still indistinguishable. However, on a photograph of
high enough resolution, all three objects are distinguishable. Using either approach, it
is natural to conclude that A and B are more similar to each other than A is to C or
B is to C, simply, because the dissimilarity of C from the other two objects is
already evident at a medium distance or at medium resolution, whereas it takes a closer
look or a much higher resolution to distinguish A from B.

In general, the observer may also use a series of binoculars, and in order to distinguish
objects, a higher level of resolution of the observed picture is required if the objects are
more similar. Based on this idea, one can define a similarity measure relying on the
level of resolution required to distinguish objects, leading to Resolution Based
Similarity Measures (RBSM's), described in detail in [1].

It is a relatively simple task to characterize resolutions numerically, and the problem of
measuring similarity numerically can be reduced to measuring resolution numerically. A
numerical similarity measure can be defined in terms of the resolution of point distances
required to distinguish objects. If two objects are very dissimilar, they are
distinguishable at a low resolution; if they are more similar, then a higher resolution is
required to distinguish them; and if the shapes of the two objects are identical, then they
are indistinguishable even at infinite resolution.
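
The idea of measuring similarity by the resolution needed to tell two objects apart can be illustrated with a toy model. The following sketch is not the RBSM definition of ref. [1]; it assumes that each object is a finite set of points in the unit square, that "viewing at resolution level k" means recording which cells of a 2^k x 2^k grid are occupied, and that the hypothetical objects `A`, `B`, and `C` play the roles of the three objects of the example above.

```python
def coarse_image(points, level):
    """Set of occupied cells when the unit square is divided into 2**level cells per side."""
    n = 2 ** level
    return frozenset((int(x * n), int(y * n)) for x, y in points)

def distinguishing_level(a, b, max_level=10):
    """Lowest resolution level at which the two coarse images differ.

    Returns None if the objects cannot be distinguished up to max_level.
    A higher returned level means that the objects are more similar.
    """
    for level in range(max_level + 1):
        if coarse_image(a, level) != coarse_image(b, level):
            return level
    return None

# Hypothetical objects: A and B differ only slightly, C differs strongly.
A = [(0.1, 0.1), (0.6, 0.6)]
B = [(0.1, 0.1), (0.6, 0.7)]
C = [(0.1, 0.1), (0.1, 0.9)]
```

In this model, object `C` separates from `A` and `B` at a cruder level than the level needed to separate `A` from `B`, so `A` and `B` are ranked as the more similar pair. The grids are nested, so objects that are distinguishable at one level remain distinguishable at every finer level.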

The concepts of resolution and distinguishability can also be approached within a
general topological framework. Take a family of objects and assume that a common
hierarchy of topologies is defined for each object, where the hierarchy is ordered by the
finer-cruder relations of the topologies (see ref. [1]). Within a topological framework,
considering finer topologies is analogous to considering higher levels of resolution.

As an example, we may consider a family of MIDCO surfaces, and a hierarchy of
several shape domain partitionings for each surface. The set of shape domains can be
regarded as a defining subbase for a topology on the MIDCO for each shape domain
partitioning, that is, for each level of topological resolution. For each such level, the
introduction of a topology turns the MIDCO into a topological space. The
corresponding topologies are related by a cruder-finer relation [1], if the shape
domains of a cruder partitioning can be constructed as unions of the shape domains of a
finer partitioning. If all shape domain partitionings, that is, all the topologies introduced
on the MIDCO's can be ordered by such relations, then the corresponding hierarchy
of topologies also provides a hierarchy of topological resolutions. If two MIDCO's
have identical shape groups using a finer shape domain partitioning than two MIDCO's
which have different shape groups already at a cruder shape domain partitioning, then
the members of the first pair of MIDCO's are more similar to each other than the
MIDCO's of the second pair. The complexity of the partitioning can be characterized
by the defining subbase used, which gives a measure of how fine the corresponding
topology is. In turn, this measure gives a resolution based measure of similarity of the
MIDCO surfaces.
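
The cruder-finer relation between two shape domain partitionings, as used above, reduces to a simple set-theoretic test. The sketch below assumes, purely for illustration, that the surface is represented by a finite set of labeled sample points and that each partitioning is a list of sets of such labels; the function name `is_cruder` is hypothetical.

```python
def is_cruder(crude, fine):
    """True if every domain of `crude` is a union of domains of `fine`."""
    fine_domains = [frozenset(d) for d in fine]
    for domain in crude:
        domain = frozenset(domain)
        # Collect all fine domains that lie entirely inside this crude domain.
        covered = frozenset().union(*(d for d in fine_domains if d <= domain))
        if covered != domain:
            return False
    return True

# Hypothetical partitionings of five surface sample points.
fine = [{1, 2}, {3}, {4, 5}]
crude = [{1, 2, 3}, {4, 5}]
unrelated = [{1, 2, 4}, {3, 5}]
```

Here `crude` is cruder than `fine`, whereas `unrelated` cannot be built from unions of the domains of `fine`, so these two partitionings are not comparable in the hierarchy.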

The quantum mechanical uncertainty in nuclear positions and the associated, inherent
vibrational and other internal motions of molecules imply that molecular shapes cannot
be described in detail without taking into account dynamic features of molecules. The
topological aspects of dynamic shape properties have an important role in the study of
conformational motions, chemical reactions, as well as electronic excitations.
Evidently, molecular shape is not a static property. Molecules vibrate even at absolute
zero temperature. According to quantum mechanics, the formal vibrational properties
are manifested in a probabilistic distribution of nuclear positions in any polyatomic
molecule. In a similar manner, rotational states of molecules also influence their shapes.
Since motion is an inherent property of all molecules, molecular shapes cannot be
described in detail without taking into account the dynamic aspects of the motion of
various parts of the molecules relative to one another.

Using a semiclassical approximation, the dynamic shape variations during vibrational
and rotational motions, as well as in conformational changes and chemical reactions,
can be modeled by an infinite family of geometrical arrangements. If more energy is
available, that is, at higher temperatures, the molecular vibrations may cover a wider
range of formal molecular geometries. Consequently, a greater variety of dynamic
shapes may occur. There are significant changes in the conformational freedom of
molecules at various temperatures, implying a strong temperature dependence of
molecular shapes. For a "cold" molecule with energy near the zero-point energy
associated with the various vibrational modes, only a limited variation of possible
nuclear arrangements (also called nuclear configurations) can occur with significant
probability. As a consequence, the dynamic shape of the molecule is strongly
constrained. However, if the molecule has energy much above its zero-point energy,
then a much larger family of possible nuclear configurations can be accessed with
significant probability. Consequently, the dynamic shape of the molecule is less
restricted. If the energy is further increased, to levels sufficient for overcoming the
activation barriers to formal conformational rearrangements, a further increase in the
extent of shape variations can be found. At even higher energies, formal bonds can be
broken or created, and chemical reactions can occur, accompanied by significant
changes in molecular shapes. In all these processes, the dynamic shape of molecules is
an energy-dependent property.

The energy dependence of the accessible shapes and the accessible symmetries of
various molecules obey a family of rules influencing the mechanism and outcome of
conformational changes, electronic excitations, and chemical reactions [1].

Some of the dynamic shape analysis approaches can be formulated in terms of the
dynamic shape space D, defined as a composition of the nuclear configuration space
M, and the space of the parameters involved in the shape representation, for example,
the two-dimensional parameter space defined by the possible values of the density
threshold a, and some reference curvature parameter b of a given MIDCO surface.

Two types of methods for dynamic shape analysis have been distinguished [1]. The
methods belonging to the first class are used to determine which nuclear arrangements
are associated with a given topological shape. The methods belonging to the second
class determine the available topological shapes compatible with some external
conditions, for example, with an energy bound.

A simple formulation of the dynamic shape analysis methods of the first type can be
given in terms of the invariance of topological descriptors within domains of the
dynamic shape space D. The subsets of the dynamic shape space with a common shape
group, called the shape group invariance domains of D, can serve as tools for such
dynamic shape analysis. Within these invariance domains a limited change of nuclear
configurations, hence a limited change in the geometrical shape of the MIDCO surface
is permitted, as long as these changes are small enough so that within the given
topological context the topological shape remains invariant. For example, the
preservation of the shape group is a suitable topological criterion. The dynamic shape
space invariance domains serve as tools for analyzing dynamic shape properties.
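
For a one-dimensional slice of the dynamic shape space, for example a sequence of nuclear configurations along a reaction coordinate, the invariance domains described above can be found by grouping consecutive configurations that share the same topological descriptor. The sketch below is an illustration under these assumptions; the descriptor values (tuples meant to resemble Betti numbers) and the reaction coordinate values are invented for the example, and in practice they would come from a shape group calculation on each MIDCO.

```python
from itertools import groupby

def invariance_domains(path):
    """Group consecutive path points with identical descriptors.

    `path` is a list of (coordinate, descriptor) pairs ordered along the path.
    Returns a list of (first_coordinate, last_coordinate, descriptor) triples.
    """
    domains = []
    for descriptor, group in groupby(path, key=lambda item: item[1]):
        coordinates = [s for s, _ in group]
        domains.append((coordinates[0], coordinates[-1], descriptor))
    return domains

# Hypothetical descriptors sampled along a reaction coordinate.
path = [
    (0.0, (1, 0, 0)),
    (0.1, (1, 0, 0)),
    (0.2, (1, 1, 0)),
    (0.3, (1, 1, 0)),
    (0.4, (1, 2, 0)),
]
```

Each returned triple marks a range of the coordinate over which the topological shape is unchanged; the boundaries between triples are the places where a shape change occurs.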

An upper limit for energy within a family of nuclear arrangements can be selected as
criterion to be used in a dynamic shape analysis method of the second type. The task is
to determine the invariance domains of topological descriptors in the dynamic shape
space D, restricted to these nuclear configurations. In this process, an
energy-dependent family of allowed shapes is obtained, as defined by the given
topological descriptors.
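
A method of the second type can be sketched as a filter on sampled nuclear configurations. The example below assumes that each sampled configuration has already been assigned an energy and a topological shape label; the energies (in arbitrary units) and the labels are invented for illustration, and the function name `allowed_shapes` is hypothetical.

```python
def allowed_shapes(configurations, energy_bound):
    """Set of shape labels found among configurations with energy <= energy_bound.

    `configurations` is an iterable of (energy, shape_label) pairs.
    """
    return {shape for energy, shape in configurations if energy <= energy_bound}

# Hypothetical sampled configurations: (relative energy, shape label).
configurations = [(0.0, "A"), (2.5, "A"), (6.0, "B"), (15.0, "C")]
```

Raising the energy bound can only enlarge the set of allowed shapes, which reflects the energy dependence of the accessible shapes discussed above.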

Alternatively, the energy criterion can be replaced with formal temperature, using
properties of Boltzmann distributions. At a higher temperature more energy is available,
and the molecular vibrations cover a wider range of formal molecular geometries. A
larger accessible conformational domain implies that a greater variety of dynamic
shapes is likely to occur. At some higher temperature, the energy is sufficient for
overcoming the activation barriers to conformational rearrangements or to chemical
reactions. Consequently, at these temperatures a further, more significant increase in
the size of accessible configuration space domains is found, and greater shape
variations are expected.
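
The temperature dependence described above can be illustrated with Boltzmann weights over a discrete sample of configurations. This is a strongly simplified sketch, not a method from ref. [1]: it assumes a finite list of configurations with relative energies in kcal/mol and shape labels (both invented for the example), ignores degeneracies and the continuous nature of the configuration space, and uses the molar gas constant in kcal/(mol K).

```python
import math

GAS_CONSTANT = 1.987204e-3  # molar gas constant in kcal/(mol K)

def shape_populations(configurations, temperature):
    """Boltzmann population of each shape label at the given temperature (K).

    `configurations` is an iterable of (relative_energy_kcal_per_mol, shape_label).
    """
    weights = {}
    for energy, shape in configurations:
        w = math.exp(-energy / (GAS_CONSTANT * temperature))
        weights[shape] = weights.get(shape, 0.0) + w
    total = sum(weights.values())
    return {shape: w / total for shape, w in weights.items()}

# Hypothetical configurations: two low-energy ones of shape "A", one of shape "B".
sampled = [(0.0, "A"), (0.5, "A"), (2.0, "B")]
cold = shape_populations(sampled, 300.0)
hot = shape_populations(sampled, 1000.0)
```

In this model the higher-energy shape "B" has a larger population at the higher temperature, in line with the qualitative statement that a greater variety of dynamic shapes becomes accessible as the temperature increases.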

Electronic excitations also have an important influence on molecular shapes. Typically, a
closed shell molecule with a singlet ground state electronic configuration has an
electronic density of shape different from any of the excited electronic states of the same
molecule. The configuration space shape invariance domains of potential energy
surfaces of different electronic states are usually rather different. The minimum energy
points may occur at different nuclear arrangements, and the point group symmetries of
the most stable nuclear arrangements along different potential surfaces are often
different. It is important to note that, even if the point group symmetries are the same,
the actual shapes of electron distributions are likely to be different.

2. Molecular shape representation by nuclear potential contours (NUPCO's)
Molecular shape is often represented by the electronic density, in particular, by families
of molecular isodensity contours (MIDCO's) calculated for selected threshold density
values a. Some of the elementary properties of MIDCO's and the topological
techniques for nonvisual, algorithmic shape analysis of calculated electron densities
have been discussed in detail in reference [1], with special emphasis on the density
domain approach (DDA) to chemical bonding, and on the quantum chemical
definition of functional groups of chemistry.

Whereas these techniques are useful for the study of many chemically relevant aspects
of molecular shape, they require rather time consuming electron density calculations,
and hence are not easily adaptable for large molecules. There is a need for alternative,
approximate techniques.
An early observation of Parr and Berk [2] is the basis for a simple, discrete
representation of molecular shapes and shape changes in chemical reactions, described
below. The method proposed in the present study is much simpler than the original
shape group analysis applied directly to MIDCO's; nevertheless, the approach retains
many of the fundamental features of MIDCO-based molecular shape analysis and
molecular similarity analysis techniques.
The potential Vn(r) generated by the nuclei, also called the "bare nuclear
potential" by Parr and Berk [2], is defined as

$$V_n(\mathbf{r}) = \sum_i \frac{Z_i}{|\mathbf{r} - \mathbf{R}_i|}, \qquad (1)$$

where Vn(r) generated at point r is calculated from the nuclear charges Zi of
positions defined by the formal nuclear position vectors Ri.
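
Equation (1) translates directly into a short function. The sketch below assumes atomic units (charges in units of the elementary charge, distances in bohr), which matches the absence of a prefactor in eq. (1); the function name and the H2-like test geometry with nuclei at x = +0.7 and x = -0.7 bohr are illustrative assumptions.

```python
import math

def nuclear_potential(r, nuclei):
    """Bare nuclear potential Vn(r) of eq. (1).

    `r` is a point (x, y, z); `nuclei` is a list of (Z_i, R_i) pairs, where
    R_i is the position (x, y, z) of nucleus i. The potential is undefined
    at a nuclear position (a ZeroDivisionError is raised there).
    """
    return sum(z / math.dist(r, position) for z, position in nuclei)

# Hypothetical H2-like arrangement: two unit charges 1.4 bohr apart.
h2 = [(1.0, (-0.7, 0.0, 0.0)), (1.0, (0.7, 0.0, 0.0))]
v_mid = nuclear_potential((0.0, 0.0, 0.0), h2)  # potential at the bond midpoint
```

At the bond midpoint each nucleus contributes 1/0.7, so the value is 2/0.7 in atomic units. No electronic structure calculation is needed, which is the practical appeal of the nuclear potential.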

Parr and Berk [2] have found that the isopotential contours of the nuclear potential
Vn(r) of simple molecules show a remarkable similarity to the actual MIDCO's of the
electronic ground states of these molecules. These nuclear isopotential contours can
serve as approximations of MIDCO's, and in general, they are suitable for an
approximate shape representation of molecules.

The isopotential contour surfaces of the nuclear potential have been referred to as
nuclear potential contours, or in short, NUPCO surfaces [1]. Some of their
advantages in molecular shape analysis problems are outlined in ref. [1].

The definition of NUPCO surfaces is based on the continuity properties of nuclear
potentials and on the fact that at infinite distance the nuclear potential vanishes. The first
condition ensures that isopotential surfaces can be generated, whereas the second
condition guarantees that for any positive contour value these isopotential contours are
closed (finite and bounded) surfaces. The nuclear potential Vn(r) is a continuous and
differentiable function of the position variable r, as long as r does not coincide with a
nuclear position, r ≠ Ri.

The level sets F(a) and their NUPCO boundary surfaces G(a) for any constant
nuclear potential value a are defined as

    $F(a) = \{ \mathbf{r} : V_n(\mathbf{r}) < a \}$,    (2)

and

    $G(a) = \{ \mathbf{r} : V_n(\mathbf{r}) = a \}$,    (3)

respectively. The value of nuclear potential is always positive or zero; clearly, there are
no NUPCO's G(a) with negative threshold parameter a.

Although representations of molecular shapes based on the electronic density
(MIDCO's) differ somewhat from the shape representations using nuclear potentials
(NUPCO's), their similarity can be exploited. NUPCO surfaces provide a simple
approximation of the shapes of molecules. These surfaces have a major advantage
over the MIDCO surfaces: the computation of NUPCO's is a trivially simple task as
compared to the calculation of electronic densities. Of course, the nuclear potential is a
useful molecular property in its own right, without any reference to its similarity to
the shape properties of electronic density. A comparison of NUPCO's of various
molecules is a valid tool for evaluating an important aspect of molecular similarity.

It is important to note that the superposition of nuclear potentials of different sets of
nuclei can result in similar composite nuclear potentials. Consequently, the
comparison of NUPCO's is better suited for an assessment of molecular similarity
than a direct comparison of nuclear arrangements (nuclear configurations). In
particular, the symmetry of a NUPCO is at least the same as (and possibly higher
than) the point symmetry of the nuclear arrangement.

If a sequence of NUPCO's is calculated for a series of nuclear configurations K
occurring along a formal reaction path p of a chemical process (reaction or
conformational rearrangement), then the shape analysis of NUPCO's gives a formal
description of the collective shape properties of these nuclear arrangements K, as well
as that of the chemical process p.

One should note that only some of the essential chemical changes are reflected in a
NUPCO shape analysis; in some instances, electronic excitations have only negligible
or no influence on the actual shape of NUPCO's. Evidently, NUPCO's are common
for all electronic states as long as the nuclear arrangement is the same. Consequently,
NUPCO's are not ideally suited to account for the shape differences between various
electronically excited states of molecules, especially in formal "vertical" excitations. In
an ideal vertical excitation the relative nuclear geometry remains invariant.

Note, however, that electronic excitations are often accompanied by small changes of
the optimum nuclear arrangement K, and NUPCO's can be used for an approximate
description of these contributions to the overall shape changes caused by the electronic
excitations.

NUPCO's do not approximate MIDCO's uniformly for all choices of nuclear
potential or electron density values. A high degree of shape similarity between
NUPCO's and MIDCO's is expected in the high density and high potential core
regions near the nuclei, where both contours are essentially spherical. By contrast, in
the chemically most important valence shell regions of electronic charge density the
dominant electron-nuclear interaction is substantially modified by electron-electron
interactions, hence one finds greater differences between the shapes of MIDCO's and
NUPCO's. Nevertheless, even in these peripheral, valence shell regions of molecules,
the similarities are significant, and NUPCO surfaces provide a valid approximate
model for molecular shapes. An evaluation of the similarities of NUPCO's is a
computationally feasible tool for the analysis of the similarities of both small and large
molecules.

The analysis of NUPCO's also provides alternatives for various dot representations of
molecular shapes. A technique called the fused spheres guided homotopy (FSGH)
method [1] is based on the generation of point sets on spherical surfaces about each
nucleus of a molecule in a systematic manner, designed to facilitate the construction of a
semi-uniform distribution of points ("dots") along MIDCO's, using simple
interpolation. The same technique is directly applicable for dot representations of
NUPCO surfaces. An improvement of the original FSGH technique is obtained if the
supporting, "guiding" set of spheres is replaced by a "guiding" series of NUPCO's
G(ai) for a sequence of nuclear potential values a1, a2, ..., ai, ..., am. Points
along these NUPCO's can then be used for the interpolation of electron density
values, eventually leading to a dot representation of the actual MIDCO. Whereas the
computation of NUPCO's is somewhat more time consuming than the generation of a
sequence of spheres, NUPCO's provide a more faithful representation of the shape of
the MIDCO's, hence the approximate dot representation is also more accurate.

3. Topological patterns of NUPCO sequences


Consider a molecule of n nuclei, and order these nuclei into a sequence of decreasing
nuclear charge:

    $Z_1 \geq Z_2 \geq \dots \geq Z_n$.    (4)

For the reaction path p considered, take the initial nuclear configuration,

    $K_0 = p(0)$.    (5)

Take a high enough nuclear potential threshold value a1 that fulfills the following
conditions:

(i) each maximum connected component of the G(K0,a1) NUPCO surface
contains precisely one nucleus,

(ii) the G(K0,a1) NUPCO surface has the maximum number of maximum
connected components, subject to condition (i).

In other words, the various nuclear neighborhoods are not joined yet (condition (i)),
and G(K0,a1) has the maximum number of such atomic neighborhoods (condition
(ii)).
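Conditions (i) and (ii) can be checked numerically by counting the separate high-potential regions bounded by the contour. The sketch below does this on a two-dimensional grid in the molecular plane, which is a simplification that assumes coplanar nuclei; the grid limits, the resolution, the 4-neighbor connectivity and the names `potential_2d` and `count_components` are all illustrative assumptions rather than part of the original method.

```python
import math
from collections import deque

def potential_2d(x, y, nuclei):
    """Bare nuclear potential in the molecular plane; inf at a nucleus."""
    total = 0.0
    for charge, (nx, ny) in nuclei:
        distance = math.hypot(x - nx, y - ny)
        if distance == 0.0:
            return math.inf
        total += charge / distance
    return total

def count_components(nuclei, a, lo=-3.0, hi=3.0, steps=121):
    """Count 4-connected grid regions where V_n >= a (regions bounded by G(a))."""
    h = (hi - lo) / (steps - 1)
    inside = [[potential_2d(lo + i * h, lo + j * h, nuclei) >= a
               for j in range(steps)] for i in range(steps)]
    seen = [[False] * steps for _ in range(steps)]
    components = 0
    for i in range(steps):
        for j in range(steps):
            if not inside[i][j] or seen[i][j]:
                continue
            components += 1          # new region found: flood-fill it
            seen[i][j] = True
            queue = deque([(i, j)])
            while queue:
                ci, cj = queue.popleft()
                for ni, nj in ((ci + 1, cj), (ci - 1, cj), (ci, cj + 1), (ci, cj - 1)):
                    in_grid = 0 <= ni < steps and 0 <= nj < steps
                    if in_grid and inside[ni][nj] and not seen[ni][nj]:
                        seen[ni][nj] = True
                        queue.append((ni, nj))
    return components

# Hypothetical example: two unit charges at (-1, 0) and (1, 0).
two_unit_charges = [(1.0, (-1.0, 0.0)), (1.0, (1.0, 0.0))]
separate = count_components(two_unit_charges, 3.0)   # high threshold
joined = count_components(two_unit_charges, 1.5)     # low threshold
```

For this example the high threshold yields one region per nucleus, as required of a1, while the low threshold yields a single joined region. The sketch only counts regions; verifying that each one encloses exactly one nucleus would need an additional membership test.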

The maximum connected components of the corresponding NUPCO G(K0,a1) are
denoted by

    $G_{11}(K_0,a_1), G_{12}(K_0,a_1), \dots, G_{1n}(K_0,a_1)$.    (6)

At the nuclear potential threshold a1, the NUPCO component enclosing nucleus j is
denoted by G1j(K0,a1). We assume that all these NUPCO components are
topological spheres.

This family of NUPCO components is represented by the numerical sequence

    $k_{11}, k_{12}, \dots, k_{1n}$,    (7)

where k1j is defined by

    $k_{1j} = \begin{cases} Z_j, & \text{if the } j\text{-th nucleus is enclosed by a component of } G(K_0,a_1), \\ 0, & \text{otherwise.} \end{cases}$    (8)

As the nuclear potential threshold a1 is decreased to a new value a2,

    $a_2 < a_1$,    (9)

some of the components G1j(K0,a1) and G1j'(K0,a1) may expand and join to form a
single maximum connected component G2j(K0,a2). Here we assume that

    $j > j'$.    (10)

We choose the value a2 as the nuclear potential threshold where the first such joining
of components occurs.
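For the simplest case of only two nuclei the joining threshold can be written in closed form, which gives a useful check on numerical work. The derivation below is a sketch added for illustration; the symbols d (internuclear distance), x (distance from nucleus 1 along the internuclear axis) and x* (the point where the two components first touch) are introduced here and do not appear in the original text.

```latex
% Two nuclei with charges Z_1 and Z_2 separated by d; 0 < x < d on the axis.
V_n(x) = \frac{Z_1}{x} + \frac{Z_2}{d - x},
\qquad
\frac{dV_n}{dx} = -\frac{Z_1}{x^2} + \frac{Z_2}{(d - x)^2} = 0
\;\Longrightarrow\;
x^* = \frac{d\,\sqrt{Z_1}}{\sqrt{Z_1} + \sqrt{Z_2}},
\qquad
a_2 = V_n(x^*) = \frac{\left(\sqrt{Z_1} + \sqrt{Z_2}\right)^2}{d}.
```

The axial minimum is a saddle point of the three-dimensional potential, since the potential decreases on moving away from the axis. For two unit charges at a distance d = 2 this gives a2 = 2, the value of the potential at the midpoint.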

In general, a new threshold value ai is specified for each occasion where a
topologically significant change of the NUPCO components occurs, that is, where two
or more components join or where their topology changes, for example, where a
toroidal NUPCO becomes a topological sphere. These critical threshold values form a
sequence,

    $a_1 > a_2 > \dots > a_i > \dots > a_m$,    (11)

where

    $m = m(K) = m(K_0)$    (12)

is the total number of topological types of the NUPCO's occurring at nuclear
configuration K = K0.
For each critical nuclear potential threshold value ai, and for the corresponding new
family of maximum connected components Gij(K0,ai) of NUPCO G(K0,ai), a new
numerical sequence is specified,

    $k_{i1}, k_{i2}, \dots, k_{in}$,    (13)

where
(14)

index j' is the smallest index of any nucleus enclosed by the component Gij(K0,ai)
enclosing nucleus j, and where
(15)

One should notice that the imaginary component of each number k11, k12, ..., k1n
of the first sequence is zero, which agrees with the assumption that the initial NUPCO
components are topological spheres. Consequently, the numbers k11, k12, ..., k1n,
which are real integers, also satisfy the conditions of the general
definition given for the numerical sequence ki1, ki2, ..., kin of a generic index i.
In the general case, the numbers ki1, ki2, ..., kin are complex with integer
components, including the possibility of zero imaginary parts. Clearly,

    $G_{ij'}(K_0,a_i) = G_{ij}(K_0,a_i)$    (16)

for indices satisfying eq. (14).

For each nuclear configuration K there are m(K) such sequences ki1, ki2, ..., kin,
i = 1, 2, ..., m(K). These sequences can be arranged into m(K) x n matrices,
which can be augmented by mmax - m(K) rows of zeroes, where

    $m_{\max} = \max\{ m(K),\ K \in p \}$.    (17)

The resulting, augmented mmax x n matrices N(K) have the form

    $N(K) = \begin{pmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & \vdots & & \vdots \\ k_{m(K)1} & k_{m(K)2} & \cdots & k_{m(K)n} \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}$.    (18)
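The zero padding of eqs. (17) to (19) is straightforward to mirror in code. The sketch below stores N(K) as a tuple of rows of complex numbers, so that matrices can be compared and hashed; the function names and the water-like example row (charges 8, 1, 1, all nuclei enclosed at a1 as in eq. (8)) are assumptions for illustration. Rows beyond the first depend on eqs. (14) and (15) and are therefore not generated here.

```python
def augmented_matrix(sequences, m_max):
    """Stack the sequences k_i1..k_in as rows and pad with zero rows, as in eq. (18)."""
    if len(sequences) > m_max:
        raise ValueError("m_max must be at least the number of sequences m(K)")
    n = len(sequences[0])
    if any(len(row) != n for row in sequences):
        raise ValueError("all sequences must have the same length n")
    rows = [tuple(complex(k) for k in row) for row in sequences]
    rows += [(0j,) * n] * (m_max - len(rows))   # m_max - m(K) rows of zeroes
    return tuple(rows)

def threshold_vector(thresholds, m_max):
    """Critical thresholds a_1..a_m padded with zeroes to length m_max, as in eq. (19)."""
    return tuple(thresholds) + (0.0,) * (m_max - len(thresholds))

# Hypothetical example: n = 3 nuclei with charges 8, 1, 1 and only the first row known.
N_example = augmented_matrix([(8, 1, 1)], m_max=3)
a_example = threshold_vector([12.5], m_max=3)   # 12.5 is an arbitrary placeholder value
```

Using immutable tuples is a design choice that lets matrices be compared directly with `==` when invariance along a path is tested.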
For the static nuclear configuration K the matrix N(K) describes the pattern of the
topological structure of the nuclear potential.

In addition to matrix N(K), an mmax-dimensional vector a(K) containing the critical
threshold values a1, a2, ..., am, and an appropriate number of zeroes as elements can
also be specified,

    $a(K) = (a_1, a_2, \dots, a_m, 0, \dots, 0)'$,    (19)

where we consider column vectors and the symbol ' stands for transpose. The matrix
N(K) combined with vector a(K) provides a more detailed description of the shape
of the NUPCO sequence of nuclear configuration K.
The information stored in matrix N(K) can be represented by a labeled graph d(K).
The n vertices of d(K) are labeled by the serial indices of the nuclei (the column
index j' of matrix N(K)). In addition, each vertex j' is labeled by a sequence
of complex numbers zj't, t = 1, 2, ..., defined by

    $\mathrm{Re}(z_{j't}) = i'$,    (20)

where i' is the index of the potential threshold value ai' where the t-th topological
change of Gi'j'(K0,ai') occurs, and by

    $\mathrm{Im}(z_{j't}) = \Delta(\mathrm{genus})$    (21)

of the topological change, as long as for this change j' is the smallest nucleus index
within the NUPCO.

There is an edge from vertex j to vertex j' if at the nuclear potential threshold ai the
nucleus j is contained in the NUPCO component Gij'(Ko,ai), where j' is the
smallest index of any nucleus enclosed by the NUPCO component containing nucleus
j, and where ai is the largest potential threshold value where this holds. The (j,j')
edge is labeled by index i.
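The edge rule above can be sketched as follows, assuming the joining history is already known as one partition of the nuclei per critical threshold. The input format (a list of partitions, each a list of sets of nucleus indices) and the function name `shape_graph_edges` are assumptions for illustration; the complex vertex labels of eqs. (20) and (21) are not generated here.

```python
def shape_graph_edges(partitions):
    """Return {(j, j'): i} for the labeled edges of d(K).

    partitions[i - 1] lists the nuclei enclosed by each NUPCO component at the
    critical threshold a_i; thresholds decrease with increasing i, so the first
    i at which j shares a component with smallest index j' corresponds to the
    largest threshold value where this holds.
    """
    edges = {}
    for i, components in enumerate(partitions, start=1):
        for component in components:
            j_min = min(component)              # smallest nucleus index in the component
            for j in component:
                if j != j_min and (j, j_min) not in edges:
                    edges[(j, j_min)] = i
    return edges

# Hypothetical joining history for three nuclei:
history = [
    [{1}, {2}, {3}],     # a_1: all nuclear neighborhoods separate
    [{1}, {2, 3}],       # a_2: components of nuclei 2 and 3 join
    [{1, 2, 3}],         # a_3: everything joins
]
edges = shape_graph_edges(history)
```

For this history nucleus 3 first joins a component led by nucleus 2 at index 2, and both 2 and 3 join the component led by nucleus 1 at index 3.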

The edge indices can be used to assign a direction to each edge, for example, the
direction from higher to lower nuclear index as given in list (7), turning the edges into
arcs and d(K) into a digraph. Digraph d(K) is a discrete representation of the
topological pattern of the nuclear potential of the molecule.

An alternative digraph representation da(K) is obtained from d(K) if one replaces the
integer arc labels i with the real number labels ai, and the real parts i' of the vertex labels
with the actual threshold values ai' where the topological changes occur. This
approach takes into account all information represented by matrix N(K) and vector
a(K). Note that digraph da(K) is no longer a discrete representation of the nuclear
potential.

In the following discussion we shall use the matrix representations N(K) and vectors
a(K), convenient for computer manipulations.

4. Shape changes of NUPCO sequences along reaction paths and in conformational domains
In chemical processes, such as the process modeled by the formal reaction path p, the
nuclear arrangement is not static, and the matrix N(K) as well as the vector a(K) also
change as the nuclear configuration K varies. Consider the path p as a mapping from
the unit interval I to the metric nuclear configuration space M [3],

    $p: I = [0,1] \to M$,    (22)

parametrized as

    $p = p(u), \quad 0 \leq u \leq 1$,    (23)

where u = 0 corresponds to the initial point

    $K_0 = p(0) \in M$,    (24)

whereas u = 1 corresponds to the nuclear configuration

    $K_f = p(1) \in M$    (25)

of the formal product. Note that for most nuclear configurations K(u) along the path
p(u), a small displacement du does not alter the topological pattern N(K(u)) of
NUPCO sequences,

    $N(K(u)) = N(K(u + du))$,    (26)

although the numerical values of the critical threshold values, stored in vector a(K), are
likely to change:

    $a(K(u)) \neq a(K(u + du))$.    (27)

Along the entire path p(u), there are only a finite number of different N(K(u))
matrices of NUPCO sequences, and path p(u) can be decomposed into a finite
number w of invariance intervals,

    $p_{N,1}, p_{N,2}, \dots, p_{N,w}$.    (28)

The topological pattern of nuclear potential and its variation along the path p can be
characterized by the sequence

    $N(p,1), N(p,2), \dots, N(p,w)$    (29)

of matrices and by the w-1 parameter values

    $u_{N,1}, u_{N,2}, \dots, u_{N,w-1}$,    (30)

marking the endpoints of the first w-1 of the invariance intervals pN,1, pN,2, ...,
pN,w-1 along path p.
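In practice the decomposition of eqs. (28) to (30) would be obtained from a finite set of sampled points along the path. The sketch below groups consecutive samples with identical matrices; estimating each boundary as the midpoint between the two neighboring samples is an assumption made here, as are the function name and the placeholder labels "A", "B", "C" that stand in for actual N(K) matrices.

```python
def invariance_intervals(samples):
    """Group path samples into invariance intervals.

    samples : list of (u, N) pairs with increasing u, where N is any comparable
              representation of the matrix N(K(u)).
    Returns (matrices, boundaries): the sequence N(p,1)..N(p,w) and estimates of
    the w-1 parameter values u_{N,1}..u_{N,w-1}.
    """
    matrices = []
    boundaries = []
    previous_u = None
    for u, matrix in samples:
        if not matrices:
            matrices.append(matrix)
        elif matrix != matrices[-1]:
            # The pattern changed between the previous sample and this one.
            boundaries.append(0.5 * (previous_u + u))
            matrices.append(matrix)
        previous_u = u
    return matrices, boundaries

# Hypothetical samples along a path; the letters stand in for N(K) matrices.
path_samples = [(0.0, "A"), (0.25, "A"), (0.5, "B"), (0.75, "B"), (1.0, "C")]
matrix_sequence, u_values = invariance_intervals(path_samples)
```

A pattern that recurs later along the path (for example A, B, A) is kept as a separate interval, since the intervals are ordered along p. A coarse sampling can miss short intervals, so the number w found this way is only a lower bound.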
Note that a reparametrization of path p is always possible that can change the actual
uN,1, uN,2, ..., uN,w-1 values while preserving their monotonic increase along the
path p. If, however, the parametrization (23) reflects some physical condition, for
example, if it is defined by proportionality with the distance given in the metric nuclear
configuration space M [3], then the uN,1, uN,2, ..., uN,w-1 values also reflect
physically relevant information.

Two reaction paths p1 and p2 are regarded as shapewise equivalent within the above
context (N-shape equivalent) if and only if the numbers w1 and w2 of their
invariance intervals agree,

    $w_1 = w_2 = w$,    (31)

and the two matrix sequences

    $N(p_1,1), N(p_1,2), \dots, N(p_1,w)$    (32)

and

    $N(p_2,1), N(p_2,2), \dots, N(p_2,w)$    (33)

are the same,

    $\{ N(p_1,1), N(p_1,2), \dots, N(p_1,w) \} = \{ N(p_2,1), N(p_2,2), \dots, N(p_2,w) \}$.    (34)

The shapewise equivalence of reaction paths p1 and p2 according to a matrix
sequence (32) is denoted by

    $p_1 \sim_N p_2$.    (35)

The corresponding equivalence class is denoted by P1, where the index 1 is
inherited from the arbitrarily chosen representative path p1 of the class P1.
Evidently,
(36)

By a development entirely analogous with the homotopy equivalence classes of paths
and loops in configuration space, leading to the fundamental group of reaction
mechanisms [3], the above shape equivalence of reaction paths p in M generates a
complete shape classification of paths for the given stoichiometry of nuclei.
Note that this shape representation (N-shape) does not give information on the
electronic state, as NUPCO's are common for all electronic states of molecules.

Formal chemical species are represented by catchment regions C(λ,i) of M with
respect to a specified electronic state and the associated potential energy hypersurface
E(K). A catchment region is the collection of all nuclear configurations that are starting
points of steepest descent paths leading to a common critical point of the given potential
surface. In the C(λ,i) notation λ is the index of the critical point (λ = 0 for a
minimum and λ = 1 for a simple saddle point of a transition structure), whereas i is a
serial index.

A subclass P1(C(λ,i), C(λ',i')) of class P1 is the family of all paths from class P1
which start at the catchment region C(λ,i), end at the catchment region C(λ',i'), and
are homotopic to one another (continuously deformable into one another) while
preserving these properties. Evidently, the above relation is an equivalence relation
among paths, and P1(C(λ,i), C(λ',i')) is an equivalence class. Such an equivalence
class P1(C(λ,i), C(λ',i')) represents a formal reaction mechanism defined in terms of
shape (N-shape).

5. Shape changes of NUPCO's in conformational changes and in molecular deformations; NUPCO shape invariance domains of the configuration space.
In studies of molecular deformations preserving chemical identity, and in other
treatments of molecular motions, it is often inconvenient to rely on formal paths in
configuration space M. Instead, various domains of the configuration space M are
considered. The formal catchment regions C(λ,i) of M with respect to a specified
electronic state and the associated potential energy hypersurface E(K) have been
suggested as representatives of configurational families preserving chemical identity
[3]. In the context of dynamic shape analysis and the dynamic shape space, originally
proposed for shape classifications in terms of the shape groups of the electronic
density, the nuclear configuration space M has been subdivided into topological shape
invariance domains [4]. Whereas this approach has led to a detailed shape description
for the specified electronic state, a similar approach adapted to NUPCO's provides an
essential connection among the individual shape variations along the ground and excited
electronic state potential surfaces.

For each nuclear configuration K the NUPCO matrix N(K) is defined as in eq.
(18), using a modified choice for mmax,

    $m_{\max} = \max\{ m(K),\ K \in M \}$.    (37)

Alternatively, at the sacrifice of a uniform dimension of these matrices, the additional
zero rows of these matrices can be omitted. With either choice, the NUPCO shape
invariance domains of the metric nuclear configuration space M are defined in terms
of invariance domains of these matrices. For any nonpathological nuclear configuration
space M there are a finite number q of different matrices N(K), listed according to
some representative nuclear configurations

    $K_1, K_2, \dots, K_k, \dots, K_q$,    (38)

hence there are only a finite number of NUPCO shape invariance domains by the
above matrix criterion,

    $M_{N,1}, M_{N,2}, \dots, M_{N,k}, \dots, M_{N,q}$.    (39)

The union of these NUPCO shape invariance domains is the entire nuclear
configuration space M,

    $M = \bigcup_{k=1}^{q} M_{N,k}$.    (40)

The metric of the nuclear configuration space M allows one to introduce a measure of
volume V for subsets of M. Measures of the relative importance sc(k,i) of shape
type k (N-shape type k) for various individual chemical species C(λ,i) and the
contribution cs(i,k) of a chemical species C(λ,i) to a given shape type k can be
specified by the following volume ratios:

    $sc(k,i) = V(M_{N,k} \cap C(\lambda,i)) / V(C(\lambda,i))$    (41)

and

    $cs(i,k) = V(M_{N,k} \cap C(\lambda,i)) / V(M_{N,k})$,    (42)

respectively. Since the catchment regions give a complete partitioning of the
configuration space M (see ref. [3] for the treatment of pathological cases), and as a
consequence of relation (40), one obtains

    $\sum_k sc(k,i) = 1$,    (43)

and by taking into account the implicit λ-dependence,

    $\sum_{\lambda,i} cs(i,k) = 1$.    (44)
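The volume ratios (41) and (42) can be estimated by counting sampled nuclear configurations. The sketch below assumes that configurations have been sampled uniformly from a bounded part of M and that each sample already carries a shape type label k and a species label (λ, i); the sampling scheme, the function name and the label values are assumptions for illustration.

```python
from collections import Counter

def shape_species_ratios(samples):
    """Estimate sc(k,i) and cs(i,k) of eqs. (41)-(42) as count ratios.

    samples : list of (k, species) pairs, one per sampled configuration,
              where species is a (lambda, i) tuple labeling C(lambda, i).
    """
    pair_counts = Counter(samples)                          # ~ V(M_N,k ∩ C)
    shape_counts = Counter(k for k, _ in samples)           # ~ V(M_N,k)
    species_counts = Counter(s for _, s in samples)         # ~ V(C)
    sc = {(k, s): c / species_counts[s] for (k, s), c in pair_counts.items()}
    cs = {(s, k): c / shape_counts[k] for (k, s), c in pair_counts.items()}
    return sc, cs

# Hypothetical labeled samples: shape types 1 and 2, three chemical species.
labeled = [(1, (0, 1)), (1, (0, 1)), (2, (0, 1)), (2, (0, 2)), (2, (1, 1))]
sc, cs = shape_species_ratios(labeled)

# Normalization checks corresponding to eqs. (43) and (44).
sc_sum_species_01 = sum(v for (k, s), v in sc.items() if s == (0, 1))
cs_sum_shape_2 = sum(v for (s, k), v in cs.items() if k == 2)
```

Both sums equal 1 up to rounding, which is the consistency test suggested by relations (43) and (44).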

The above two relations can be used as a test for results obtained for individual shape
and species contributions.

The similarities between various reaction mechanisms can be analysed and quantified
by direct comparison of their matrix sequences. A numerical similarity measure of
reaction mechanisms is based on a measure of difference between the two matrix
sequences on the two sides of eq. (34): the smaller the difference, the greater the
similarity. The extreme case of shapewise equivalence of reaction mechanisms is
represented by eq. (34).
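The text leaves the choice of the difference measure open. One simple possibility, shown purely as an assumed example, is the sum of the absolute values of all entrywise differences, with the shorter sequence padded by zero matrices; neither this particular measure nor the name `sequence_difference` comes from the original study, and the matrix entries below are placeholders.

```python
def sequence_difference(sequence_1, sequence_2):
    """Sum of |k - k'| over all entries of two matrix sequences.

    Each matrix is a tuple of rows; all matrices are assumed to share one shape.
    A missing matrix in the shorter sequence is treated as a zero matrix.
    A result of 0 corresponds to the shapewise equivalence of eq. (34).
    """
    rows = len(sequence_1[0])
    cols = len(sequence_1[0][0])
    zero = tuple((0j,) * cols for _ in range(rows))
    length = max(len(sequence_1), len(sequence_2))
    total = 0.0
    for index in range(length):
        m1 = sequence_1[index] if index < len(sequence_1) else zero
        m2 = sequence_2[index] if index < len(sequence_2) else zero
        for row1, row2 in zip(m1, m2):
            for k1, k2 in zip(row1, row2):
                total += abs(k1 - k2)
    return total

# Hypothetical matrices with placeholder entries.
m_a = ((8, 1, 1), (0, 0, 0))
m_b = ((8, 2, 0), (0, 0, 0))
same = sequence_difference([m_a, m_b], [m_a, m_b])
different = sequence_difference([m_a, m_b], [m_a, m_a])
```

A smaller value indicates greater similarity; other norms or weightings of the rows would serve equally well as a starting point.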

6. Local shape invariance of NUPCO's and the transfer of functional groups in chemical reactions.
In an earlier study [5] a general scheme has been proposed for the analysis of the
interrelations and transformations of flexible functional groups within the configuration
space M of all possible chemical species of a fixed overall stoichiometry S. Using
topological criteria for the flexibility of various functional groups, and information on
the fundamental pattern of the distribution of configurational families within the nuclear
configuration space, a concise description has been given for the occurrence and
interconversion of various functional groups in various families of nuclear
arrangements during chemical reactions and conformational changes.

Distortions and strains in a molecule may alter the identity of a functional group. A
large enough local distortion of some molecular moiety may qualify as an actual
chemical reaction that changes one functional group into another. For example, by an
appropriate (large) local distortion, a -CH2-O-CH3 group may be converted into
the group -CH2-CH2-OH.
A rather general treatment of such problems has been formulated [5] as follows.
Consider a family f of some functional groups ft generated by some or all of the
N atoms associated with the given stoichiometry S that defines the actual nuclear
configuration space M:

    $f = \{ f_1, f_2, \dots, f_t, \dots, f_w \}$.    (45)

For convenience in the terminology, and for the sake of generality, among these w
functional groups we may include two extreme cases as formal "functional groups":
the individual atoms, and entire molecules, possibly containing all the N atoms.
A larger functional group may contain some smaller ones, for example, the
(46)

group contains the

(47)

group. The above chemical inclusion relation is denoted by

    $f_t < f_s$.    (48)

Of course, it is possible that for many other pairs of functional groups of such a family
f no such containment relation exists; for example, this is the case for functional
groups ft' = -CH2-OH and fs' = -CH2-CH3, since neither contains the
other. Since not all functional groups are interrelated, the relation < defines only a
partial order in the family f.

Two cases are of special importance. If there exists a functional group f1 which is
contained in all the other functional groups f2, ..., fw of the family f,

    $f_1 < f_t, \quad t = 2, \dots, w$,    (49)

then this functional group f1 may serve as an infimum, and with relation < as the
partial order, f becomes a lower semilattice. For example, if the family f is such
that each member fs contains a common atom, for example, the H atom, then the <
partial order relation implies that family f is a lower semilattice, with f1 = H as
infimum [5].

If the family f contains only one, unique isomer for the structure involving all the N
nuclei of the given stoichiometry S (where this isomer is regarded as a formal
functional group fw), then fw can be taken as a supremum, provided that

    $f_t < f_w, \quad t = 1, \dots, w-1$.    (50)

In this case, family f is an upper semilattice. If both relations (49) and (50) hold,
that is, if both infimum and supremum exist within family f, with respect to the <
partial order relation, then the family f of functional groups forms a lattice. Such
lattices are useful tools for systematic treatments of hierarchies.
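The partial order and the search for an infimum or supremum can be illustrated with a small sketch. Here each functional group is modeled simply as a set of labeled atoms and the inclusion relation < as a proper subset relation; this ignores bonding and is an assumption made for illustration, as are the atom labels and function names.

```python
def is_contained(f_t, f_s):
    """Chemical inclusion f_t < f_s, modeled as a proper subset of atom labels."""
    return f_t < f_s

def infimum(family):
    """Return the member contained in all other members, or None."""
    for candidate in family:
        if all(candidate == other or is_contained(candidate, other) for other in family):
            return candidate
    return None

def supremum(family):
    """Return the member containing all other members, or None."""
    for candidate in family:
        if all(candidate == other or is_contained(other, candidate) for other in family):
            return candidate
    return None

# Hypothetical atom labelings for a few groups.
H = frozenset({"Hc"})
OH = frozenset({"O1", "Hc"})
CH2OH = frozenset({"C1", "Ha", "Hb", "O1", "Hc"})
CH2CH3 = frozenset({"C1", "Ha", "Hb", "C2", "Hd", "He", "Hf"})

chain = [H, OH, CH2OH]          # totally ordered: H < OH < CH2OH
unrelated = [CH2OH, CH2CH3]     # neither group contains the other
```

For the chain, the common H atom is the infimum and -CH2-OH the supremum; the pair -CH2-OH and -CH2-CH3 has neither within the family, matching the partial-order example given in the text.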
The goal of the earlier study [5] was to generate a concise scheme for the study of the
presence and interrelations of functional groups among all possible nuclear
configurations of a given stoichiometry S. If two geometrical arrangements of a given
collection of atoms are similar enough, then we may consider these two arrangements
as representations of the same functional group. In order to account for the nonrigid,
flexible nature of functional groups, and for the allowed, minor geometry changes
which do not change their chemical identity, a topological "tolerance" range has been
selected for these groups. For each functional group ft a range Tt of allowed
geometrical arrangements is specified; within the given range each functional group ft
is regarded to preserve its chemical identity. The family of all these ranges Tt has
been denoted by T,

    $T = \{ T_1, T_2, \dots, T_t, \dots, T_w \}$.    (51)

All geometrical realizations of a functional group ft with geometrical arrangements
falling within the limitations specified by set Tt are regarded as topologically
equivalent. Here each set Tt is taken as a topological "tolerance specification" of the
geometrical distortions preserving the chemical identity of functional group ft. In the
earlier study [5], a nuclear displacement less than 25% of the van der Waals atomic
radius for each atom has been used as an example for a simple tolerance criterion.
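The 25% displacement criterion quoted above is simple to express in code. The sketch below assumes that the reference and distorted geometries list the same atoms in the same order and are already expressed in a common frame; the radii are the commonly tabulated Bondi values in Å, and the geometry and function name are hypothetical.

```python
import math

# Bondi van der Waals radii in angstrom (an assumed choice of radius set).
VDW_RADIUS = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}

def within_tolerance(reference, distorted, fraction=0.25):
    """True if every atom moved by less than fraction * (its vdW radius).

    reference, distorted : lists of (element, (x, y, z)) in the same atom order.
    """
    for (element, r_ref), (_, r_new) in zip(reference, distorted):
        if math.dist(r_ref, r_new) >= fraction * VDW_RADIUS[element]:
            return False
    return True

# Hypothetical O-H fragment with the H atom displaced along the bond.
oh_reference = [("O", (0.0, 0.0, 0.0)), ("H", (0.96, 0.0, 0.0))]
oh_small = [("O", (0.0, 0.0, 0.0)), ("H", (1.16, 0.0, 0.0))]   # H moved by 0.20
oh_large = [("O", (0.0, 0.0, 0.0)), ("H", (1.36, 0.0, 0.0))]   # H moved by 0.40
```

With a limit of 0.30 Å for hydrogen, the first distortion stays within the tolerance range and the second does not.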

In terms of NUPCO's a new set of topological criteria can be defined for the
preservation of some essential properties of functional groups. One advantage of a
NUPCO analysis is that molecular fragments can be easily identified, simply by taking
a subset of the nuclei and the nuclear potential generated by this subset. The modeling
of functional groups can be based on such subsets. The entire treatment described for
molecular NUPCO's G(K,aj) in the previous sections is directly applicable for
functional groups, leading to an identifiable NUPCO,

    $G(f_t, K, a_j)$,    (52)

for each functional group ft of the family f.

Another advantage is the fact that the G(ft,K,aj) functional group NUPCO's are
additive, as implied by eq. (1). If the members ft of set f = {f1, f2, ..., ft, ..., fw}
of formal functional groups are mutually disjoint,

    $f_t \cap f_{t'} = \emptyset, \quad t \neq t'$,    (53)

and if the union of these functional groups is the given molecule c(K) of nuclear
configuration K,

    $c(K) = \bigcup_{t=1}^{w} f_t$,    (54)

then the molecular NUPCO G(K,aj) of the molecule c(K) can be obtained as the
sum of the individual functional group NUPCO's G(ft,K,aj),

    $G(K,a_j) = \sum_{t=1}^{w} G(f_t,K,a_j)$.    (55)
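The additivity rests on the linearity of eq. (1) in the nuclear charges. As a short worked step, with the fragment potential symbol $V_n^{(f_t)}$ introduced here only for illustration, the sum over all nuclei of a molecule that is partitioned into disjoint functional groups splits group by group:

```latex
V_n(\mathbf{r})
  = \sum_{i \in c(K)} \frac{Z_i}{|\mathbf{r} - \mathbf{R}_i|}
  = \sum_{t=1}^{w} \sum_{i \in f_t} \frac{Z_i}{|\mathbf{r} - \mathbf{R}_i|}
  = \sum_{t=1}^{w} V_n^{(f_t)}(\mathbf{r}).
```

The disjointness condition (53) ensures that no nucleus is counted twice, and condition (54) ensures that none is omitted.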

If the NUPCO analysis techniques described in the earlier sections are applied to
individual functional group NUPCO's G(ft,K,aj), then the new set of criteria is also
given in terms of configuration space invariance domains of these very functional group
NUPCO's G(ft,K,aj). This approach gives a characterization that is finer than the
mere preservation of chemical identity of a functional group ft.

By analogy with the molecular case, for each nuclear configuration K the functional
group NUPCO's G(ft,K,aj) generate a functional group NUPCO matrix

    $N(f_t, K)$    (56)

for each formal functional group ft. Matrix N(ft,K) is defined as in eq. (18), using
an appropriate choice for mmax,

    $m_{\max} = \max\{ m(K),\ K \in M \}$.    (57)

The functional group NUPCO shape invariance domains of the metric nuclear
configuration space M are defined in terms of invariance domains of these N(ft,K)
matrices. Additional conditions can be specified in order to avoid treating dissociated
fragments as formal functional groups, for example, by specifying a subset of the
configuration space M where only very low nuclear potential contours enclose all of
the (distant) nuclei of the formal functional group ft. This subset can be taken as the
collection of nuclear configurations K where ft is not realized as a chemically
recognizable functional group.

With the above provision, for any nonpathological nuclear configuration space M
there are only a finite number q of different functional group NUPCO matrices
N(ft,Kv),

    $N(f_t,K_1), N(f_t,K_2), \dots, N(f_t,K_v), \dots, N(f_t,K_q)$,    (58)

listed according to some representative nuclear configurations

    $K_1, K_2, \dots, K_v, \dots, K_q$.    (59)

Consequently, there are only a finite number of functional group NUPCO shape
invariance domains by the above matrix criterion, where these invariance domains are
denoted by

    $M_{N,t,1}, M_{N,t,2}, \dots, M_{N,t,v}, \dots, M_{N,t,q}$,    (60)

using the same ordering as in eqs. (58) and (59).

The union of these functional group NUPCO shape invariance domains is the entire
nuclear configuration space M,

    $M = \bigcup_{v=1}^{q} M_{N,t,v}$.    (61)

The above partitioning scheme of the nuclear configuration space M can be repeated
for each functional group ft of the family f. When the intersections of all these
invariance domains are considered collectively, then one obtains a (usually) finer
subdivision of the nuclear configuration space M, where each such domain MN,f,t,v
is defined by the condition that within a given MN,f,t,v the invariance is valid for each
functional group ft of the family f, and each MN,f,t,v is a maximum connected set
with this property. Of course, the union of all these collective NUPCO shape
invariance domains of functional groups ft of the family f also generates the entire
nuclear configuration space M,

    $M = \bigcup_{t=1}^{w} \bigcup_{v=1}^{q} M_{N,f,t,v}$.    (62)

The nuclear configuration space M can be characterized by the distribution of either the
MN,t,v subsets or that of the more detailed MN,f,t,v subsets of M. In either case, one
can apply the definition of a neighbor relation n(A,B) between two arbitrary subsets
A and B of a nuclear configuration space M [3],

    $n(A,B) = \begin{cases} 1, & \text{if } A \cap \mathrm{clos}(B) \neq \emptyset \text{ or } \mathrm{clos}(A) \cap B \neq \emptyset, \\ 0, & \text{otherwise,} \end{cases}$    (63)

where clos(A) is the closure of set A in the metric of space M (in simple terms, the
set A together with all its boundary points).

Using the above neighbor relation, the pattern of NUPCO invariance domain
distribution of the configuration space M can be characterized by graphs. Two
graphs, gN,t and gN,f, are defined as follows. The vertex sets V(gN,t) and V(gN,f)
of these graphs are

    $V(g_{N,t}) = \{ M_{N,t,v},\ v = 1, 2, \dots, q \}$,    (64)

and

    $V(g_{N,f}) = \{ M_{N,f,t,v},\ t = 1, \dots, w,\ v = 1, 2, \dots, q \}$,    (65)

respectively, whereas the edge sets E(gN,t) and E(gN,f) of gN,t and gN,f,
respectively, are defined by the nonzero neighbor relations among the corresponding
vertices,

    $E(g_{N,t}) = \{ (M_{N,t,v}, M_{N,t,v'}) : n(M_{N,t,v}, M_{N,t,v'}) = 1,\ v, v' = 1, \dots, q,\ v \neq v' \}$,    (66)

and

    $E(g_{N,f}) = \{ (M_{N,f,t,v}, M_{N,f,t',v'}) : n(M_{N,f,t,v}, M_{N,f,t',v'}) = 1,\ t, t' = 1, \dots, w,\ v, v' = 1, \dots, q,\ \delta_{t,t'} \delta_{v,v'} \neq 1 \}$,    (67)

respectively.
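On a discretized configuration space the neighbor relation (63) and the graph construction of eqs. (64) and (66) can be sketched as follows. Domains are represented as sets of integer grid cells, and the closure is approximated by adding the face-adjacent cells; this discretization, the function names and the one-dimensional example are assumptions made for illustration.

```python
from itertools import combinations

def closure(cells):
    """Discrete stand-in for clos(A): the cells plus their face-adjacent neighbors."""
    closed = set(cells)
    for cell in cells:
        for axis in range(len(cell)):
            for step in (-1, 1):
                neighbour = list(cell)
                neighbour[axis] += step
                closed.add(tuple(neighbour))
    return closed

def neighbor(a, b):
    """Neighbor relation n(A,B) of eq. (63) on the grid."""
    return 1 if (set(a) & closure(b)) or (closure(a) & set(b)) else 0

def neighbor_graph(domains):
    """Edge set as in eq. (66): pairs of distinct domains with n = 1.

    domains : dict mapping a domain label v to its set of grid cells.
    """
    return {(p, q) for p, q in combinations(sorted(domains), 2)
            if neighbor(domains[p], domains[q]) == 1}

# Hypothetical one-dimensional configuration space split into three domains;
# cell (4,) is left unassigned so that domain 3 is isolated.
example_domains = {1: {(0,), (1,)}, 2: {(2,), (3,)}, 3: {(5,)}}
example_edges = neighbor_graph(example_domains)
```

Domains 1 and 2 share a boundary and are joined by an edge, while domain 3 has no neighbor in this example. Each undirected edge is stored once, with the smaller label first.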

These two graphs provide a detailed, global description of invariance domains, and
concise information on how various functional groups are interrelated, transformed or
carried through approximately intact during various chemical reactions.

Summary
After a review of the basic concepts of the topological shape analysis methods, a simple
technique is described for a discrete representation of molecular shapes by the
topological patterns of contour surfaces of three-dimensional nuclear potentials. This
technique is extended for modeling of shape changes in chemical processes. A family
of matrices of integer elements (and the corresponding graphs of integer labels) as well
as various shape invariance domains of the nuclear configuration space are introduced.
Formal reaction paths are characterized by the finite sequences of matrices occurring
along each path. Equivalence relations among formal reaction paths based on these
topological properties lead to a shape-based definition of reaction mechanisms.
Additional relations are specified for the shape characterization of chemical species. The
equivalence classes of these finite matrix sequences provide a shape-based description
of formal reaction mechanisms. Similarities between reaction mechanisms can be
studied by comparing their matrix sequences. These similarities can be quantified,
leading to numerical similarity measures of reaction mechanisms.

References

[1] Mezey, P.G. (1993) Shape in Chemistry: An Introduction to Molecular Shape
and Topology, VCH Publishers, New York.
[2] Parr, R.G. and Berk, A. (1981) "The Bare-Nuclear Potential as Harbinger for the
Electron Density in a Molecule", in P. Politzer and D.G. Truhlar (eds.), Chemical
Applications of Atomic and Molecular Electrostatic Potentials, Plenum, New
York, pp. 51-62.
[3] Mezey, P.G. (1987) Potential Energy Hypersurfaces, Elsevier, Amsterdam.
[4] Mezey, P.G. (1988) "Shape Group Studies of Molecular Similarity: Shape
Groups and Shape Graphs of Molecular Contour Surfaces", J. Math. Chem. 2,
299-323.
[5] Dubois, J.-E. and Mezey, P.G. (1992) "Relations Among Functional Groups
Within a Stoichiometry: a Nuclear Configuration Space Approach", Int. J.
Quantum Chem. 43, 647-658.
THE INVARIANCE OF MOLECULAR TOPOLOGY
IN CHEMICAL REACTIONS

EUGENY V. BABAEV
Moscow State University
Department of Organic Chemistry
Moscow 119899 Russia

1. Introduction
There are two different pictures of molecular structure: the classical and the quantum-
mechanical. The classical picture is naive-empirical and is the chemical one; it is
connected with classical structural formulae, ball-and-stick models, the phenomenological
Lewis concept and the Gillespie rules for prediction of molecular geometry. This picture
now endures as the heuristic instrument for the planning of chemical synthesis, for
communication between experimental chemists, and for chemical education. The
quantum-mechanical picture is the physical one; it is based on the application of quantum
mechanical ideas to molecular structures and on quantum-chemical calculations of different
degrees of sophistication. Many attempts have been made in theoretical chemistry to find
some symbiosis between these two different levels of description of molecular structure;
only in recent years does the desired compromise seem to have been found in the topological
nature of both the quantum-mechanical and classical models of molecular structure.
Topology is not just graph theory, and similarly chemical topology is not just the use of
a graph as an image of a molecular structure or chemical reaction [1a,2], as it is usually
considered [3]. One of the main ideas in classical topology [4-6] is to study spaces which can
be continuously deformed into one another, and to find the invariants of such spaces.
Some known chemical applications of these ideas (e.g., the topological invariants of
surfaces and their critical points) are used to describe electron density maps [1b] or potential
energy surfaces [1c]; some topological invariants of polyhedra are also used to
understand the electron-counting rules in the chemistry of clusters [7]. In the cited
approaches, the ideas of topology are applied to the quantum-chemical picture of
molecular structure. It seems that there is only one work [1d] devoted to the topological
description of classical structures and the electron-counting rules for ordinary molecules with
localized bonds.
It is the aim of this paper to introduce special spaces, two-dimensional manifolds or
surfaces, as new images of molecules with localized bonds, starting only from the classical
picture of the molecular structure. One can easily get these surfaces from graphs
corresponding to the usual Lewis diagrams of molecules. Some qualitative chemical
concepts, which are rather poorly formalized in the language of graph theory, seem to be
more clear from the point of view of surface topology. Moreover, because the topological
D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 209-220.
© 1994 Kluwer Academic Publishers.

invariants of the surfaces are based on the usual chemical electron-counting rules, it seems
that the general classical picture of molecular structures and reactions is closer to
manifold topology than to the graph-theoretical description. The suggested approach and
its further development seem to be a new branch of interaction between topology and
chemistry.

2. From a Lewis Diagram to the Pseudo-graph and Graphoid

Consider a Lewis diagram L(M) = (Z, N, {q_i}) of a molecule M with localized bonds [8a] and
with Z valence electrons and N atoms, where the i-th atom contains q_i valence electrons
(for the non-transition elements q_i coincides with their group number in the Periodic
System). For the given Lewis diagram the unique molecular pseudo-graph (a multi-graph
with loops [9]) G(M) = (V, R, {deg v_i}) can be found, where the number of vertices V is the
same as N, the number of edges R is equal to Z/2, the degree of the i-th vertex, deg
v_i, is equal to q_i, and any loop of the graph G(M) corresponds to a lone pair in the starting
diagram L(M) (Chart 1a). This definition (the importance of which has been discussed
earlier from different points of view [10-12]) connects the Euler equation for a (pseudo)graph [9]
with the valence electron count in a molecule:

$$\sum_{i=1}^{V} \deg v_i = 2R = \sum_{i=1}^{N} q_i = Z \qquad \text{(1a)}$$

(A somewhat similar definition of a molecular pseudo-graph has been independently used
by Kvasnička [13] in his graph-theoretical approach to organic reactions.)

If the starting molecule contains Z valence electrons and if L of them are unpaired, then
the corresponding topological image of a Lewis diagram is no longer a pseudo-graph.
Let us call a graphoid G'(M) = (V, R, L, {deg v_i}) the object which one gets from the
(pseudo)graph G(M) = (V+L, R+L, {deg v_i}) by deleting L free (terminal) vertices but
not the edges incident to them. Any graphoid has two sorts of edges, R usual edges and L
hemi-edges, as well as two sorts of vertices, V usual and L punctured, i.e., it has as its subset a
(V,R)-(pseudo)graph [14]. It is obvious that the usual edge of a graph in the topological sense
is homeomorphic to the closed interval [a,b], while the hemi-edge (without one vertex or
point) in G'(M) is homeomorphic to the half-open interval [a,b). In Chart 1b this
type of hemi-edge is shown as a line running from a vertex to infinity. Because each of the
L hemi-edges is counted only once in the sum of the degrees deg v_i, Eq. (1a) for open-shell
molecules and their graphoids should be written as Eq. (1b):

$$\sum_{i=1}^{V} \deg v_i = 2R + L \qquad \text{(1b)}$$
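The degree bookkeeping can be checked on small cases. The following is only a sketch under stated assumptions: the edge-list encoding (a loop written as a pair with equal ends, a hemi-edge as a single vertex) and the function name `degree_sum` are illustrative choices, not notation from the text. With R ordinary edges and L hemi-edges the degrees add up to 2R + L, which is the valence electron count Z implied by Eq. (2).

```python
from collections import Counter

def degree_sum(edges, hemi_edges=()):
    """Vertex degrees of a pseudograph or graphoid.

    edges      -- list of (u, v) pairs; u == v encodes a loop (a lone pair)
    hemi_edges -- vertices that carry an unpaired electron
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1          # a loop (u == v) therefore adds 2 to deg[u]
    for u in hemi_edges:
        deg[u] += 1          # a hemi-edge has only one end vertex
    return deg

# Water: two O-H bonds plus two lone pairs (loops) on oxygen, Z = 8.
water = [("O", "H1"), ("O", "H2"), ("O", "O"), ("O", "O")]
deg_w = degree_sum(water)

# Methyl radical: three C-H bonds plus one hemi-edge on carbon, Z = 7.
methyl = [("C", "H1"), ("C", "H2"), ("C", "H3")]
deg_m = degree_sum(methyl, hemi_edges=["C"])

print(deg_w["O"], sum(deg_w.values()))   # degree of O and total for water
print(deg_m["C"], sum(deg_m.values()))   # degree of C and total for methyl
```

In both cases the degree of the heavy atom equals its number of valence electrons q_i, as the definition requires.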

We want to mention that in both of the above equations the equality q_i = deg v_i for the
i-th atom is conserved [14]. This means that for any molecule which can be described by
more than one Lewis diagram, only one resonance structure (perhaps a non-octet one)
should be chosen to construct the pseudo-graph (graphoid) because of this equality. In the
case of charged molecules (as well as ylides or betaines) the charges should simply be
localized on the appropriate atoms and the necessary number of protons should be added
to or deleted from these nuclei to get a neutral isoelectronic species with the corresponding
change in the q_i and deg v_i values [11]. Thus, the isovalent molecules C2H5- and CH3OH2+,
as well as their isostere -BH3OH2+ and the ylide +NH3CH2-, which are isostructural to the
neutral CH3NH2 after this "charge annihilation," have isomorphic unlabeled pseudo-graphs
(Chart 1c).

Chart 1a. Molecules and their pseudographs.

Chart 1b. Open-shell molecules and their graphoids.

Chart 1c. Species with isomorphic pseudographs (e.g., +NH3CH2-, GeH3AsH2).

It is easy to get the cyclomatic number C for any connected pseudo-graph G(M) (see the
left-hand equality of Eq. (2)). All the loops and the independent cycles formed by multiple
edges are also included in the cyclomatic number [9]. For graphoids G'(M) the L hemi-edges
do not participate in any cycle; that is why one should cut them and calculate the C-value
by using the same equation for the (V,R)-subgraph of G'(M). In general, the
cyclomatic number for any Lewis diagram has a simple chemical sense as the sum of the
(independent) cycles, multiple bonds, and lone pairs, and is determined only by the
balance between the numbers of valence electrons, atoms, and unpaired electrons
(see the right-hand equality of Eq. (2)):

$$C = R - V + 1 = \tfrac{1}{2}(Z - L) - N + 1 \qquad \text{(2)}$$
3. From Graph (Graphoid) to Surface
Consider any (pseudo)graph or graphoid to be embedded in the real three-dimensional space R^3.
Let us add to every edge and vertex a very small volume of surrounding space. This operation
not only leaves the starting graph(oid) structure completely unchanged, but also adds
a new interesting property to the starting object: now a two-dimensional boundary exists
between the internal and external parts of the graph in R^3. Consider our graph to consist
of hollow rubber tubes (edges) which are also hollow at their junctions (i.e., at the
internal vertices), but which are closed at the usual terminal vertices and
open at the ends of the hemi-edges.

It is obvious that the resulting object is a two-dimensional manifold in R^3, or the
two-dimensional surface S(M) corresponding to the starting Lewis diagram L(M). By a simple
continuous deformation one can easily get some canonical form of this surface, e.g., a
sphere with C handles and L holes, or S(C,L) (see Chart 2a,b). This surface is orientable;
it is closed if L = 0 and open if L differs from zero. It is shown elsewhere [4-6] that the
pair (C,L) is sufficient to classify all non-homeomorphic orientable and connected
two-dimensional surfaces.

The connected two-dimensional surfaces S(C,L) can be described by their Euler characteristic χ,
which is one of the topological invariants, i.e., it is unchanged under topological
deformations [4-6]. It is not necessary to triangulate the surface to get the χ value: it depends
only on the number of holes L and handles C (see the left-hand equality of Eq. (3)) [4]. The
use of Eq. (2) shows that for the starting Lewis diagram L(M) its Euler characteristic χ
depends simply on the balance between N and Z (see the right-hand equality of Eq. (3)):

$$\chi = 2 - 2C - L = 2N - Z \qquad \text{(3)}$$
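Eqs. (2) and (3) can be evaluated directly from the electron count. The sketch below is illustrative only: the function name `invariants` and the hand-counted (Z, N, L) triples are assumptions introduced here, chosen to correspond to entries of Chart 2b.

```python
def invariants(Z, N, L=0):
    """Return (C, L, chi) for a connected Lewis diagram.

    Z -- number of valence electrons
    N -- number of atoms
    L -- number of unpaired electrons
    """
    if (Z - L) % 2:
        raise ValueError("Z - L must be even: paired electrons form the edges")
    C = (Z - L) // 2 - N + 1          # Eq. (2), right-hand equality
    chi = 2 - 2 * C - L               # Eq. (3), left-hand equality
    assert chi == 2 * N - Z           # Eq. (3), right-hand equality
    return C, L, chi

# (Z, N, L) counted by hand for some small species.
EXAMPLES = {
    "CH4": (8, 5, 0),
    "CH3+": (6, 4, 0),
    "CH3 radical": (7, 4, 1),
    "CH3-": (8, 4, 0),
    "CH2 triplet": (6, 3, 2),
    "C2H4": (12, 6, 0),
    "allyl radical": (17, 8, 1),
}

for name, (Z, N, L) in EXAMPLES.items():
    print(name, invariants(Z, N, L))
```

The internal assertion makes the function fail loudly if the two sides of Eq. (3) ever disagree, which is a convenient guard when the counts are entered by hand.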


4. What is the Topological Homeomorphism from the Chemical Point of View?

The resulting map L(Z, N, L, {q_i}) => G'(V, R, L, {deg v_i}) => S(C,L) partitions all
Lewis diagrams into equivalence classes according to the homeomorphism of their surfaces S(M).
Chart 2a. (Drawing not reproduced.)

Chart 2b. Structural formulae, surfaces, and invariants (graph(oid) and surface drawings not reproduced).

Structural formula    Surface               C    L    χ
CH4                   Sphere                0    0    2
CH3+                  Sphere                0    0    2
CH3·                  Hemisphere (plane)    0    1    1
:CH3-                 Torus                 1    0    0
:CH2 (s)              Torus                 1    0    0
·CH2· (t)             Cylinder              0    2    0
CH2=CH2               Torus                 1    0    0
·CH2-CH2·             Cylinder              0    2    0
CH2=CH-CH2·           Handle                1    1   -1

Topological homeomorphism in mathematics is a very crude property showing a similarity
of surfaces; nevertheless, this type of topological identity seems to be important
as a first step in understanding similarity in a geometrical sense. Following this
analogy, it is interesting to consider the homeomorphism of Lewis diagrams as a crude
test of the similarity of the structures and chemical behavior of the corresponding
molecules.

There are some general empirical types of chemical similarity of both organic and
inorganic molecules (see references 11 and 12) which are based on the usual isoelectronic
or π-isoelectronic analogies, isostructural and homologous series, etc., for example:

a) isovalent molecules differing only by the number of the period of any atom in
the molecule (e.g., CH3NH2 - SiH3NH2 - CH3PH2 - SiH3PH2 - GeH3AsH2, etc.);
b) isovalent molecules differing in charge (e.g., H3O+ - NH3 - CH3-);
c) isosteres (alkanes - borazines; CO - N2; CO2 - N2O, etc.);
d) any number of resonance structures;
e) all types of tautomers and isomers;
f) classical homologs, differing by one or more CH2 groups;
g) π-isoelectronic molecules with the same number of π-electrons (e.g., pyrrole -
benzene - borepine, or "pseudoazulenes": azulene - indolizine -
pyrrolo[1,2-a]imidazole);
h) π-isoelectronic molecules with the same number of π-electrons differing in
charge (cyclopentadienyl anion - benzene - tropylium cation);
i) members of isostructural series of boron hydrides differing in the BH fragment
(isostructural closo-, nido-, or arachno- series, see reference 8b).

All the members of each of these series a) - i) have topologically identical (homeo-
morphic) Lewis diagrams.

It should be mentioned that the homeomorphism in the series a) - e) follows simply from our
definition of L(M), G'(M), and S(M) (Chart 1c), while the topological identity of the
molecules in the series f) to i) (differing by the well-known homologous fragments -CH2-,
-BH-, and -CH+-) shows that the concept of homeomorphism is a very natural and
reasonable one for further chemical applications.
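One way to see why the series f) to i) are homeomorphic is that each homologous fragment adds as many valence electrons as twice its number of atoms, so χ = 2N - Z from Eq. (3) is left unchanged. The following sketch checks this; the `VALENCE` table, the dictionary encoding of formulae, and the function name `chi` are assumptions made for illustration.

```python
# Group valence electron counts for the elements used below.
VALENCE = {"H": 1, "B": 3, "C": 4, "N": 5, "O": 6}

def chi(atoms, charge=0):
    """Euler characteristic 2N - Z (right-hand equality of Eq. (3)).

    atoms  -- dict mapping element symbol to atom count
    charge -- net charge; a positive charge removes valence electrons
    """
    N = sum(atoms.values())
    Z = sum(VALENCE[el] * n for el, n in atoms.items()) - charge
    return 2 * N - Z

# Contribution of each homologous fragment to chi.
fragments = {
    "-CH2-": chi({"C": 1, "H": 2}),
    "-BH-": chi({"B": 1, "H": 1}),
    "-CH+-": chi({"C": 1, "H": 1}, charge=+1),
}
print(fragments)

# Alkane homologs and isovalent species differing in charge.
alkanes = [chi({"C": n, "H": 2 * n + 2}) for n in (1, 2, 3)]
isovalent = [
    chi({"O": 1, "H": 3}, charge=+1),   # H3O+
    chi({"N": 1, "H": 3}),              # NH3
    chi({"C": 1, "H": 3}, charge=-1),   # CH3-
]
print(alkanes, isovalent)
```

Because the number of unpaired electrons L is also unchanged within each series, equal χ implies equal C, and hence the same surface S(C,L).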

It is a well-known phenomenon in mathematical chemistry that some properties of
molecules are very similar when the topological indices of their molecular graphs (e.g.,
the Randic, Hosoya, or Wiener number) coincide [1,2,15]. In accordance with this rule, χ
should be considered as a global (in comparison with the other indices) index of the
molecular structure. Its degeneracy [or the topological identity of S(M)] corresponds to
some global chemical similarity of the molecules. This is the case for the examples of the
series a) - i) mentioned above. Another example is the well-known empirical chemical
analogy between a lone pair, a double bond, and 3- or 4-membered cycles [16]; this fact should
correspond to the topological homeomorphism of these structural fragments to a torus.

On the other hand, the difference in the χ-value [or in the genus of the closed surfaces
S(M), i.e., the number of handles C for the non-radicals] permits us to classify the
non-homeomorphic types of molecules in a linear order, as is usual for orientable surfaces in
topology [4-6]. The simple chemical sense of the C-number is clear: it is a generalization
to any inorganic compound of the common organic-chemistry idea of the degree of
saturation. For any homologous CnH2n+x series, x is simply its Euler characteristic; any
molecule could be made more saturated, not only by hydrogenation, but also by coordination
with an electrophile.

5. The Invariance of the Euler Characteristic in Chemical Reactions

In the classical Lewis concept of the two-electron, two-center bond there are only
two ways to form or to break a bond: the homolytic and the heterolytic. In the
simplest case the hydrogen molecule [for which S(M) is a sphere, χ = 2] could be formed
from two atoms (each of which is homeomorphic to a sphere without a point, or to a
hemisphere, χ = 1) or from a proton and a hydride ion (a sphere, χ = 2, plus a torus,
χ = 0). From the surface topology point of view this means gluing the surfaces into a sphere
in the following manner: gluing together the one-dimensional boundary cycles of the
hemispheres in the first case, or gluing the sphere into the hole of the handle in the second
case. It is important that in both operations the Euler characteristic χ is an additive
quantity. Other examples also confirm this observation (Chart 3). This principle can be
generalized as the Main Theorem.
6. The Main Theorem
The total Euler characteristic of the Lewis diagrams with localized bonds stays
unchanged in chemical reactions.

6.1. Proof
Consider an ensemble of K_s molecules (N_s, Z_s, and L_s are the total numbers of atoms,
valence electrons, and unpaired electrons in the ensemble) which transforms during the
chemical reaction into a new ensemble of K_f molecules (where s and f indicate starting
and final) with corresponding values N_f, Z_f, and L_f. The non-connected graphs (graphoids)
with K_s (K_f) components and corresponding V_s and R_s (V_f and R_f) are determined as
described above for the Lewis diagrams of the starting and final ensembles according
to Eqs. (1a) and (1b).

For any non-connected graph with K components, Eq. (2) should be changed to Eq. (4)
[see the left-hand equality of Eq. (4)], and after mapping from the graph to the surface
with K components [4-6] the left-hand equality of Eq. (3) should be changed to Eq. (5):

$$C = R - V + K = \tfrac{1}{2}(Z - L) - N + K \qquad \text{(4)}$$

$$\chi = 2K - 2C - L \qquad \text{(5)}$$

The resulting Euler characteristic χ for the ensemble of molecules, after the
combination of Eq. (5) with Eq. (4), is equivalent to the right-hand equality of Eq. (3):

$$\chi = 2K - 2C - L = 2K - 2\left[\tfrac{1}{2}(Z - L) - N + K\right] - L = 2K - Z + L + 2N - 2K - L = 2N - Z$$
Chart 3. Examples of bond breaking/formation (heterolytic and homolytic) and the corresponding
gluing of surfaces. Euler characteristic: χ = 2 + 0 = 2 = 1 + 1.

Comparing the values χ_s and χ_f for the starting and final ensembles of molecules one
can easily get Eq. (6):

$$\Delta\chi = \chi_f - \chi_s = 2(N_f - N_s) - (Z_f - Z_s) \qquad \text{(6)}$$

which is equal to zero due to the conservation of the valence electrons and the atoms in
chemical reactions. Thus, the Main Theorem is proved.
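The theorem can be checked numerically on a few balanced reactions. This is a minimal sketch, not part of the original proof: the tuple encoding (Z, N, L) of each species, the helper names, and the choice of reactions are assumptions. The ensemble value is computed from Eq. (5) and cross-checked against 2N - Z.

```python
def handles(Z, N, L):
    """Cyclomatic number C of one connected species, Eq. (2)."""
    return (Z - L) // 2 - N + 1

def ensemble_chi(species):
    """Euler characteristic of an ensemble via Eq. (5): 2K - 2C - L."""
    K = len(species)
    C = sum(handles(Z, N, L) for Z, N, L in species)
    L_total = sum(L for _, _, L in species)
    chi = 2 * K - 2 * C - L_total
    # Cross-check against the electron count, chi = 2N - Z.
    assert chi == sum(2 * N - Z for Z, N, _ in species)
    return chi

# Each species is (Z, N, L), counted by hand.
H_PLUS, H_MINUS, H_ATOM, H2 = (0, 1, 0), (2, 1, 0), (1, 1, 1), (2, 2, 0)
CH3_PLUS, CH4 = (6, 4, 0), (8, 5, 0)
C2H4, C4H8 = (12, 6, 0), (24, 12, 0)

REACTIONS = {
    "H+ + H- -> H2": ([H_PLUS, H_MINUS], [H2]),
    "2 H -> H2": ([H_ATOM, H_ATOM], [H2]),
    "CH3+ + H- -> CH4": ([CH3_PLUS, H_MINUS], [CH4]),
    "2 C2H4 -> cyclobutane": ([C2H4, C2H4], [C4H8]),
}

deltas = {
    name: ensemble_chi(products) - ensemble_chi(reactants)
    for name, (reactants, products) in REACTIONS.items()
}
print(deltas)
```

Every difference is zero because each reaction conserves both the atoms and the valence electrons, exactly as Eq. (6) requires.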
6.2 Discussion
The principle corresponding to the Main Theorem we call the conservation of molecular
topology in chemical reactions. It is of interest that the conservation of the purely
topological property χ in classical chemistry follows from the conservation of N and Z,
i.e., from the physical conservation of matter and charge. One can say that an imaginary
space of classical chemical structures is mapped onto itself during chemical reactions.
The invariance of χ does not depend on changes in the number of molecules
(ΔK), in the number of unpaired electrons (ΔL), or in the sum of the lone pairs, multiple
bonds, and cycles (the degree of saturation, ΔC) taken alone. Because all the members of the
triad (C,L,K) are topological invariants in surface topology, the combination of Eqs. (5) and
(6) gives Eq. (7), which has an important chemical consequence:

$$\Delta\chi = 2\,\Delta K - 2\,\Delta C - \Delta L = 0 \qquad \text{(7)}$$

It follows from Eq. (7) that only five types of interconversions of the topological invariants
(K,C,L) are permitted in chemical reactions for molecules with localized bonds:

$$\Delta L = 2\,\Delta K \qquad \text{(7a)}$$

$$\Delta C = \Delta K \qquad \text{(7b)}$$

$$\Delta L = -2\,\Delta C \qquad \text{(7c)}$$

$$\Delta L = 2(\Delta K - \Delta C) \qquad \text{(7d)}$$

$$\Delta K = \Delta L = \Delta C = 0 \qquad \text{(7e)}$$

(where Δ corresponds to the difference between the final and starting parameters). All the
possible types are shown symbolically in Chart 4.
All of Eqs. (7a) - (7e) follow simply from Eq. (7): when one member of the triad
(C,L,K) is conserved in a reaction, the two others interconvert according to Eq. (7); the
conservation of exactly two parameters (i.e., the sudden appearance or disappearance of
only one invariant) is impossible in chemistry. For instance, a handle (i.e., a lone pair,
double bond, or cycle) can appear in a chemical reaction from any of the following:
a) the simultaneous disappearance of exactly two holes [see Eq. (7c)], corresponding
in chemistry to intramolecular radical recombination, including triplet-singlet
transformations of biradicals,
b) the simultaneous appearance of a new component [see Eq. (7b)], e.g., conversion
of alkanes to cycloalkanes,
c) corresponding changes in the numbers of holes and components [see Eq. (7d)],
e.g., formation of cyclobutane from two triplet ethylenes,
d) the disappearance of another handle; the appearance of a handle from nothing is
forbidden. A good example is the well-known cycle-chain tautomerism: it only
seems that a cycle is built from a chain. The cycle which is usually formed,
e.g., by an electrophile-nucleophile interaction, already existed in the
"chain" as a lone pair (i.e., a hidden cycle) on the nucleophilic center.

Chart 4. Interconverted and conserved invariants (surface drawings not reproduced).

Interconverted    Conserved     Balance    Chemical equation
invariants        invariants    Eqn.       (schematic)
K, L              C             (7a)       X· + ·Y = X-Y
K, C              L             (7b)       X(-) + Y(+) = X-Y
L, C              K             (7c)       ·X Y· = X-Y
K, L, C           none          (7d)       ·X ·A, ·Y ·B = X-A, Y-B
none              K, L, C       (7e)       X(-) Y(+) = X-Y

The suggested five types of conservation and interconversion of the topological
invariants are good starting points for a further topological classification of chemical
reactions. Each type can be subdivided into different classes, e.g., according to the
redistribution of the invariants between different surfaces, the size of the cycles,
etc.
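The five types can be told apart mechanically from the changes (ΔK, ΔC, ΔL). The sketch below is one possible reading of Eqs. (7) and (7a)-(7e), assuming that each type is identified by which invariants stay constant; the function name `classify` and the example triples are illustrative and not taken from the text.

```python
def classify(dK, dC, dL):
    """Assign a change (dK, dC, dL) to one of the types (7a)-(7e).

    Returns "forbidden" when the balance of Eq. (7) is violated.
    """
    if 2 * dK - 2 * dC - dL != 0:
        return "forbidden"
    if dK == dC == dL == 0:
        return "7e"            # all three invariants conserved
    if dC == 0:
        return "7a"            # K and L interconvert, dL = 2 dK
    if dL == 0:
        return "7b"            # K and C interconvert, dC = dK
    if dK == 0:
        return "7c"            # L and C interconvert, dL = -2 dC
    return "7d"                # all three change, dL = 2 (dK - dC)

examples = {
    "radical recombination X. + .Y -> X-Y": (-1, 0, -2),
    "heterolytic bond formation X- + Y+ -> X-Y": (-1, -1, 0),
    "ring closure of a biradical": (0, 1, -2),
    "two triplet ethylenes -> cyclobutane": (-1, 1, -4),
    "cycle-chain tautomerism": (0, 0, 0),
    "a handle from nothing": (0, 1, 0),
}
types = {name: classify(*delta) for name, delta in examples.items()}
for name, kind in types.items():
    print(f"{name}: {kind}")
```

A change of exactly one invariant can never satisfy Eq. (7), so such triples are always reported as forbidden.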

7. Conclusion

The novel approach discussed here could be considered as the first step in our program of
"topologization of chemistry" starting from a classical, and not a quantum-mechanical, point
of view. This gives physicists the possibility to better understand the logic of classical
chemistry; chemists, to prove once more that chemistry is not only a descriptive
science but also an exact science; and mathematicians, to find new fields of
application. In our further communications we intend to apply some other ideas of
manifold topology (fundamental and homology groups, topological images of hypergraphs,
etc.) to other classical concepts of chemistry (localization and delocalization, conjugation
and hyperconjugation, π-rich and π-deficient molecules).

8. Acknowledgements

I thank my colleagues, chemists at the Lomonosov University in Moscow, for the third
Lomonosov award, which was awarded for this work. I also thank topologists Professor
A. T. Fomenko (Moscow, Russia), Professor H. Zieschang (Bochum, BRD) and physicist
Professor R. Hefferlin (Collegedale, TN, USA) for fruitful discussions.

9. References and Notes

1. a) R.B. King, ed., Chemical Applications of Topology and Graph Theory;
Studies in Physical and Theoretical Chemistry, Vol. 28 (Elsevier, Amsterdam,
1983); references are from the Russian translation (Mir Publ., Moscow, 1987); b)
R.F.W. Bader, p. 54 in ref. 1a; c) P. Mezey, p. 91 in ref. 1a; d) M.J. McGlinchey,
Y. Tal, p. 148 in ref. 1a.
2. A. T. Balaban, ed., Chemical Applications of Graph Theory (Academic Press,
London, 1976).
3. Book Review: P. Mezey, J. Comput. Chem., 12(1991)139 about R. E. Merrifield,
H. E. Simmons, Topological Methods in Chemistry (Wiley Interscience, New York,
1989).
4. H. Seifert, W. Threlfall, A Textbook of Topology, Pure and Appl. Math. Ser., 1980.
5. P. J. Giblin, Graphs, Surfaces and Homology: An Introduction to Algebraic
Topology, 2nd edn. (Chapman and Hall, New York, London, 1981).
6. A. Mishchenko, A. Fomenko, A Course of Differential Geometry and Topology
(Transl. from Russ., Mir Publ., Moscow, 1988).
7. D. M. P. Mingos, R.L. Johnston, in Structure and Bonding, Vol. 68
(Springer-Verlag, Berlin, Heidelberg, 1987), p. 29; R. B. King, D.H. Rouvray, J. Amer.
Chem. Soc. 99(1977)7834; R. B. King, Inorg. Chim. Acta, 116(1986)99; B.K.
Teo, Inorg. Chem., 23(1984)1251.
8. a) The models discussed are only molecules with localized bonds. The main idea
of the approach is to find topological properties [see for example Eq. (3)] which are
independent of the pictorial representation of molecular structure by a graph and
which are determined only by the electron count in the molecule, with the aim of
further generalization of the approach to delocalized systems.
b) The empirical classification of the family of boron hydrides into closo-,
nido-, and arachno- structural types is connected with electron-counting rules (see ref. 7);
it is impossible to construct connected pseudographs of this series and to compare
their topology with those for molecules with localized bonds. The use of Eq. (3)
opens this possibility; the corresponding χ values are 2, 4, and 6.
9. F. Harary, Graph Theory (Addison-Wesley, Reading, MA, 1969).
10. a) E. V. Babaev in Proc. of Conference of Young Scientists (Moscow Univ., 1986)
p. 154 (Russ. Ref. Journal 6B(1987)1107); b) E. V. Babaev in Proceed. of 7th
All-Union Conference: Computers in Chemical Research (Riga, 1986), p. 210 [Russian
Ref. Journal 7A(1987)36].
11. E. V. Babaev in Principles of Symmetry and Systemology in Chemistry, N. F.
Stepanov, ed. (Moscow University, 1987) [Chem. Abstr. 109, No. 27902b].
12. E. V. Babaev in History and Methodology of Natural Science: Vol. 35.
Philosophical Problems of Chemistry, A. P. Rudenko, ed. (Moscow Univ. Publ., 1988)
p. 121-140.
13. V. Kvasnička, Coll. Czech. Chem. Commun., 48(1983)2097.
14. The values V, R and {deg v_i} are topological invariants of the (pseudo)graph and the
corresponding values Z, N and {q_i} are the chemical invariants. It is impossible,
staying only with the concept of classical graphs, to conserve Eq. (1a), which
connects these two sets of invariants, for open-shell molecules. In some approaches (e.g.,
N. S. Zefirov, S. S. Tratch, G. A. Gamiani, Zh. Org. Khim. 22(1986)1341) an
unpaired electron is considered as a label (phantom-centre) on an additional
terminal vertex, i.e., Eq. (1a) is violated.
15. M. I. Stankevitch, I. V. Stankevitch, N. S. Zefirov, Uspekhi Khimii (Russ.) 57, vol.
3(1977)337.
16. The simplest example of the analogy between lone pairs, double bonds, and small
strained cycles is the well-known electrophilic addition reaction, e.g., protonation.
We mention that during protonation all the starting structural fragments,
homeomorphic to a torus, are transformed to cations, homeomorphic to a sphere.
We mention also that this analogy holds for other strained cycles in
polycyclic systems, but for larger cycles this analogy has only theoretical interest.
TOPOLOGICAL INDICES AND CHEMICAL REACTIVITY

OVANES MEKENYAN (1) and SUBHASH C. BASAK (2)

(1) Higher Institute of Chemical Technology, 8010 Burgas, Bulgaria
(2) Natural Resources Research Institute,
University of Minnesota, Duluth, MN 55811

1. Introduction

Chemical reactivity can be defined as the ability of molecular structures to take part
in electronic rearrangement processes during chemical interactions. As electronic
processes, one can consider hard (charge-charge) and soft (charge-transfer) electronic
interactions, as suggested by Klopman and Hudson in their polyelectronic perturbation
theory [1-3], as well as weaker interactions such as dipole-dipole and hydrogen bonding
effects, which can be considered as particular cases of the above two main types of
electronic rearrangements. Reactivity determines the interaction of molecules with other
chemical species in their environment. For example, the ability of chemicals to take part
in charge-charge electrostatic interactions modulates their hard electrophilicity and
conditions the nature and extent of alkylation of nucleophiles by electrophiles. The ability
to take part in polar-polar (dipole-dipole) and hydrogen bonding interactions affects the
behavior of a solute in a solvent and the partitioning of molecules between different phases.
Such consequences of reactivity for the properties of chemical species may be termed primary
effects. On the other hand, for many physiologically active molecules, the primary effects
have important biological consequences, determining their interactions with critical target
biomolecules. The latter can be specified as secondary effects. Thus, carcinogenicity and
mutagenicity of chemicals are believed to be due to the alkylation of critical
biomacromolecules by the chemicals themselves or by their reactive metabolites produced in
vivo.

Chemists have been interested in discerning the structural factors underlying molecular
reactivity. The relationship of molecular topology to chemical reactivity is of interest for
both theoreticians and experimentalists. The quantifiers of molecular topology (e.g.,
topological indices) have been useful as reactivity parameters for many classes of
chemicals such as acyclic hydrocarbons, alkyl benzenes, benzenoid hydrocarbons, etc. The
focus of the present work is to discuss the basic principles underlying the topological
foundation of molecular reactivity, to give a comprehensive account of topological
invariants which can serve as reactivity indices, and to demonstrate the applicability of
some of these topological parameters.

2. Basic Principles Underlying the Topological Nature of Chemical Reactivity

The basic concept which determines the topological conditioning of reactivity is the first
principle of organic chemistry, viz., the principle of Molecular Structure. According to
D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 221-239.
© 1994 Kluwer Academic Publishers.

this principle, molecules are considered as isolated objects, possessing a relatively rigid
and permanent location of nuclei. Hence, they are assumed to have a structure, which
conditions their physical and chemical properties. As a consequence of this principle, it
is assumed that molecular structure can be adequately described [4,5]. Three components
of molecular structure can be distinguished [6]: topology, metric, and electronic
distribution.

Molecular topology is defined only by the binary relation between atoms in a molecule
determining whether they are bonded or not [7-9]. This relationship is usually termed
molecular connectivity, and it can be derived from so-called molecular graphs [9-11].
Simple chemical graphs are mathematical structures, where the nature of atoms and type
of bonds is neglected. They can be constructed by depicting each atom by a vertex and
connecting a pair of vertices by an edge when the corresponding atoms are bonded in the
constitutional formula. Usually, in this mathematical representation of molecules, the
hydrogen atoms are neglected, thus arriving at the respective skeletal graphs [12].
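The construction of a skeletal graph can be sketched in a few lines. The input format (a list of element symbols plus a list of bonded index pairs) and the function name `skeletal_graph` are assumptions made for this illustration; atom types and bond orders are deliberately discarded, as the text describes.

```python
def skeletal_graph(atoms, bonds):
    """Hydrogen-suppressed simple graph of a molecule.

    atoms -- list of element symbols, indexed by atom number
    bonds -- list of (i, j) index pairs for bonded atoms
    Returns (heavy, adj): the indices of the non-hydrogen atoms and the
    adjacency matrix of the skeletal graph over those atoms.
    """
    heavy = [i for i, el in enumerate(atoms) if el != "H"]
    index = {a: k for k, a in enumerate(heavy)}
    n = len(heavy)
    adj = [[0] * n for _ in range(n)]
    for i, j in bonds:
        if i in index and j in index:
            a, b = index[i], index[j]
            adj[a][b] = adj[b][a] = 1   # bond order and atom type are ignored
    return heavy, adj

# Ethanol, CH3-CH2-OH, with explicit hydrogens in the input.
atoms = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
bonds = [(0, 1), (1, 2), (0, 3), (0, 4), (0, 5), (1, 6), (1, 7), (2, 8)]
heavy, adj = skeletal_graph(atoms, bonds)
print([atoms[i] for i in heavy])
print(adj)
```

For ethanol the result is a three-vertex path, the same skeletal graph as that of propane, which illustrates how much chemical detail a simple graph leaves out.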

For many classes of compounds, the variations of molecular metric (bond lengths, valence
and torsional angles) and electronic structure are small (e.g., planar, homonuclear
systems). Provided the impact of these factors on many of the molecular properties can
be neglected, the latter may be considered as conditioned by topology only. Still, some
properties of such compounds are topology-invariant and are strongly conditioned by the
non-topological structural characteristics. For example, the tendency toward delocalized
π-electron density within a symmetric hexagonal σ-framework is primarily due to the steric
constraint. These facts are supported by the assumption of a relative orthogonality of
topological and non-topological structural parameters with respect to the molecular
properties of the compounds considered.

The other fundamental principle conditioning the topology/reactivity relationships is the
Analogy concept. According to the Hammett formulation [13] underlying the Linear Free
Energy Relationship (LFER), "like substances react similarly and that similar changes in
structure produce similar changes in reactivity." The "structure" in Hammett's definition
of the Analogy principle can be replaced by "topology" for the compounds and properties
where steric and electronic factors play a subordinate role as compared to molecular
topology (e.g., series of compounds with no heteroatoms, as well as heterosystems with
constant position of heteroatom in the reference structure). Based on the Analogy
principle within a group of structurally similar compounds, one can infer quantitatively
the property of a particular compound from those of any other member. For this purpose
the so-called "Quantitative Analogy Models" were introduced [14]. Particular types of
these models are LFERs, where "free energy" implies either activation energy, ΔE#, or
energy of reaction, ΔE. For example, LFERs for the reaction equilibrium constant:

$$K = \exp[-(\Delta E - T\Delta S)] \qquad \text{(1)}$$

can be derived for a reaction series of structurally similar compounds (assuming
practically no change in the entropy of reaction, ΔS). At a constant temperature:

(2)

where K1 and K2 are the reaction constants for two members of the reaction series, while
ΔE1 and ΔE2 are the respective reaction energies.

Similarly [15,16], one can derive LFERs for reaction rate constants:

(3)

where k1 and k2 stand for the rate constants of two members of the reaction series, while
ΔE1# and ΔE2# are the respective activation energies.

Apparently, in the Hammett equation:

(4)

the difference of the activation (reaction) energies of the substituted (X) and reference
(H) structures of the series under investigation is proportional to the variation of the
electronic structure of the reference molecule in a reference reaction series after
introduction of substituent X, as described by the σ_X parameter. The proportionality constant
(reaction constant), ρ, reflects the specificity of the reaction studied as well as the
conditions of the reaction.

The above cases can be generalized by the following equation:

(5)

where R1 and R2 stand for the values of the reactivity property of two members of the reaction
series (such as reaction rates or equilibrium constants). Their ratio can be modeled as a
product of two relatively independent variables: the external parameter, EP, conditioned
by non-structural factors (such as reaction conditions, specificity, etc.), and the difference
of the respective structural parameters, SP. Usually in LFERs the impact of the structural
variation on reactivity is analyzed at constant values of the external factors (EP = const).
If one considers reactivity in a broader sense, including primary effects determined by
polar-polar or hydrogen bonding interactions, it is possible to relate the change of the
structural parameters to the variation of molecular properties such as partition coefficients,
retention data, etc.

For properties determined predominantly by molecular topology, the above equation can
be written in the following form:

(6)
where $TI_1$ and $TI_2$ are quantitative indices characterizing the topological structures of the
two members of the reaction series under consideration. The nature of the topological
indices will be discussed in detail in the next part of the work.

3. Molecular Topology and Topological Invariants

The topological indices are numerical quantities derived from molecular graphs
representing molecules. Such graphs could be hydrogen-filled or hydrogen-suppressed.
Sometimes weighted graphs, multigraphs or weighted pseudographs are used to represent
the relevant aspects of the chemical species [9]. First, the graph is transformed into a
more convenient mathematical representation, such as the adjacency and
distance matrices, characteristic and distance polynomials, etc. These mathematical
structures are then transformed by different algorithms in order to derive topological
indices (TIs), incorporating in a concise way the topological information of the respective
chemical species (see Fig. 1).

Fig. 1. General scheme of deriving topological indices: molecule → graph → mathematical representation → (algorithm) → topological indices

The algorithms transforming the mathematical representations of molecular graphs into
topological indices can be divided conventionally into three groups: simple, combinatorial,
and complicated. Into the first group one can include algorithms performing simple
functions on matrix elements or polynomial coefficients such as counting, summing,
multiplying, squaring, etc. The second group includes algorithms performing, additionally,
a combinatorial analysis over the elements of graph representations. The algorithms of
the third group are based on complicated transformations (diagonalization) of the graph
matrix representations.

Next we present the topological indices most frequently used in structure-property
analysis, which can be obtained by the above three groups of algorithms (see refs. 9, 17-20
for more details):

3.1. TOPOLOGICAL INDICES OBTAINED BY THE FIRST GROUP ALGORITHMS

Neighborhood Relationships. The set of the vertices in the chemical graph can be
classified according to their degrees as well as the degrees of their first neighbors. This
classification appears to be useful for reactivity purposes [21].

Total Adjacency, A, is the sum of the matrix elements, $a_{ij}$, of the N×N graph adjacency
matrix [22,23]:

$$A = \sum_{i=1}^{N}\sum_{j=1}^{N} a_{ij} \tag{7}$$

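Zagreb Group Indices [24,25] are also obtained by simple functions of the adjacency matrix
elements:

$$M_1 = \sum_{i} v_i^{2}; \qquad M_2 = \sum_{\text{all edges}} v_i v_j \tag{8}$$

where $v_i = \sum_j a_{ij}$ is the degree of vertex i.

Randić Connectivity Index [26]:

$$\chi = \sum_{\text{all edges}} (v_i v_j)^{-1/2} \tag{9}$$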
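
The degree-based indices defined so far can be computed directly from an edge list. The sketch below is illustrative only: the function names and the choice of 2-methylbutane as a test molecule (hydrogen-suppressed graph, arbitrary vertex numbering) are assumptions made for this example.

```python
from math import sqrt

def degrees(n_vertices, edges):
    """Vertex degrees v_i, i.e. the row sums of the adjacency matrix."""
    deg = [0] * n_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def simple_indices(n_vertices, edges):
    """Total adjacency A, Zagreb indices M1 and M2, and the Randic index chi."""
    v = degrees(n_vertices, edges)
    return {
        "A": sum(v),                                  # sum of all a_ij
        "M1": sum(d * d for d in v),                  # sum of squared degrees
        "M2": sum(v[i] * v[j] for i, j in edges),     # sum over edges of v_i * v_j
        "chi": sum(1.0 / sqrt(v[i] * v[j]) for i, j in edges),
    }

# Hydrogen-suppressed graph of 2-methylbutane.
ISOPENTANE = [(0, 1), (1, 2), (2, 3), (1, 4)]
indices = simple_indices(5, ISOPENTANE)
print(indices)
```

Only the vertex degrees are needed, which is why these indices belong to the first (simple) group of algorithms.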

In the Generalized Connectivity Index [27,28], the summation is extended over all
possible paths of length h:

$${}^{h}\chi = \sum_{\text{paths}} (v_i v_j \cdots v_{h+1})^{-1/2} \tag{10}$$

whereas in the Valence Connectivity Index [29], the vertex degree $v_i$ is replaced by the
number of valence electrons of atom i diminished by the number of hydrogen atoms
attached to this atom, $\delta_i^{v}$.

The Wiener Index is defined [30] as the half-sum of the off-diagonal elements of the
distance matrix:

$$W = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} d_{ij} \tag{11}$$

Another index defined by the elements of the distance matrix is the Mean Square
Distance Topological Index [31]:

(12)

A Randić-type formula (eq. 9) was applied [31,32] to distance sums, $v_{D,i}$, instead of vertex
degrees, thus introducing the Average Distance Sum Connectivity Index, J:

$$J = \frac{q}{\mu + 1}\sum_{\text{all edges}} (v_{D,i}\, v_{D,j})^{-1/2} \tag{13}$$

where the distance sum is defined as the sum of all entries of the i-th row in the distance
matrix [33]:

$$v_{D,i} = \sum_{j=1}^{N} d_{ij} \tag{14}$$

The normalization term in eq. 13 is based on the cyclomatic number, $\mu$, and the number of
edges, q (q = N + $\mu$ - 1, for planar graphs).
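
The distance-based indices can be sketched in the same way. The example below assumes the usual form of the Average Distance Sum Connectivity Index, with the factor q/(μ + 1) in front of the edge sum, and obtains the distance matrix by breadth-first search; function names and the test molecule (2-methylbutane) are hypothetical choices for illustration.

```python
from collections import deque
from math import sqrt

def distance_matrix(n_vertices, edges):
    """Topological distance matrix obtained by breadth-first search."""
    adj = [[] for _ in range(n_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = []
    for start in range(n_vertices):
        row = [None] * n_vertices
        row[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if row[w] is None:
                    row[w] = row[u] + 1
                    queue.append(w)
        dist.append(row)
    return dist

def wiener_index(dist):
    """Half-sum of the off-diagonal distance matrix entries."""
    return sum(sum(row) for row in dist) // 2

def distance_sums(dist):
    """Row sums of the distance matrix."""
    return [sum(row) for row in dist]

def balaban_j(n_vertices, edges):
    """Average distance sum connectivity, assuming the q/(mu + 1) normalization."""
    s = distance_sums(distance_matrix(n_vertices, edges))
    q = len(edges)
    mu = q - n_vertices + 1          # cyclomatic number of a connected graph
    return q / (mu + 1) * sum(1.0 / sqrt(s[i] * s[j]) for i, j in edges)

ISOPENTANE = [(0, 1), (1, 2), (2, 3), (1, 4)]
D = distance_matrix(5, ISOPENTANE)
print(wiener_index(D), distance_sums(D), round(balaban_j(5, ISOPENTANE), 4))
```

The distance sums add up to twice the Wiener index, which gives a quick consistency check on any implementation.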
Analogously to $v_{D,i}$, another useful topological index is defined [34-36] by summing the
self-returning walks of length l, $SRW_i^{l}$, starting from point i and passing through other
vertices, k, without traversing one and the same bond twice in each step:
(15)

Apparently, the total number of self-returning walks of length 2 is twice the number of
edges (single bonds) in the graph, also termed the total adjacency, A:

$$SRW^{2} = \sum_{i} SRW_i^{2} = 2q = A \tag{16}$$

The atomic topological indices, $SRW_i^{l}$, were normalized by dividing by the total number
of such walks in the molecule [35,36]:

$$f_i^{l} = \frac{SRW_i^{l}}{\sum_i SRW_i^{l}} = \frac{SRW_i^{l}}{SRW^{l}} \tag{17}$$

With the increase in l, the $f_i^{l}$ values converge to a certain limit:

$$f_i = \lim_{l \gg 1} f_i^{l} \tag{18}$$
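
The convergence of the normalized walk counts can be checked numerically. The sketch below assumes that $SRW_i^{l}$ can be read as the i-th diagonal element of the l-th power of the adjacency matrix (the usual closed-walk count); the restriction on bond traversal mentioned in the text is not modelled. Even walk lengths are used because alternant (bipartite) graphs have no closed walks of odd length. All names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def srw(adjacency, length):
    """Closed-walk counts per vertex: diagonal of adjacency**length."""
    n = len(adjacency)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(length):
        power = mat_mul(power, adjacency)
    return [power[i][i] for i in range(n)]

def walk_fractions(adjacency, length):
    """Walk counts normalized by their total over all vertices."""
    walks = srw(adjacency, length)
    total = sum(walks)
    return [w / total for w in walks]

# Carbon skeleton of 1,3-butadiene (a path of four vertices).
BUTADIENE = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]

print(srw(BUTADIENE, 2))                 # vertex degrees
print(walk_fractions(BUTADIENE, 40))     # close to the limiting values
```

For this graph the limiting fractions coincide with the squared coefficients of the lowest Hückel molecular orbital, (5 - √5)/20 for the terminal atoms and (5 + √5)/20 for the inner ones.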

Other atomic topological indices can be derived from the atom orderings obtained
by molecular coding algorithms, such as the Morgan extended connectivity (EC) algorithm [37]
and the Hierarchically Ordered extended Connectivity (HOC) algorithm [38]. The Extended
Connectivity index, EC, is obtained by summing the connectivities (i.e. $SRW^{2}$) of the
nearest neighboring atoms, and this is repeated iteratively until a constant atom ordering
is obtained in two consecutive steps (Fig. 2). Alternatively, the HOC procedure does the
same in a hierarchical manner.

Fig. 2. Illustration of the atomic ordering as produced by the Morgan Extended Connectivity (EC) and the Balaban, Mekenyan and Bonchev Hierarchically Ordered Extended Connectivity (HOC) algorithms

The Normalized Extended Connectivity index, NEC, is introduced to avoid difficulties
with the different limits of convergence for the different atoms:

$$NEC_i = \lim_{l \gg 1}\left(\frac{EC_i^{l}}{\sum_i EC_i^{l}}\right)\sum_i EC_i^{1} \tag{19}$$

where $\sum_i EC_i^{1} = A = N$.

Recently Kier and Hall [39] have introduced the electrotopological state, $EST_i$, which is
calculated from the intrinsic state value of atom i, $I_i$, and the sum of loge state values
$(I_i - I_j)/r_{ij}^{2}$ vectored from atom i:

$$EST_i = I_i + \sum_j \frac{I_i - I_j}{r_{ij}^{2}} \tag{20}$$

Here, $I_j$ is the intrinsic state value for every other atom and $r_{ij}$ is the topological distance
within the loge in which i and j are terminal atoms. $I_i$ is defined by:

(21)

where $\delta_i^{v}$ is the number of valence electrons of atom i and $\delta_i$ is its vertex degree.

The topological indices based on information theory [40,41] can also be classified as
indices obtained by using the first group of algorithms. For a system having N elements
distributed into k classes of equivalence $N_1, N_2, \ldots, N_k$, a probability distribution
$P\{p_1, p_2, \ldots, p_k\}$ is constructed ($p_i = N_i/N$). The entropy of this distribution, calculated by
Shannon's formula [40]:

$$\bar{I} = -\sum_{i=1}^{k} p_i \log_2 p_i \tag{22}$$

is called information content. The approach can be applied to the entries of graph
representations, thus obtaining the information content of the structure, also called
Information Topological Indices [18]. If one uses the entries, $d_{ij}$, of the topological
distance matrix, the following Information Index based on Graph Distances can be
obtained [42]:

(23)

Analogously, the Vertex Distance Complexity Index, $v^{d}$, was introduced [43] by the equation:

(24)

Based on the distribution of the graph vertices according to the number of their first,
second, etc., neighbors, the Information Index on Neighborhood Symmetry was introduced
[44-46]. An appropriate set A of n elements is derived from a molecular graph G
depending on certain preselected criteria. On the basis of an equivalence relation defined
on A, the set A is partitioned into equivalence classes $A_i$ of order $n_i$ ($i = 1, 2, \ldots, h$;
$\sum_i n_i = n$). A probability scheme is then assigned to the set of equivalence classes:

$A_1, A_2, \ldots, A_h$
$p_1, p_2, \ldots, p_h$

where $p_i = n_i/n$, $n_i$ and n being the cardinalities of $A_i$ and A, respectively. The mean
information content (or complexity) of an element of A is defined by Shannon's [40]
relation:

$$IC = -\sum_{i=1}^{h} p_i \log_2 p_i \tag{25}$$

The logarithm is taken at base 2 for measuring the information content in bits. The total
complexity of the set A is then n times IC.

It is to be noted that the complexity of a real object or a model object is not uniquely
defined. While there could be more than one way of defining a model object
corresponding to the same chemical species, the complexity of the same model object
(chemical graph) may vary depending on the nature of the equivalence relation. In the
calculation of indices of neighborhood symmetry, two vertices u and v of a graph G are
said to be topologically equivalent if and only if for each neighboring vertex $u_i$
($i = 1, 2, \ldots, k$) of the vertex u there is a distinct neighboring vertex $v_i$ of the same
degree for the vertex v. If v is a vertex of the graph G, then the open r-sphere S(v,r) is
defined as the subset of V(G) consisting of all vertices $v_i$ such that $d(v,v_i) < r$. Obviously,
$S(v,0) = \emptyset$, $S(v,r) = \{v\}$ for $0 < r < 1$, and $S(v,r) = \{v\} \cup \Gamma(v) = N^{1}(v)$, the vertex v
together with its first neighbors $\Gamma(v)$, for $1 < r < 2$. One
can construct open r-spheres of each vertex of G for all integral values of r, $0 \le r \le \rho$.
For a particular value of r the collection of all such open spheres S(v,r), where v runs
over the entire vertex set V, forms a neighborhood system of the vertices of G. A
suitably defined equivalence relation can then partition V into disjoint subsets based on
the equivalence of nature, connectedness, and bonding pattern of neighbors up to r-th order
neighborhoods. It is noteworthy that this approach incorporates the effects of distant
neighbors (i.e. neighbors of immediately bonded neighbors) on an atom or a reaction
center. After partitioning of the vertices for a particular order (r) of neighborhood, $IC_r$
is calculated by Shannon's formula. Subsequently, Basak, Roy and Ghosh [44] defined
another information-theoretic measure, structural information content ($SIC_r$), which is
calculated as:

$$SIC_r = \frac{IC_r}{\log_2 n} \tag{26}$$

where $IC_r$ is calculated as above and n is the total number of vertices of the graph.
Another information-theoretic invariant, complementary information content ($CIC_r$), was
defined as [45]:

$$CIC_r = \log_2 n - IC_r \tag{27}$$
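
Once the vertices have been partitioned into equivalence classes, the three information-theoretic measures follow from the class sizes alone. The sketch below assumes that the structural information content is IC divided by log2 n; the partition used in the example is hypothetical and does not correspond to a particular molecule or neighborhood order.

```python
from math import log2

def information_content(class_sizes):
    """Mean information content (bits per element) of a partition."""
    n = sum(class_sizes)
    return -sum((k / n) * log2(k / n) for k in class_sizes)

def structural_information_content(class_sizes):
    """IC normalized by its maximum possible value, log2(n); needs n > 1."""
    n = sum(class_sizes)
    return information_content(class_sizes) / log2(n)

def complementary_information_content(class_sizes):
    """Deficit of IC relative to its maximum possible value, log2(n)."""
    n = sum(class_sizes)
    return log2(n) - information_content(class_sizes)

# Hypothetical partition of five vertices into classes of sizes 2, 2 and 1.
sizes = [2, 2, 1]
ic = information_content(sizes)
sic = structural_information_content(sizes)
cic = complementary_information_content(sizes)
print(round(ic, 4), round(sic, 4), round(cic, 4))
```

When every vertex forms its own class, IC reaches log2 n, so SIC equals 1 and CIC equals 0.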

Recently, an information-topological index, E, called electropy was introduced [47,48]
based on the assumption that the molecule forms a finite space $\Gamma$ which is divided into
several partial bond spaces according to the electron pairings in the molecule. The
electropy is viewed as a measure of the degree of freedom of choice for electrons in
occupying different spaces in $\Gamma$ during the molecular formation. The following equation
from information theory (applied in cases of equal probabilities of the possible events
before, $P_0$, and after, $P_1$ ($P_1 = 1$), formation of the molecule) is used for the calculation of
E:

(28)

Here, one can consider $P_0$ as the total number of possible ways of distributing N particles
into k partial bond spaces with $N_i$ particles in the partial space i.

3.2. TOPOLOGICAL INDICES OBTAINED BY THE SECOND GROUP ALGORITHMS

A good example of a topological index derived by a combinatorial algorithm is the Hosoya
index, Z [49]:

$$Z = \sum_{k=0}^{[N/2]} p(G,k) \tag{29}$$

where p(G,k) is the number of ways in which k edges are chosen from the graph G so
that no two of them are adjacent; [N/2] in the Gauss square brackets is the largest integer
not exceeding the real number in them. By definition, p(G,0) = 1, while p(G,1) equals
the number q of edges in the graph. For acyclic graphs Z can be defined as the sum of
the absolute values of the coefficients in the characteristic polynomial, P(G,x).
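
The combinatorial character of the Hosoya index is easy to see in code. The following sketch enumerates all sets of mutually non-adjacent edges by a simple take-or-skip recursion; it is meant for small molecular graphs only, and the function names and test molecules are hypothetical.

```python
def matching_counts(edges):
    """Return [p(G,0), p(G,1), ...], the numbers of k-edge non-adjacent selections."""
    counts = [0] * (len(edges) + 1)

    def extend(position, used_vertices, k):
        if position == len(edges):
            counts[k] += 1
            return
        u, v = edges[position]
        extend(position + 1, used_vertices, k)                    # skip this edge
        if u not in used_vertices and v not in used_vertices:
            extend(position + 1, used_vertices | {u, v}, k + 1)   # take this edge

    extend(0, frozenset(), 0)
    while len(counts) > 1 and counts[-1] == 0:                    # drop empty tail
        counts.pop()
    return counts

def hosoya_index(edges):
    """Sum of p(G,k) over all k."""
    return sum(matching_counts(edges))

BUTANE = [(0, 1), (1, 2), (2, 3)]
ISOPENTANE = [(0, 1), (1, 2), (2, 3), (1, 4)]
print(matching_counts(BUTANE), hosoya_index(BUTANE))
print(matching_counts(ISOPENTANE), hosoya_index(ISOPENTANE))
```

The recursion visits every subset of edges in the worst case, so a production implementation would rather use the recurrence Z(G) = Z(G - e) + Z(G - u - v) with memoization.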

Herndon's structure count ratio [50] can also be considered as derived by a combinatorial
algorithm:

$$SCR = \frac{SC_{P(\text{or } I)}}{SC_R} \tag{30}$$

Here, $SC_R$ is the number of Kekulé structures of the unperturbed molecule and $SC_{P(\text{or } I)}$ that of
the transformation product, P, or rate controlling intermediate, I. It was found that ln SC
values are proportional to the resonance energy [51] and eventually to the stability of the
systems. A quicker way to obtain SCs is by summing the absolute values of the
unnormalized non-bonding molecular orbital (NBMO) coefficients, $c_{0j}$, of the alternant ion
or radical [52,53]. The latter can be determined by means of the zero-sum rule of
Longuet-Higgins [54]:

$$\sum_{j} A_{kj}\, c_{0j} = 0 \tag{31}$$

where the summation is over all vertices j joined to the vertex k, and $A_{kj}$ are the corresponding
non-zero entries of the k-th row of the adjacency matrix.

The first step of the procedure for counting of SC is to produce an odd alternant from the
even alternant by deleting one carbon atom and adjacent bonds from the even system.
Then the vertices of the odd alternant system are divided into two sets (starred and
unstarred) in such a way that no vertices from one and the same set are adjacent. One
set of atoms (starred) has zero coefficients in the NBMO. To the vertices of the
other set one assigns integers chosen in such a way that their sum around each starred
vertex is zero [54]. In fact, these simple integers are the unnormalized coefficients of
the NBMO (Fig. 3).

Fig. 3. The coefficients of the unnormalized NBMO of an odd alternant system obtained by the Longuet-Higgins "zero-sum rule"

A method for calculating SCRs of non-alternant systems (fluoranthenes) has also been
published [55].

Analogously, based on the coefficients of the NBMO, a topological index, $N_i$, was introduced
[52] to assess the relative reactivity of the different positions of an alternant hydrocarbon,
termed localization energy or reactivity number. The reactivity of a hydrocarbon at a
particular position is determined by a procedure similar to the one for deriving SCR. The
atom whose reactivity is to be determined is removed from the system together with its adjacent
bonds. The NBMO coefficients are determined for the resulting odd alternant system by
the Longuet-Higgins approach. The sum of the absolute values of the coefficients of the atoms neighboring
the removed one is obtained. Then this sum is normalized by the root of the sum of
squares of the unnormalized coefficients, thus producing the reactivity number, $N_i$. Thus, for
the β-position of anthracene analyzed in Fig.4, the respective value of $N_\beta$ is calculated
by the relationship:

$$N_\beta = 2(1\times\beta + 3\times\beta)/\sqrt{18} = 1.886 \tag{32}$$

Corrections are introduced here [56] for the resonance integral, β, according to the
branching conditions:

for bonds linking two vertices of degree 2;
for bonds linking vertices of degree 2 and 3;
for bonds linking two vertices of degree 3.
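
The reactivity-number procedure described above can be sketched as follows. This is an illustration under stated assumptions: all resonance integrals are taken as equal (the branching corrections are ignored), the NBMO is found as an exact null vector of the adjacency matrix of the odd alternant residue, which is equivalent to the zero-sum rule, and the residue is assumed to have exactly one NBMO. The function names and the naphthalene numbering are hypothetical.

```python
from fractions import Fraction
from math import sqrt

def null_vector(matrix):
    """One non-trivial exact solution of matrix * x = 0 (Gauss-Jordan elimination)."""
    rows = [row[:] for row in matrix]
    n_cols = len(rows[0])
    pivots, r = [], 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    pivot_cols = {c for _, c in pivots}
    free = [c for c in range(n_cols) if c not in pivot_cols][0]
    x = [Fraction(0)] * n_cols
    x[free] = Fraction(1)
    for row, col in pivots:
        x[col] = -rows[row][free]
    return x

def nbmo_coefficients(n_atoms, edges, removed):
    """Unnormalized NBMO coefficients after deleting atom `removed`."""
    kept = [a for a in range(n_atoms) if a != removed]
    pos = {atom: i for i, atom in enumerate(kept)}
    adjacency = [[Fraction(0)] * len(kept) for _ in kept]
    for u, v in edges:
        if removed not in (u, v):
            adjacency[pos[u]][pos[v]] = Fraction(1)
            adjacency[pos[v]][pos[u]] = Fraction(1)
    solution = null_vector(adjacency)
    return {atom: solution[pos[atom]] for atom in kept}

def reactivity_number(n_atoms, edges, site):
    """2 * (sum of |c| on the neighbours of `site`) / sqrt(sum of c**2), in units of beta."""
    coeffs = nbmo_coefficients(n_atoms, edges, site)
    neighbours = [v if u == site else u for u, v in edges if site in (u, v)]
    numerator = 2 * sum(abs(coeffs[a]) for a in neighbours)
    return float(numerator) / sqrt(float(sum(c * c for c in coeffs.values())))

# Naphthalene: atoms 0..3 = C1..C4, 4 = C4a, 5..8 = C5..C8, 9 = C8a.
NAPHTHALENE = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 9), (9, 0),
               (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
n_alpha = reactivity_number(10, NAPHTHALENE, 0)
n_beta = reactivity_number(10, NAPHTHALENE, 1)
print(round(n_alpha, 2), round(n_beta, 2))
```

Because the ratio is invariant to the overall scale of the NBMO, the arbitrary normalization chosen by the null-vector routine does not affect the result.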

3.3. Topological Indices Obtained by Third Group Algorithms

After placing x in the main diagonal of the adjacency matrix, the latter is transformed into
the well known characteristic or topological matrix from Hückel quantum-chemical
theory. The respective characteristic polynomial may be obtained readily by expanding
the determinant of the topological matrix. Thus, the eigenvalues of the topological
(Hückel) matrix obtained after its diagonalization coincide with the graph spectrum [9].

Lovasz and Pelikan [57] introduced as a topological index the largest eigenvalue, $x_1$, of the
characteristic matrix.

Taking into account the topological nature of the Hückel quantum-chemical approach,
first-order perturbation theory, the free electron MO model, and valence-bond structure
resonance theory, one can classify the reactivity indices obtained by these methods as
purely topological, defined by the third type of algorithms from the topological matrix. These
are: the atomic charges, $q_\pi$, the index of free valence, $F_r$, the atomic self-polarizability, $\pi_{rr}$,
the superdelocalizability indices, $S_r$, Brown's index Z, and the localization energy, $L_r$. Space does not
permit a consideration of all these parameters, which are described in detail elsewhere
[58].

In order to characterize the similarity of a given aromatic fragment L (with $N_L$ electrons)
in a molecule to the same isolated reference in terms of the corresponding (Hückel)
density matrices P and $P_L$, a topological similarity measure, $r_L$, was introduced by
Polansky, Fratev, et al. [59,60]:

(33)

Next, this idea was generalized by Carbo and Jenkins [61,62] and Ponec [63-65]. The
latter introduced a topological similarity index, $r_{AB}$, assessing the extent of reorganization of
the electron density of a molecule during the chemical transformation. In this topological
approximation an expression similar to eq. 33 is used, but here P and $P_L$ are the density
matrices of the reactant, A, and product, B, related by the equation:

(34)

The τ-matrix is the so-called assigning table [66] describing the mutual relation of the basis
sets χ and χ' (HMO type delocalized π-molecular orbitals) of the reactant and product,
respectively:

$$\chi' = \tau\chi \tag{35}$$

The τ-matrices are well approximated by diagonal matrices, where $\tau_{ii} = \pm 1$ describe the
changes of the MOs at particular atoms in the course of the reaction.

3.4. Global, Fragment and Atomic Topological Indices

Most of the above described indices assess the topology of the whole molecule. Because
of that, they are termed global topological indices. The latter are significant for structure-reactivity
studies either when the molecules interact as a whole (e.g., polar-polar
interactions with solvent molecules) or when molecular geometry controls the substrate-receptor
complex formation. Extensive experimental results, however, indicate that
chemical reactions usually concern a localized position in the interacting species. For
these cases it is desirable to characterize topologically a fragment of a molecule instead
of the whole molecule. In order to define the fragment topological indices, one can apply to
fragments the same general procedures used for whole molecules. The indices obtained
were termed [67] internal fragment topological indices, IFTI(F). But in this way one
cannot differentiate between isomeric univalent groups with different points of attachment,
for example n-butyl and s-butyl moieties. This problem was solved by taking into
account "the interaction" between the fragment and the remainder of the molecule. The
resulting indices were termed [67] external fragment topological indices, EFTI(F). More precisely,
they are specified as the difference in value between the topological index for the whole
graph, TI(G), and the internal fragment indices for both the fragment, IFTI(F), and the
remainder of the molecule, IFTI(G-F)_k:

$$EFTI(F) = TI(G) - \left[IFTI(F) + \sum_k IFTI(G-F)_k\right] \tag{36}$$

The idea of external fragment indices will be illustrated by the Wiener index derived from
the distance (p×p symmetrical) matrix of the molecular graph. If the fragment F has p'
vertices, the IFTI(F) is defined by operations on the submatrix F, while the IFTI(G-F)
is similarly specified on the submatrix (G-F) having p-p' vertices. The EFTI indices are
defined by operations on the hatched portions of the matrix (Fig. 4a). When (G-F)
comprises two or more disjoint subgraphs, the interaction between these subgraphs (the
additional hatched portions in Fig. 4b) is not taken into account in specifying IFTI(G-F),
since they are connected only by virtue of the fragment F.

Fig. 4. A scheme for a topological matrix of a graph G with p vertices, from which a fragment F with p' vertices is selected: (a) connected remainder of the molecule; (b) disconnected remainder

Fragment topological indices can be calculated based on different graph invariants by
applying the above scheme [67].

When considering a fragment of one non-hydrogen atom, the EFTI(1) value reduces to:

$$EFTI(1) = TI(G) - IFTI(G') - a \tag{37}$$

where G' is the vertex-excised graph, i.e. the initial graph from which the given vertex
and its adjacent edges have been removed; a stands for IFTI(F) and is zero in the majority
of cases or is a constant (e.g., a = 1 for the Hosoya index Z).
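
The fragment scheme can be illustrated with the Wiener index. The sketch below is one possible reading of the matrix scheme: internal indices are taken as half-sums over the corresponding submatrices of the distance matrix of the whole graph, and each connected piece of the remainder is treated separately. The function names and the test molecule (2-methylbutane) are assumptions made for this example.

```python
from collections import deque

def distance_matrix(n_vertices, edges):
    """Topological distance matrix of the whole graph (breadth-first search)."""
    adj = [[] for _ in range(n_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = []
    for start in range(n_vertices):
        row = [None] * n_vertices
        row[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if row[w] is None:
                    row[w] = row[u] + 1
                    queue.append(w)
        dist.append(row)
    return dist

def components(vertices, edges):
    """Connected components of the subgraph induced by `vertices`."""
    vertices = set(vertices)
    adj = {v: [] for v in vertices}
    for u, w in edges:
        if u in vertices and w in vertices:
            adj[u].append(w)
            adj[w].append(u)
    seen, result = set(), []
    for start in sorted(vertices):
        if start in seen:
            continue
        seen.add(start)
        comp, queue = [], deque([start])
        while queue:
            u = queue.popleft()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        result.append(comp)
    return result

def half_sum(dist, vertices):
    """Wiener-type half-sum restricted to a submatrix of the distance matrix."""
    vs = list(vertices)
    return sum(dist[i][j] for i in vs for j in vs if i < j)

def efti_wiener(n_vertices, edges, fragment):
    """External fragment Wiener index: whole graph minus all internal parts."""
    dist = distance_matrix(n_vertices, edges)
    rest = [v for v in range(n_vertices) if v not in set(fragment)]
    internal = half_sum(dist, fragment)
    internal += sum(half_sum(dist, c) for c in components(rest, edges))
    return half_sum(dist, range(n_vertices)) - internal

ISOPENTANE = [(0, 1), (1, 2), (2, 3), (1, 4)]
print(efti_wiener(5, ISOPENTANE, [3]), efti_wiener(5, ISOPENTANE, [1]))
```

For a single atom whose removal leaves the remainder connected (vertex 3 here), the result equals the distance sum of that atom.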

For example, based on eq. 36 one can derive the distance sum index, $v_{D,i}$ (see eq. 14), if
one proceeds from the Wiener index. Though this is a general expression for the atomic
topological indices, there are many other original algorithms, as one can see in the
preceding section.
To the group of atomic indices one should add also Herndon's structure count ratio
(eq. 30), Dewar's reactivity number (eq. 32) and the whole group of HMO-reactivity
parameters.

4. Applications of Topological Indices to Chemical Reactivity

The section treating the applicability of topological parameters to chemical reactivity problems
is divided into two subsections handling the applicability of the global and
local (fragment and atomic) indices.

4.1. CHEMICAL REACTIVITY AND GLOBAL TOPOLOGICAL INDICES

The topological geometry of molecules should condition either the weak polar-polar
interactions or the geometric requirements for substrate-receptor complex formation in
which the whole molecule takes part. The first type of interactions, however, conditions
such primary effects as substrate partitioning between phases of different polarity
(log $P_{octanol/water}$), soil sorption ($K_{om}$), association coefficients, etc. These primary effects
and/or geometric requirements for fitting to the receptor cavity (thus facilitating either
hydrophobic or electronic interactions) can next determine secondary (biological) effects
such as acute toxicity, carcinogenicity, etc. Below we present examples of
successful models relating global topological parameters to primary and secondary
reactivity effects. The hydrophobicity (log P, octanol-water) of different sets of molecules
can be predicted from their TIs very effectively [68, 69]:

$$\log P = -3.127 - 1.644\,(IC_0) + 2.120\,({}^{5}\chi_{C}) - 2.914\,({}^{6}\chi_{CH}) + 4.208\,({}^{0}\chi^{v}) + 1.060\,({}^{4}\chi^{v}) - 1.020\,({}^{4}\chi^{v}_{PC}) \tag{38}$$
n = 137, $R^2$ = 0.97, s = 0.26

Log P = - 1.48 + 0.95OX (39)


r2 = 0.97 n = 138 s = 0.152
The soil sorption was found to correlate well with the first-order molecular connectivity
index for a set of 72 heterocyclic, polycyclic (substituted) aromatic hydrocarbons,
chlorinated or brominated alkanes, alkenes and phenols [70]:

$$\log K_{om} = 0.53\,{}^{1}\chi + 0.43 \tag{40}$$
n = 72; r = 0.977; s = 0.282; F = 1478

The association coefficients of chemicals describe their association with dissolved
organic matter (such as humic substances), which is an important factor for their environmental
fate. Recently, the association coefficients of a series of polychlorinated biphenyls
(PCBs) have been successfully correlated nonlinearly with the ${}^{1}\chi$ index [70]:

$$\log(\text{association coefficient}) = -21.42 + 5.30\,{}^{1}\chi - 0.25\,({}^{1}\chi)^{2} \tag{41}$$
n = 26; r = 0.974; s = 0.114; F = 213
Topological indices were found to correlate also with biological data. In general,
structure-activity models include parameters which are responsible for both main steps of
the biological process: the penetration and the stereoelectronic interaction. In these cases
the topological indices can be incorporated in the models because of their relationship
either with the penetration parameter (reflecting polar-polar interactions with the biological
phases) or with the stereoelectronic one (reflecting substrate-receptor geometric
correspondence). For example, the acute toxicity, log(1/LC50), of a series of
acetylenic and allylic alcohols was found [71] to be modelled by log P and a soft
electrophilicity parameter (the acceptor superdelocalizability, $S_i^{N}$, of the unsaturated carbon
atom):

$$\log(1/LC_{50}) = -28.59(\pm 3.37) + 0.53(\pm 0.09)\log P + 117.1(\pm 11.4)\,S_i^{N} \tag{42}$$
n = 20, a = 0.02, r = 0.923, $s^2$ = 0.179, F = 48.93

A set of topological indices, however, describes significantly the variation of log P for the
studied compounds. Thus, the correlation with ${}^{1}\chi^{v}$ can be presented by the equation:

$$\log P = -1.489(\pm 0.294) + 0.983(\pm 0.108)\,{}^{1}\chi^{v} \tag{43}$$
n = 20, a = 0.01, r = 0.907, $s^2$ = 0.281, F = 83.32

As a consequence of this, the ${}^{1}\chi^{v}$ index can successfully substitute for the hydrophobicity
parameter in the toxicity model:

$$\log(1/LC_{50}) = -31.48(\pm 3.31) + 0.61(\pm 0.09)\,{}^{1}\chi^{v} + 113.1(\pm 10.9)\,S_i^{N} \tag{44}$$
n = 20, a = 0.01, r = 0.934, $s^2$ = 0.155, F = 57.60

Hall and Kier [72] found a good correlation between aquatic toxicity (LC50) values in
fathead minnow and the third-order valence connectivity index:

$$-\log LC_{50} = 1.079\,{}^{3}\chi^{v} + 2.52 \tag{45}$$
n = 25, r = 0.903, s = 0.35, F = 101

4.2. CHEMICAL REACTIVITY AND LOCAL TOPOLOGICAL INDICES

The topological indices $f_i^{l}$ and $f_i = \lim_{l \gg 1} f_i^{l}$ are considered [35,36] as fractional atomic
charges, describing the distribution of one electron over the atoms in the molecule. This
assumption is based on the idea that each self-returning walk can be associated with
possible electron movements. The larger the number of $SRW_i$ for a specific atom, the
larger its fractional electronic charge. By the examination of a number of π-electronic
molecules it was also found [35] that $f_i = \lim_{l \gg 1} f_i^{l}$ is equal to the partial Hückel LOMO
charges:
(46)

For this reason, the product of $f_i$ and N is also called the topological charge, TC.

The fact that the indices $f_i$, TC, (N)EC, and EST (see eqs. 19 and 20) are related to atomic
topology only explains their correlation with CNDO/2 atomic charges in alkanes [36]:

$$q_i = 74.28(\pm 1.71) - 39.25(\pm 1.00)\,EST \tag{47}$$
n = 33; r = 0.990; s = 0.003; F = 1551

The correlations with TC and NEC are characterized by r = 0.952, s = 0.007, and F = 298 and
301, respectively.

For a molecule with a discrete spectrum of energy levels, $E_1, E_2, \ldots, E_N$, the n-th moment
of energy is specified by:

$$\mu_n = \sum_{i=1}^{N} E_i^{n} = \mathrm{Tr}\,(H^{n}) \tag{48}$$

where the second equality follows from the invariance of the trace of the corresponding
Hamiltonian matrix. The latter has a simple topological interpretation: it equals the
weighted sum of all self-returning walks of length n in the molecule, beginning and ending
with the same orbital (atom). This explains the above relationship between the
indices based on self-returning walks and atomic charges as well as the relationships with
the other energetic molecular and atomic characteristics [73-76].

Recently, using the moment analysis, a scheme was proposed [77] for determining the
energy and reactivity of conjugated hydrocarbons without referring to the standard
calculations of HMO theory. The π-electron energy of a molecule is here expressed by
the equation:

(49)

where the $a_l$ are tabulated coefficients determined by the truncated expansion of function

In order to assess the reactivity at a specific site in the molecule a parameter is
introduced, termed point energy, which in fact is the $SRW_i^{2l}$ index, defined by eq. 15.
Based on this parameter, reactivity rules have been introduced, describing the variation
of $SRW_i^{2l}$, and respectively of reactivity, upon systematic changes of molecular structure. For
example, it was found that the reactivity of atoms with equal valency is proportional to the
valency of their adjacent atoms.

Fig. 5. Ordering the atoms according to their "site reactivity"

Based on the presented unified energy scheme [77], energy contributions are assigned to
different fragments (point-energy, edge-energy, ring resonance energy) which are used for
rationalizing the aromaticity, reactivity and bond length of conjugated hydrocarbons.

Herndon's structure count ratio, $SCR_i$, as well as Dewar's reactivity number, $N_i$, are
constructed similarly and can be used to compare reactivity at different positions of a
molecule (as well as at a specific position in a reaction series). Classic examples here are
naphthalene and biphenylene:

                      naphthalene α   naphthalene β   biphenylene α   biphenylene β
SC_R                        3               3               5               3
SC_I                        7               6              11               8
SCR                       2.33            2.00            2.20            2.67
N (in units of β)         1.81            2.12            2.00            1.73

Fig. 6. Calculated structure count ratio, $SCR_i$, and Dewar's reactivity number, $N_i$, for the α- and β-positions of naphthalene and biphenylene

As one can see from Fig. 6, according to the structure count ratio and the reactivity number
(localization energy), the reactivity of the naphthalene α-position is higher than that of the
β-position: $SCR_\alpha > SCR_\beta$ and $N_\alpha < N_\beta$. Conversely, for biphenylene the β-position is more
reactive than α: $SCR_\beta > SCR_\alpha$ and $N_\beta < N_\alpha$. Both predictions correspond to the
experimental observations for the local reactivity of these molecules.

A qualitative evaluation of the convulsant-anticonvulsant activity of barbiturates and the
carcinogenic activity of the polycyclic aromatic hydrocarbons was recently done by using
the vertex distance complexity index and its normalized version [43]. The index values
are computed for the atoms of each compound in the series. Then the computed values
are arranged in decreasing order, and the ranges of values contributed by active (active
range) and inactive (inactive range) molecules are specified. It is assumed that these
ranges contain those vertices whose topological environment distinguishes active from
inactive compounds. Based on the occurrence of the index values of their atoms within
the active range, the accuracy of prediction of the molecules' activity is specified.

Next we demonstrate the application of the topological similarity index
introduced by Polansky and Fratev [60,61] and generalized by Carbo [62] and Ponec
[64-66] in predicting the pathway of pericyclic reactions [64]. The electrocyclic transformation
of 1,3-butadiene to cyclobutene is considered. Both molecules can be described by their
bonding MOs, $\Phi_A$ and $\Phi_B$. The first one is constructed from Hückel π-MOs whereas the
second one from localized bonds:

(50)

The corresponding density matrices $P_A$ and $P_B$ can then be written:

$$P_A = \begin{pmatrix} 1 & 0.894 & 0 & -0.447 \\ 0.894 & 1 & 0.447 & 0 \\ 0 & 0.447 & 1 & 0.894 \\ -0.447 & 0 & 0.894 & 1 \end{pmatrix}, \qquad P_B = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix} \tag{51}$$

The next step consists of transforming $P_B$ from the basis of atomic orbitals χ' into the
basis χ (by eq. 33) serving simultaneously for the description of $P_A$. Two such matrices
can be constructed corresponding to the allowed conrotatory and forbidden disrotatory
cyclization:

$$\tau_{con} = \mathrm{diag}(1, 1, 1, -1), \qquad \tau_{dis} = \mathrm{diag}(1, 1, 1, 1) \tag{52}$$

The similarity indices calculated with these matrices for the two pathways are $r_{con}$ = 0.723 and
$r_{dis}$ = 0.500; in other words, the similarity of the reaction partners is larger for the
conrotatory than for the disrotatory cyclization. On the other hand, according to the least-motion
principle, the easy course of the reaction is connected with the requirement of
minimal variation of the electronic configurations of the reacting molecules. Hence,
one can conclude from the above r-values that the more likely reaction is the
conrotatory cyclization, which corresponds to the Woodward-Hoffmann rules as well as to
experimental results.
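
The two similarity values can be reproduced with a few lines of code. The sketch below rests on several assumptions that are not taken from the text: the product matrix is the localized-bond matrix with a σ bond between atoms 1 and 4 and a π bond between atoms 2 and 3, the assigning tables are diag(1, 1, 1, -1) for the conrotatory and the unit matrix for the disrotatory path, and the index is normalized as Tr(P_A P_B')/(2N) with N = 4 electrons, a choice under which identical matrices give r = 1. All function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diag(values):
    """Diagonal matrix built from a list of diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

def similarity(p_a, p_b, tau_diagonal):
    """Tr(P_A * tau * P_B * tau) / (2N); a diagonal +-1 tau is its own inverse."""
    tau = diag(tau_diagonal)
    p_b_transformed = mat_mul(mat_mul(tau, p_b), tau)
    product = mat_mul(p_a, p_b_transformed)
    trace = sum(product[i][i] for i in range(len(product)))
    n_electrons = sum(p_a[i][i] for i in range(len(p_a)))
    return trace / (2 * n_electrons)

P_A = [[1, 0.894, 0, -0.447],       # butadiene (Hueckel bond orders)
       [0.894, 1, 0.447, 0],
       [0, 0.447, 1, 0.894],
       [-0.447, 0, 0.894, 1]]
P_B = [[1, 0, 0, 1],                # cyclobutene (assumed localized bonds)
       [0, 1, 1, 0],
       [0, 1, 1, 0],
       [1, 0, 0, 1]]

r_con = similarity(P_A, P_B, [1, 1, 1, -1])
r_dis = similarity(P_A, P_B, [1, 1, 1, 1])
print(round(r_con, 3), round(r_dis, 3))
```

With the rounded bond orders used here, the conrotatory value agrees with the quoted 0.723 to within rounding, and the disrotatory value is 0.500.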

Despite some limitations, the topological similarity approach is promising for the
formulation of selection rules in chemical reactivity.

Space does not permit a discussion of the application of HMO reactivity indices, which
undoubtedly had a large impact on calculating and predicting chemical properties [59].
We refer here only to some more recent applications of these topology-based reactivity
parameters. During the past decade, for example, they were applied to the identification
of the molecular fragment of polycyclic aromatic hydrocarbons (PAH) most susceptible
to carcinogenic metabolism. For this purpose, Jerina and Lehr [78] have calculated the
ease of carbonium ion formation from various dihydrodiol epoxides by estimating the
respective increase in HMO-delocalization energy. Seybold and Smith [79] have used the
net π-electron charge at the benzyl carbon of an ionized bay region [80] of the studied
dihydrodiol epoxides. Soriano et al. [81] have assessed the sums of two atomic (HMO)
superdelocalizabilities for three particular regions of PAH, termed A, K, and L.

5. Conclusions

Here, we have introduced two different classifications of TIs. The first one is based on
which aspects of the chemical graph they quantify. The global, fragment, and atomic
graph invariants represent different geometrical information regarding the portion of the
molecular graph they characterize. Thus they are capable of modeling distinct types of
reactivities. The second classification is based on the specificity of algorithms translating
the mathematical representation of molecular graphs into topological quantities. The first
two groups of algorithms, using simple mathematical functions and combinatorial
procedures, produce the conventional TIs known from chemical graph theory. The TIs
obtained by applying the third group of algorithms (diagonalizing particular graph
matrices) are in fact the well-known reactivity indices from Hückel theory. Although
produced by different algorithms, the TIs from the three groups have a common
foundation, namely the chemical graph and its mathematical representation.

We suggest that in correlation analysis TIs should be used in conjunction with models of
the reactive process and plausible hypotheses about the mechanism of the reaction under
investigation. Such an approach will explain why a particular TI correlates with reactivity
in certain cases but fails to do so in others.
TOPOLOGICAL INDICES AND CHEMICAL REACTIVITY 237

TIs will have a predominant role in determining molecular reactivity where the metric and
electronic aspects of molecular architecture play only a minor role in the reaction
process. Thus an understanding of the mechanism of the particular reaction process under
investigation is critical to the selection of a proper set of TIs for modelling reactivity.
Initially, it is advisable to use TIs in modelling the primary effects. This will clarify why
a particular TI is able to predict some property reasonably well. Subsequently, these TIs
should be used in predicting secondary effects.

Finally, the initial set of TIs used in predictive models should not be so large that the
probability of chance correlation becomes high [82]. Rational choice of a minimal
set of TIs on the basis of understanding the reaction process is one solution to this
problem. On the other hand, the presence of too many TIs in the final models (even if
some statistical criteria permit that) can also cause problems, because the relation between
the TIs and the reaction mechanism is lost.

Acknowledgment

One of the authors (SCB) was supported in part by cooperative agreement No. EPA/CR
819621-01-0 from the United States Environmental Protection Agency. Contribution
Number 104 from the Center for Water and the Environment of the Natural Resources
Research Institute.

6. References

1. G. Klopman and R.F. Hudson, Theor. Chim. Acta, 8, 165 (1967).
2. G. Klopman, J. Am. Chem. Soc., 90, 223 (1968).
3. G. Klopman, in: G. Klopman (Ed.), Chemical Reactivity and Reaction Paths, John
Wiley & Sons, New York, 1974.
4. R.G. Woolley, J. Am. Chem. Soc., 100, 1073 (1978).
5. P. Claverie, S. Diner, Israel J. Chem., 19, 54 (1980).
6. M. Zander, Top. Curr. Chem., 153, 101 (1990).
7. I. Gutman and N. Trinajstic, Top. Curr. Chem., 42, 49 (1973).
8. D.H. Rouvray, J. Chem. Educ., 52, 768 (1975).
9. N. Trinajstic, Chemical Graph Theory, CRC Press, Boca Raton, Florida, Volumes
I and II, 1983.
10. F. Harary, Graph Theory, Addison-Wesley, Reading, Mass., 1971, p. 84.
11. D.H. Rouvray and A.T. Balaban, in: Applications of Graph Theory, R.J. Wilson
and L.W. Beineke, Eds., Academic Press, London, 1979, p. 177.
12. L. Spialter, J. Chem. Doc., 4, 261 (1964); ibid. 4, 269 (1964).
13. L.P. Hammett, Physical Organic Chemistry, McGraw-Hill, New York, 1940, p. 348.
14. S. Wold, M. Sjostrom, in: Correlation Analysis in Chemistry, Plenum, New York,
1978, Chapt. 1.
15. M.J.S. Dewar, R.C. Dougherty, The PMO Theory of Organic Chemistry, Plenum,
New York, 1975.
16. G.W. Klumpp, Reaktivität in der Organischen Chemie, Thieme, Stuttgart, 1978,
Vol. 2, pp. 367-369.
17. A. Balaban, I. Motoc, D. Bonchev and O. Mekenyan, Top. Curr. Chem., 114, 21
(1984).
18. D. Bonchev, Information Theoretic Indices for Characterization of Chemical
Structure, Research Studies Press, Chichester, U.K., 1983.
19. A. Sabljic, N. Trinajstic, Acta Pharm. Jugosl., 31, 189 (1981).
238 O. MEKENYAN AND S. C. BASAK

20. M.I. Stankevich, I.V. Stankevich and N.S. Zefirov, Usp. Khim., 57, 337 (1988).
21. M. Zander, Naturwissenschaften, 69, 436 (1982).
22. D. Bonchev, O. Mekenyan, H. Fritsche, J. Cryst. Growth, 49, 90 (1980).
23. J. Barton, in: Sintering and Catalysis, G.C. Kuczynski, Ed., Plenum, New York,
1977, pp. 17-27.
24. I. Gutman, B. Ruscic, N. Trinajstic, and C.F. Wilcox, Jr., J. Chem. Phys., 62, 3339
(1975).
25. I. Gutman and N. Trinajstic, Chem. Phys. Lett., 17, 535 (1972).
26. M. Randic, J. Am. Chem. Soc., 97, 6609 (1975).
27. L. Kier, L. Hall, W. Murray, and M. Randic, J. Pharm. Sci., 64, 1971 (1975).
28. L. Kier and L. Hall, J. Pharm. Sci., 65, 1226 (1976).
29. L. Kier and L. Hall, Molecular Connectivity in Chemistry and Drug Research,
Academic, New York, 1976.
30. H. Wiener, J. Am. Chem. Soc., 69, 17 (1947); 69, 2336 (1947); J. Chem. Phys.,
15, 766 (1947); J. Phys. Chem., 52, 425 (1948); 52, 1082 (1948).
31. A.T. Balaban, Pure Appl. Chem., 55, 199 (1983).
32. A.T. Balaban, Chem. Phys. Lett., 89, 399 (1982).
33. O. Polansky and D. Bonchev, MATCH, 21, 135 (1988).
34. D. Bonchev, O. Mekenyan and O.E. Polansky, in: Graph Theory and Topology in
Chemistry, R.B. King and D.H. Rouvray, Eds., Elsevier, Amsterdam, 1987, p. 126.
35. D. Bonchev, L.B. Kier, and O. Mekenyan, Int. J. Quant. Chem. (in press).
36. D. Bonchev and L.B. Kier, Journal of Mathematical Chemistry, 9, 75 (1992).
37. H.L. Morgan, J. Chem. Doc., 5, 107 (1965).
38. O. Mekenyan, A.T. Balaban and D. Bonchev, J. Magn. Res., 63, 1 (1985).
39. L.B. Kier and L.H. Hall, Pharm. Res., in press.
40. C. Shannon, W. Weaver, Mathematical Theory of Communication, Univ.
Illinois Press, Urbana, 1949.
41. L. Brillouin, Science and Information Theory, Academic, New York, 1956.
42. D. Bonchev and N. Trinajstic, J. Chem. Phys., 67, 4517 (1977).
43. C. Raychaudhury, S.K. Ray, J.J. Ghosh, A.B. Roy and S.C. Basak, J. Comput.
Chem., 5, 581 (1984).
44. S.C. Basak, A.B. Roy and J.J. Ghosh, Proceedings of the IInd International
Conference on Mathematical Modelling (Eds. X.J.R. Avula, R. Bellman, Y.L. Luke
and A.K. Rigler), pp. 851-856, University of Missouri-Rolla, Rolla, Missouri, USA
(1979).
45. S.C. Basak and V.R. Magnuson, Arzneimittel-Forschung/Drug Research, 33, 479
(1983).
46. A.B. Roy, S.C. Basak, D.K. Harriss and V.R. Magnuson, in: Mathematical
Modelling in Science and Technology, X.J.R. Avula, R.E. Kalman, A.I. Liapis and
E.Y. Rodin (Eds.), p. 745, Pergamon Press, 1984.
47. W.Y. Yee, K. Sakamoto, Y.J. I'Haya, Rept. Univ. Electro-comm., 27, 53 (1976).
48. K. Sakamoto, W.Y. Yee, Y.J. I'Haya, Rept. Univ. Electro-comm., 27, 227 (1977).
49. H. Hosoya, K. Kawasaki, and W.J. Murray, J. Pharm. Sci., 64, 1974 (1975).
50. W.C. Herndon, J. Org. Chem., 40, 3583 (1975).
51. W.C. Herndon, Israel J. Chem., 20, 270 (1980).
52. M.J.S. Dewar and R.C. Dougherty, The PMO Theory of Organic Chemistry,
Plenum Press, New York, 1975.
53. W.C. Herndon, Tetrahedron, 29, 3 (1973).
54. H.C. Longuet-Higgins, J. Chem. Phys., 18, 265, 275, 283 (1950).
55. W.C. Herndon, Tetrahedron, 39, 1389 (1982).
56. H. Kuhn, Helv. Chim. Acta, 31, 1441 (1948); Helv. Chim. Acta, 32, 2247 (1949).
57. L. Lovasz and J. Pelikan, Period. Math. Hung., 3, 175 (1973).

58. A. Streitwieser, Jr., Molecular Orbital Theory for Organic Chemists, Wiley, New
York, 1961.
59. F. Fratev, O.E. Polansky, A. Melhorn, V. Monev, J. Mol. Struct., 56, 245 (1979).
60. O.E. Polansky, G. Derflinger, Int. J. Quant. Chem., 1, 379 (1967).
61. R. Carbo, L. Leyda, M. Arnau, Int. J. Quant. Chem., 17, 1185 (1980).
62. P.E. Bowen-Jenkins, L.D. Cooper, W.G. Richards, J. Phys. Chem., 89, 2195
(1985).
63. R. Ponec, Coll. Czech. Chem. Commun., 52, 555 (1987).
64. R. Ponec, Z. phys. Chemie, 270, 365 (1989).
65. R. Ponec, J. Phys. Org. Chem., 4, 701 (1991).
66. R. Ponec, Coll. Czech. Chem. Commun., 29, 455 (1984).
67. O. Mekenyan, D. Bonchev, and A.T. Balaban, J. Math. Chem., 2, 347 (1988).
68. S.C. Basak, G.J. Niemi and G.D. Veith, J. Math. Chem., 4, 185 (1990).
69. W.J. Murray, L.H. Hall and L.B. Kier, J. Pharm. Sci., 64, 1978 (1975).
70. A. Sabljic, in: Practical Applications of Quantitative Structure-Activity
Relationships (QSAR) in Environmental Chemistry and Toxicology, W. Karcher
and J. Devillers, Eds., Kluwer, Dordrecht, 1990, pp. 61-82.
71. O. Mekenyan, G. Veith, S. Bradbury and C. Russom, Quant. Str.-Act. Relat., 12,
132 (1993).
72. L.H. Hall and L.B. Kier, Environ. Toxicol. Chem., 8, 19 (1989).
73. J.K. Burdett, S. Lee, and W.C. Sha, Croat. Chem. Acta, 57, 1193 (1984).
74. J.K. Burdett and S. Lee, J. Am. Chem. Soc., 107, 3050, 3063 (1985).
75. J.K. Burdett, Struct. Bonding (Berlin), 65, 10 (1987).
76. J.K. Burdett, Chem. Reviews, 88, 1 (1988).
77. Y. Jiang and H. Zhang, Theor. Chim. Acta, 75, 279 (1989).
78. R.E. Lehr, D.M. Jerina, Tetrahedron Letters, 24, 27 (1983).
79. I.A. Smith, G.D. Berger, P.G. Seybold, M.P. Serve, Cancer Research, 1978, pp.
2968-2977.
80. R.E. Lehr, A.W. Wood, in: Polynuclear Aromatic Hydrocarbons: Physical and
Biological Chemistry, M. Cook, A.J. Dennis, G.L. Fisher, Eds., Battelle Press,
Columbus, Ohio, 1982, pp. 21-37.
81. D.S. Soriano, J.A. Daeger, D. Robbins, W. Confer, and V. Soriano, J. Environ.
Sci. Health, A25(3), 277 (1990).
82. J.G. Topliss and R.P. Edwards, J. Med. Chem., 22, 1238 (1979).

GRAPH-THEORETICAL MODELS OF COMPLEX REACTION
MECHANISMS AND THEIR ELEMENTARY STEPS

OLEG N. TEMKIN¹, ANDREY V. ZEIGARNIK¹,
AND DANAIL BONCHEV²
¹Lomonosov Institute of Fine Chemical Technology
Laboratory of Chemical Kinetics and Catalysis
Pr. Vernadskogo 86, Moscow 117571, Russia
²Higher Institute of Chemical Technology
Department of Physical Chemistry
Burgas 8010, Bulgaria

1. Introduction
Chemical reactivity, which can be viewed as the capability of chemical species of any
kind to undergo chemical transformations, has always been a key problem in theoretical
chemistry. Any progress in understanding reactivity not only enriches chemical
knowledge but also has important practical implications. Numerous methods have been
developed to assess reactivity quantitatively, and it is not the aim of this chapter to
review all of them. Yet, some basic approaches are to be mentioned. These include first
of all various quantum chemical concepts and results, such as the classical work of
Dewar and Simonetta,1,2 the frontier orbital theory of Fukui et al.,3-5 orbital symmetry
principles,6 the isolobal concept of Hoffmann,7 and the molecular hardness (softness) concept
of Parr and Pearson.8-10 Correlation analysis has also contributed greatly to this area.11-14
The experimental reactivity measure most commonly used is the rate constant of the
reaction in which the compound of interest is involved. Advances in contemporary
experimental techniques have made possible precise direct measurements of the rate
constants of chemical reactions, including elementary reactions of electron15,16 and
proton17,18 transfer, and steps involving radicals,19 ions,20 and metal complexes.21-24

This chapter centers on progress in the mechanistic and kinetic studies of complex
reactions as a source of information on the reactivity of intermediates and their
elementary reaction steps. Estimation of kinetic constants (known as the inverse problem
in chemical kinetics) is an area of intensive study.25 However, the correct and exhaustive
formulation of this problem presumes progress in two related areas of research: the
formulation of mechanistic hypotheses and their discrimination based on experimental
data. The necessity of these two preliminary stages of investigation matches the modern
view of scientific method,26,27 known as the hypothetico-deductive method. This method
is advocated in our recent work,28 in which we developed a rational methodology for
kinetic and mechanistic studies and showed that the formulation of mechanistic hypotheses
is a key procedure in chemical mechanistic studies. Computer assistance in advancing
hypotheses makes it possible to accomplish this task precisely and exhaustively.

D. Bonchev and O. Mekenyan (eds.), Graph Theoretical Approaches to Chemical Reactivity, 241-275.
© 1994 Kluwer Academic Publishers.

In view of this methodology, the mechanistic approach to chemical reactivity may be
presented as the following sequence:

Understanding and predicting chemical reactivity ← Correlation analysis (including
QSAR) ← Kinetics of elementary reactions ← Overall kinetics of multistage
reaction ← Advancing mechanistic hypotheses and evaluating them

The choice of the experimental method for evaluating mechanistic hypotheses and the
structure of the overall kinetic law of a multistage reaction is largely influenced by the
topological structure of the mechanism.29 This structure can be depicted in the form of
a graph. In many cases, the specific features of this graph (and the related topological
structure) produce specific kinetics of the overall reaction,30 and such studies in the
graph-theoretical modeling of reaction mechanisms may find important applications.
Therefore, this chapter will elucidate the progress in these areas of research.

2. Graph-Theoretical Approach to Studies in the Elementary Steps of Complex Reactions

2.1. THE NOTION OF ELEMENTARY STEP

At present, numerous chemical reactions are widely believed to occur via several
consecutive elementary reactions. Such a set of elementary reactions is usually termed
the reaction mechanism. Mechanistic studies, which provide predictions of the behavior
of complex reactions in wide intervals of varied parameters, help shed light on the nature
of complex reactions and, more generally, on the self-organization phenomena. The
concept of a multistep mechanism for catalytic reactions is now commonly accepted.
Such developments make elementary reactions and reaction mechanisms a topic of
ever-growing interest. However, there is no general definition of elementary reaction as a
basic unit of a complex reaction. The definitions proposed so far have limited
applicability, since they account for different aspects of this phenomenon. Therefore, in
this section, we will review briefly some of the previous attempts at defining an
"elementary reaction" and present our original concept.

There are two major areas of mechanistic studies, both aimed at developing the notion
of elementary step. The classical formal-kinetic approach regards the elementary
reaction as a repeated reproduction of a unit act. This is supposed to be a transformation
of some (usually no more than two) reacting species, which results from their collision,
or a molecular rearrangement of a single species. The elementary reaction is thus
characterized by a single transition state and no intermediates. It is also supposed to run
via concerted bond change and in a certain direction, i.e., the inverse reaction is assumed
to be a separate process.

The formal-logical approach aims at an exhaustive description of the potential reaction
mechanisms and considers the real nature of an elementary reaction. Information on the
complex pattern of a reaction, regarded as necessary for designing experiments, as well
as for catalytic activity predictions, can be gained by generating hypotheses concerning
the mechanism.28 Deductive solution of these problems became possible with computer
assistance.

The formal-logical approach, however, introduces severe restrictions on the types of bond
redistribution (bond making or breaking). There are only two common principles for
describing elementary reactions. These are the well-known classical principle of the
minimum structure change31,32 and the self-explaining principle of the minimum
reaction participants in elementary reactions, which is grounded in collision theory.33

The first of these principles was formulated mathematically as the principle of the
minimum chemical distance (PMCD).34-37 Chemical distance is a quantitative measure
of chemical dissimilarity. For many reactions, chemical distance can be represented within
the mathematical model of the logical structure of constitutional chemistry, proposed by
Ugi and co-workers.34,38 A central point in this model is the so-called reaction matrix,
which describes the pattern of electron and bond redistribution in the course of the
reaction. This theory provides the mathematical ground for studies on the generation of
reaction mechanisms;39-42 however, it does not suggest a mathematically rigorous
definition of elementary reactions. Thus, the problem of describing the possible types of
electron redistribution and bond change in elementary reactions remained open.

Several attempts to fill this gap are worth mentioning. The contemporary formulation of
the valence bond method43 provides a convenient basis for this purpose. This method
proceeds from the classical postulates for bicentered bonds, and assumes that reactions
occur via bond-breaking or bond-making. A critical review of these principles45
emphasizes the two possibilities for bond-breaking or bond-making: the heterolytic and
homolytic ones. However, the question of how many bonds are involved in the bond
change process remained open. The approach proposed by Zefirov and Trach45-50 was
based on the Woodward-Hoffmann rules6,51 for multicentered processes with cyclic electron
redistribution. However, the elementary nature of pericyclic reactions is debatable. In
another approach, Dewar3 discussed the problem of possible bond change in concerted
and synchronous processes. He termed any reaction involving one bond-breaking and/or
one bond-making a "one-bond reaction", assuming multibond reactions to proceed via
several bond-breaking and -making acts. Dewar also presented evidence for the
assumption that multibond processes cannot normally be synchronous. However, there
are exceptions to Dewar's rules, i.e., multibond processes allowed by the
Woodward-Hoffmann rules or Evans' principle,54-56 as well as E2 and SN2 reactions.3 It
should be noted that elementary reactions are not necessarily synchronous. However, they
must be concerted, i.e., they must proceed via a single kinetic step without intermediates.
A synchronous process requires all bond changes to proceed at the same time. Finally, an
approach developed by Koca et al.37 assumes that an elementary reaction is nothing but
one bond-making (or breaking) that occurs heterolytically. Clearly, such a simplification
is chemically untrue.

One may conclude then that several central questions remain open:
(i) Which types of electron redistribution occur in elementary reactions?
(ii) How many bonds of each atom can be formed and/or broken in a concerted process?
(iii) How many bonds can be totally formed and/or broken in a concerted manner in a
single kinetic step?

In what follows, we summarize our approach to the solution of these problems. A
complete description of the mathematical formalism used and the results obtained is given
elsewhere.57

2.2. TOPOLOGICAL IDENTIFIER - A GRAPH-THEORETICAL CONCEPT FOR
IDENTIFYING ELEMENTARY REACTIONS

2.2.1. Definitions

In graph theory, each molecule can be represented by a molecular (constitutional) graph.
The latter is composed of a set of vertices {V_i} and edges {E_k}, corresponding to the set
of atoms and bonds of the molecule, respectively. If there is a bond between atoms A
and B, the respective graph vertices V_A and V_B are said to be adjacent. The set of reacting
molecules will be called here a reaction system (RS) or, in the terms of Ugi,34 an
ensemble of molecules (EM). The RS at the beginning and at the end of the reaction can also
be represented by graphs, which are not necessarily connected. A chemical equation
written so as to depict the reagents and products in the form of graphs we call a
graph-scheme.57 (Zefirov and Trach used for this purpose the term symbolic equation, making
use of some additional symbols to characterize electron redistribution. In our work these
details are omitted and the symbols for ions and radicals are not taken into account.)

After deleting all vertices and edges that correspond to atoms and bonds taking no part
in the bond redistribution process, one obtains the so-called simplest graph-scheme of
a reaction. The reaction systems at the beginning and at the end of the reaction in the
simplest graph-scheme we call the initial graph G_in and the final graph G_f, respectively. The
graph G_top, which results from embedding G_f on G_in, is called here the topology identifier
(Fig. 2.1).
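
The superposition of G_in and G_f can be sketched in a few lines of Python. This is only an
illustration under simplifying assumptions, not the formalism of ref. 57: the helper names
(edge, topology_identifier, degree_change) are hypothetical, bonds are treated as simple
edges (bond orders are ignored), and the example is the ligand substitution
L-M + L' -> L + M-L', one of the acyclic three-center schemes of Fig. 2.2b.

```python
from collections import Counter


def edge(a, b):
    """Undirected bond, stored as a frozenset so that (a, b) == (b, a)."""
    return frozenset((a, b))


def topology_identifier(g_in, g_f):
    """Superimpose the initial and final graphs of a simplest graph-scheme.

    Each edge of the result is labelled 'break' (present only in g_in)
    or 'make' (present only in g_f). Bonds present in both graphs take
    no part in the bond redistribution and are dropped.
    """
    unchanged = g_in & g_f
    g_top = {e: "break" for e in g_in - unchanged}
    g_top.update({e: "make" for e in g_f - unchanged})
    return g_top


def degree_change(g_top):
    """Net change in vertex degree for every reaction center."""
    change = Counter()
    for e, label in g_top.items():
        for v in e:
            change[v] += 1 if label == "make" else -1
    return dict(change)


# Assumed example: L-M + L' -> L + M-L'
g_in = {edge("L", "M")}
g_f = {edge("M", "L'")}
g_top = topology_identifier(g_in, g_f)
print(sorted((sorted(e), label) for e, label in g_top.items()))
print(degree_change(g_top))
```

In this sketch the metal center M breaks one bond and makes another (net change 0), while
L loses and L' gains one bond; G_top is the path L-M-L'.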

Fig. 2.1. The stages in building a topological identifier of a reaction system. a) chemical
equation; b) the graph-scheme containing the three reagents and the reaction product; c)
the simplest graph-scheme, describing only the bond changes by means of the initial and
final graphs, G_in and G_f; d) the topology identifier G_top built by superimposing G_in and G_f.

2.2.2. Topological Aspects of the Bond Change Character

It was shown recently46 that there are only two simple topologies of bond change, linear
and cyclic, whereas more complex topologies can be decomposed into these two
simple cases. Fig. 2.2 illustrates the simple topologies for 3-6-centered chemical
transformations.

It seems worthwhile to investigate the correspondence between the above-mentioned
topology classes and the nature of the bond changes in elementary reactions. This
problem as yet has no formal-logical solution, as discussed in section 2.1. However, we
proposed heuristic rules that allow one to approach the problem mathematically.s An
extensive literature search of over 400 papers on catalysis by metal complexes and
organometallic chemistry uncovered about 3000 classes of different elementary reactions.
Only five types of reaction centers (or bond change types) were extracted from this vast
database for elementary reaction types. They are shown in Table 2.1. This finding is in
accordance with Tolman's rules,S and we suppose that it may be extended to organic
reactions.

Table 2.1. Possible types of reaction centers and their characteristics

No   Reaction center type                           Change in        Required vertex
                                                    vertex degree    degree in G_in
1    Breaking of one bond and making another one     0               1
2    Making of one bond                              1               0
3    Making of two bonds                             2               0
4    Breaking of one bond                           -1               1
5    Breaking of two bonds                          -2               2

In analyzing complex reaction topologies, Zeigarnik and Temkin57 showed that in these
cases the topology identifiers must not contain vertices whose degree exceeds two.
Hence, as seen in Table 2.1, no reaction center changes more than two bonds. Therefore,
the topology identifier for a graph-scheme of an elementary reaction must be either a circuit
or a path (in Harary's terms).s This makes possible the enumeration of the elementary
reaction types, i.e., the enumeration of the simplest graph-schemes corresponding to
elementary reactions.
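
The path-or-circuit requirement is easy to test mechanically. The following Python sketch
is an illustration only; the function name classify_topology and the vertex labels are
hypothetical, and the graph is assumed to be simple (no loops or multiple edges). It
reports "complex" whenever a vertex of degree greater than two is present or the graph
is not connected.

```python
from collections import defaultdict


def classify_topology(edges):
    """Return 'path', 'circuit', or 'complex' for a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    if not adj:
        return "complex"
    # A reaction center may change at most two bonds.
    if any(len(nbrs) > 2 for nbrs in adj.values()):
        return "complex"
    # Connectivity check by depth-first search.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    if len(seen) != len(adj):
        return "complex"
    # Connected with maximum degree two: two end vertices mean a path,
    # no end vertices mean a circuit.
    ends = sum(1 for nbrs in adj.values() if len(nbrs) == 1)
    return "path" if ends == 2 else "circuit"


print(classify_topology([("L", "M"), ("M", "L'")]))            # linear, N = 3
print(classify_topology([("a", "b"), ("b", "c"), ("c", "a")]))  # cyclic, N = 3
print(classify_topology([("O", "a"), ("O", "b"), ("O", "c")]))  # degree three
```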

Fig. 2.2a. Bond change topologies for 3-6 centered chemical transformations (N = 3-6, schemes 3.1a-6.3a): cyclic topology

Fig. 2.2b. Bond change topologies for 3-6 centered chemical transformations (N = 3-6, schemes 3.1b-6.3b): acyclic topology

2.2.3. Enumeration of Elementary Reaction Types

The graph-scheme enumeration problem reduces to the problem of enumerating
edge-colored graphs obtained by coloring the topology identifier G_top. One of the colors
corresponds to bond-making, and the other to bond-breaking. The details of the
enumeration procedure, which include the determination of the automorphism group of
the G_top graph, the figure counting series, and the algorithm for constructing
graph-schemes, are presented elsewhere.57 (We direct the reader interested in reaction
enumeration to the recent publications of Fujita.60,61) Here we present part of the results
obtained by us (Figs. 2.3-2.5).
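
For small G_top the two-color edge enumeration can be reproduced by brute force. The
sketch below is not the procedure of ref. 57: it replaces the figure counting series by
direct generation of all colorings followed by reduction under the symmetries of a path
or a circuit, and it ignores any further chemical restrictions. All function names are
hypothetical; "M" marks a bond being made and "B" a bond being broken.

```python
from itertools import product


def path_symmetries(n_edges):
    """Edge permutations of a path: identity and end-to-end reversal."""
    idx = tuple(range(n_edges))
    return [idx, idx[::-1]]


def cycle_symmetries(n_edges):
    """Edge permutations of a circuit: all rotations and reflections."""
    syms = []
    for shift in range(n_edges):
        rot = tuple((i + shift) % n_edges for i in range(n_edges))
        syms.append(rot)
        syms.append(rot[::-1])
    return syms


def canonical(coloring, symmetries):
    """Smallest image of a coloring under the given symmetries."""
    return min(tuple(coloring[i] for i in perm) for perm in symmetries)


def enumerate_colorings(n_edges, symmetries):
    """All symmetry-distinct colorings of the edges with 'B' and 'M'."""
    classes = {canonical(c, symmetries) for c in product("BM", repeat=n_edges)}
    return sorted(classes)


def delta_q(coloring):
    """Total bond change: bonds made minus bonds broken."""
    return coloring.count("M") - coloring.count("B")


for c in enumerate_colorings(2, path_symmetries(2)):   # linear, 3 centers
    print("path   ", "".join(c), delta_q(c))
for c in enumerate_colorings(3, cycle_symmetries(3)):  # cyclic, 3 centers
    print("circuit", "".join(c), delta_q(c))
```

Under these assumptions a three-center path has three distinct colorings and a
three-center circuit has four; the lists in Figs. 2.3-2.5 may differ because they reflect
the additional constraints of the original procedure.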


Fig. 2.3. All possible types of elementary reactions involving three reaction centers. Δq
stands for the total bond change (the difference between the numbers of bonds formed and
broken in the reaction).

Fig. 2.4. All possible types of elementary reactions involving four reaction steps.


I/)
C""I ~
N N
II II
II II II 0"
0" 0"
0" 0" <l
<l

c c
<l <l
<l

\..-. (: (J ~
~
C)
c
·Bg

.
11 11 11 11 e:!
11 ~
• • • • I
/ •
~
· ·• I • • • • .~
• • • • • • !!l
.g
g
....
Q)

i~
0
II
0
II
0
II
-II
-0"
II

Q)
......
0
0"
0"
<l <l
0"
<l
0"
<l <l '"
Q)

/---.
• • (: /. .) 1...1
...s.I
P-
'"'"0
........
c..

• ~
I/)

N
.9P
~

11 11

<:
11 11 11

• • ......... "-.
.
~

~. •~ \ \..
GRAPH-THEORETICAL MODELS OF COMPLEX REACTION MECHANISMS 251

2.3. SOME GENERALIZATIONS

Several rules have been used so far to identify elementary reactions:

(i) The principle of transition state (activation barrier) uniqueness. Although rigorous,
this rule cannot be well defined mathematically. Nothing can be said about the activation
barrier uniqueness proceeding from the bond change analysis.
(ii) The principle of minimum reaction participants: reaction molecularity should not
exceed three.

(iii) The principle of minimum structure change, and its modern version, the principle
of minimum chemical distance. Strong objections could be made to the applicability of
this principle to elementary reactions. Different mechanistic schemes may involve, with
the same probability, elementary reactions with arbitrary, but not very large, chemical
distances. The calculation of chemical distances says nothing about the probability or
improbability of a mechanistic scheme.

Our studies have led to the formulation of two more principles. They have not been
rigorously proved but are based on extensive observations in the area of catalysis by
metal complexes. We suppose that their validity might be extended to all elementary
reactions, anticipating that future quantum chemical evidence will support them. The
first one was already discussed above in connection with Table 2.1 and Figs. 2.2-2.5. It can be
summarized as follows:

(iv) The principle of simple bond change topology: Elementary reactions obey either
linear or cyclic bond change topology.

The cases of non-simple bond change topology are readily detected by the presence of
vertices in G_top whose degree is greater than two. Such vertices correspond to centers in
the activated complex that break and/or form more than two bonds. For example, the
step of alkene epoxidation by metal alkyl peroxides62 should not be regarded as
concerted, owing to the presence of an oxygen atom that breaks two bonds and forms
two new ones. The corresponding vertex in G_top is of degree four.

The second new principle was deduced from the examination of the graph-scheme lists
(Figs. 2.2-2.5) and their comparison with reference data. Because no concerted reactions
with |Δq| > 1 were found, we came to the conclusion that the factor determining the
lack of such reactions is the imbalance between bond making and bond breaking. Hence,
the following formulation of this principle resulted:

(v) The principle of bond change compensation: The number of breaking (making)
bonds, which are not compensated by making (breaking) bonds, should not exceed one.
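
Principles (iv) and (v) translate into two simple tests on a labelled topology identifier.
The Python sketch below is an illustration only: check_heuristics is a hypothetical
helper, G_top is given as a list of (u, v, label) triples, and the three examples are
schematic graphs with made-up vertex names rather than literature mechanisms. The degree
test implements the detection criterion stated above (no vertex of degree greater than two).

```python
from collections import Counter


def check_heuristics(g_top):
    """Test a labelled G_top against principles (iv) and (v).

    g_top is a list of (u, v, label) with label 'make' or 'break'.
    """
    degree = Counter()
    for u, v, _ in g_top:
        degree[u] += 1
        degree[v] += 1
    made = sum(1 for *_, label in g_top if label == "make")
    broken = sum(1 for *_, label in g_top if label == "break")
    dq = made - broken
    return {
        "max_degree": max(degree.values()),
        "simple_topology": max(degree.values()) <= 2,  # principle (iv)
        "delta_q": dq,
        "compensated": abs(dq) <= 1,                   # principle (v)
    }


# Schematic cyclic three-center step: one bond broken, two bonds made.
three_center = [("X", "Y", "break"), ("M", "X", "make"), ("M", "Y", "make")]

# Schematic center that breaks two bonds and makes two (degree four in G_top).
degree_four = [("O", "a", "break"), ("O", "b", "break"),
               ("O", "c", "make"), ("O", "d", "make")]

# Schematic circuit in which three bonds are made and none is broken.
uncompensated = [("a", "b", "make"), ("b", "c", "make"), ("c", "a", "make")]

for name, g in [("three_center", three_center),
                ("degree_four", degree_four),
                ("uncompensated", uncompensated)]:
    print(name, check_heuristics(g))
```

The first graph passes both tests, the second fails the degree test although its bond
changes are balanced, and the third has a simple topology but violates bond change
compensation.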

As can be seen, our new rules are heuristic. Their mathematical or quantum chemical
justification is an open question and a challenge for theoretical chemists. We would also
like to mention that the lack of a rigorous mathematical definition does not prevent the
use of the concept for elementary reactions. The latter is a basis for specifying the
notions of reaction route and reaction mechanism which are the main topic of interest in
the next sections of this chapter.

3. Classification and Coding of Linear Reaction Mechanisms By Using Kinetic Graphs

3.1. BACKGROUND

The mechanism of any complex chemical reaction is a set of reagents, products, and
intermediates that incorporates ordered subsets of species related to each other by
chemical reactions (mechanism steps). It is then not surprising that chemists like to depict
reaction mechanisms in the form of diagrams and graphs. Christiansen63 was perhaps the
first to use graphs to demonstrate the difference in the mechanisms of reactions involving
open, cyclic, and mixed sequences of steps. In Christiansen's graphs edges represent the
mechanism steps, whereas vertices stand for the reactants, intermediates, and products.

Graph theory is a universal formalism that produces mathematical models of systems
whose elements are in binary relations. This predetermined its wide application to
chemical kinetics and catalysis and its first use as a tool for deriving steady-state rate
laws64-66 of linear mechanisms. These are mechanisms whose steps incorporate one
reaction intermediate on both the left- and right-hand sides.

Graph-theoretical studies of multiroute reactions with linear mechanisms67,68 have shown
that, proceeding from graph theory, one can build a mechanism classification system
(classification of the mechanism topological structure),69 and enumerate and code all
classes of mechanisms with any number of reaction routes.70,71 Solving these problems
was essential for the development of computer-assisted methods for the generation of
mechanistic hypotheses and for their discrimination.28 These problems will be discussed
below in more detail.

3.2. DEPICTING LINEAR MECHANISMS


3.2.1. Kinetic Graphs

A convenient version of cyclic graphs has been proposed by M. I. Temkin72,73 to depict
any kind of linear mechanism. These graphs, which we shall call kinetic graphs (KGs),
incorporate only intermediates as vertices. Their interconversions are the edges. Consider
as an example the catalytic reaction

A + C → B + P (3.1)

with the following mechanism:

A + Z1 ⇌ Z2 + B (3.2)
Z2 + C ⇌ Z3 (3.3)
Z3 → Z1 + P (3.4)

where Z1 is a catalyst. The mechanism is depicted by the KG in Fig. 3.1, in which the
undirected edges 1 and 2 represent reversible reaction steps, while the directed edge (arc)
3 represents an irreversible step.
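
The construction of a kinetic graph from a linear mechanism can be sketched as follows.
This is a minimal illustration with hypothetical names (steps, kinetic_graph,
overall_equation); it assumes the mechanism (3.2)-(3.4) as written above, with steps 1
and 2 reversible and step 3 irreversible, and it relies on the defining property of linear
mechanisms that every step has exactly one intermediate on each side. Summing the steps
cancels the intermediates and leaves the stoichiometric equation of the overall reaction.

```python
from collections import Counter

# Each step: (left-hand side, right-hand side, reversible?)
steps = [
    ({"A": 1, "Z1": 1}, {"Z2": 1, "B": 1}, True),   # step 1, eq. (3.2)
    ({"Z2": 1, "C": 1}, {"Z3": 1}, True),           # step 2, eq. (3.3)
    ({"Z3": 1}, {"Z1": 1, "P": 1}, False),          # step 3, eq. (3.4)
]
intermediates = {"Z1", "Z2", "Z3"}


def kinetic_graph(steps, intermediates):
    """One edge per step, joining the intermediates on its two sides."""
    edges = []
    for number, (lhs, rhs, reversible) in enumerate(steps, start=1):
        # Unpacking fails unless there is exactly one intermediate per side,
        # which is the condition for a linear mechanism.
        (src,) = [s for s in lhs if s in intermediates]
        (dst,) = [s for s in rhs if s in intermediates]
        edges.append((number, src, dst, "edge" if reversible else "arc"))
    return edges


def overall_equation(steps):
    """Sum the steps; negative coefficients are reagents, positive products."""
    net = Counter()
    for lhs, rhs, _ in steps:
        for species, n in lhs.items():
            net[species] -= n
        for species, n in rhs.items():
            net[species] += n
    return {species: n for species, n in net.items() if n != 0}


print(kinetic_graph(steps, intermediates))
print(overall_equation(steps))
```

The resulting graph is the three-membered cycle Z1-Z2-Z3 with one arc, and the summed
equation contains only A, C, B, and P.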

Figure 3.1. The kinetic graph used to depict the catalytic reaction (3.1) whose
mechanism is depicted by eqs. (3.2-3.4).

3.2.2. Depicting Zero Intermediates


To depict steps that do not contain intermediates, Temkin72 suggested the use of the
so-called zero intermediate. This is a hypothetical intermediate X0 whose concentration
equals unity. The zero reactant is placed in an empty vertex. Thus, the mechanism of the
noncatalytic reaction

A + C → B + P (3.5)

(qualified according to Christiansen63 as an open sequence of reaction steps) is described
by eqs. 3.6-3.8 and depicted by the KG in Fig. 3.2a.
(3.6)
(3.7)

(3.8)

[Figure 3.2: panels (a) and (b); graph label KG2]

Fig. 3.2. (a) The kinetic graph used to depict a noncatalytic reaction (eq. 3.5) with a
hypothetical intermediate. The reaction mechanism is described by eqs. (3.6-3.8). The
empty graph vertex represents the placement of the zero reactant. (b) A more detailed
version of kinetic graph 3.2a showing that steps 1 and 2 are reversible by representing
each nondirected edge as two arcs.
254 O. N. TEMKIN ET AL.

3.2.3. Depicting Pendant Vertices


In the mechanisms of some reactions, there are steps that contribute neither to the
conversion of reagent to reaction products nor to the resulting stoichiometric equation.
They only reflect the binding of some intermediates or catalysts by reagents or reaction
products. Such elementary reactions are depicted in the figures by pendant vertices
(vertices of degree one).

[Figure 3.3: kinetic graph KG3]

Fig. 3.3. Kinetic graph that includes the two pendant vertices that were added to the
kinetic graph in Fig. 3.1 to depict the production of two nonactive intermediates.

As an example, consider in Fig. 3.3 a KG constructed by adding two additional steps
to the catalytic reaction represented by Fig. 3.1:

    Z1 + C ⇌ Z1C                 (3.9)
    Z2 + P ⇌ Z2P                 (3.10)

Thus, nonactive intermediates are produced. They contribute to the catalyst mass balance
only:

    Zt = Z1 + Z2 + Z3 + Z1C + Z2P        (3.11)

3.2.4. Depicting Multiroute Mechanisms


In summing the steps of a mechanism, the intermediates mutually cancel, and one gets
the reaction stoichiometric equation. Thus, by summing steps 3.2-3.4, depicted in Fig.
3.1, the stoichiometric equation 3.1 results and, similarly, by summing steps 3.6-3.8,
depicted in Fig. 3.2, the stoichiometric equation 3.5 results. In dealing with mechanisms
representing a set of consecutive-parallel steps, stoichiometric equations are obtained
from the respective mechanisms by means of the reaction route method, developed by
Horiuti74 and Temkin.72,73 In the case of linear mechanisms, a reaction route is a
subset of steps (selected from the total set of steps) whose summation cancels all
intermediates. Routes can also be stoichiometrically empty (0 = 0).
In the general case of an arbitrary complex mechanism, in order to mutually cancel all
intermediates when the steps are summed, each step must be multiplied by a
stoichiometric number. Each complete set of such numbers is a reaction route. The
number of such resulting stoichiometric equations is infinite, owing to the infinite number
of sets of stoichiometric numbers. However, it suffices for the reaction kinetics
description to obtain a set of linearly independent routes (vectors) and respective overall
equations (see section 4 for more detailed description of the route method). Here, we
shall mention that the multiroute linear mechanisms are depicted by polycyclic KGs,
whose number of linearly independent cycles equals the number of linearly independent
routes. Examples of such graphs are given below in section 3.3.
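
The route concept can be illustrated with a minimal sketch, assuming the mechanism of eqs. 3.2-3.4 written in the space of intermediates (rows are steps, columns are Z1, Z2, Z3). The function name `left_null_space` and the use of exact fractions are choices made for this example only; they are not taken from the original work. The code searches for sets of stoichiometric numbers whose weighted sum of steps cancels every intermediate.

```python
from fractions import Fraction

def left_null_space(rows):
    """Return a basis of vectors nu with sum_s nu[s] * rows[s] == 0.

    rows[s][j] is the net coefficient of intermediate j in step s
    (negative if consumed, positive if formed).
    """
    n_steps, n_int = len(rows), len(rows[0])
    # Append an identity block that records which combination of steps
    # produced each current row.
    aug = [[Fraction(x) for x in rows[s]] +
           [Fraction(int(s == t)) for t in range(n_steps)]
           for s in range(n_steps)]
    pivot_row = 0
    for col in range(n_int):
        pivot = next((r for r in range(pivot_row, n_steps)
                      if aug[r][col] != 0), None)
        if pivot is None:
            continue
        aug[pivot_row], aug[pivot] = aug[pivot], aug[pivot_row]
        for r in range(n_steps):
            if r != pivot_row and aug[r][col] != 0:
                factor = aug[r][col] / aug[pivot_row][col]
                aug[r] = [a - factor * b
                          for a, b in zip(aug[r], aug[pivot_row])]
        pivot_row += 1
    # Rows whose intermediate part vanished are combinations of steps that
    # cancel all intermediates, i.e. reaction routes.
    return [row[n_int:] for row in aug[pivot_row:]]

# Steps (3.2)-(3.4); columns are Z1, Z2, Z3.
steps = [[-1, 1, 0],
         [0, -1, 1],
         [1, 0, -1]]
routes = left_null_space(steps)
print([[int(x) for x in route] for route in routes])  # [[1, 1, 1]]
```

The single basis vector (1, 1, 1) corresponds to summing the three steps once each, which is the one route of this one-cycle KG.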

The manner in which the routes (cycles) are topologically connected can be used as a
basis for mechanism classification.

3.3. CLASSIFICATION AND CODING OF LINEAR MECHANISMS BASED ON
THE KINETIC GRAPH TOPOLOGY

Proceeding from the one-to-one correspondence between linear mechanisms and KGs,
we proposed a hierarchical classification of these reaction mechanisms in an earlier
paper. 7 However, the studies on the enumeration of the linear mechanisms and their
computer storage and retrieval indicated the need for some changes in both the
classification and coding systems. The resulting hierarchical set of classification criteria
is as follows: 71

(i) Number of linearly independent reaction routes (KG cycles), M=1,2,3, ...

(ii) Number of intermediates (KG vertices), N=2,3,4, ...

(iii) Types of interconnection of a pair of KG cycles (classes of two-route
mechanisms; see Fig. 3.4)
    Class A - bridging of cycles
    Class B - cycles sharing a common vertex
    Class C - cycles sharing a common edge
    Class Z - disjoint cycles (linkage via other cycles)
    Prefix n - number of KG vertices with degree ≥ 3

(iv) Subclasses of mechanism (number of elements connecting a pair of KG cycles):
    Subclasses A, A2, A3, ... (the length of a bridge, I)
    Subclasses C, C2, C3, ... (the number of common edges, K)
    Subclasses Z0, Z1, Z2, ... (the number of edges V separating a pair of cycles
    lacking connections of type A, B, or C. The case V = 0 corresponds to a KG
    in which the two disjoint cycles are actually connected by a bridge, one of the
    vertices of which belongs to a third cycle; see Fig. 3.5)

(v) Number of vertices in each cycle, N_i

[Figure 3.4: kinetic-graph panels labeled A, B, C, and Z = Z2]

Fig. 3.4. The four basic classes of linear mechanisms. Class Z refers to the nonadjacent
pair of cycles 1 and 3. Replacing any loop by a cycle of arbitrary size preserves the
class.

[Figure 3.5: kinetic-graph panels labeled A1 = A, A2, A3; C1 = C, C2, C3; Z1, Z2]

Fig. 3.5. Illustration of different subclasses of linear mechanisms. Class Z refers to the
nonadjacent pair of cycles 1 and 3.

The linear code that results from the above classification criteria is:

(3.12)

It describes mechanisms incorporating reversible steps only; they are depicted by simple
(nondirected) kinetic graphs. The class notation in the linear code is abbreviated; it stands
for the generalized classes and contains superscripts that show the number of times this
particular type of cycle linkage occurs. Instead, one can use specific class notation,
which is not shortened, and list all pairwise cycle linkages (A, B, C, or Z) following their
canonical numbering (for examples, see Table 3.1, vide infra). Kinetic face graphs
(KFGs) are used to facilitate the canonical numbering of KG cycles, vertices, and
edges.70 Each vertex in the KFG represents a cycle (a face) in the initial KG, while a
KFG edge represents a KG cycle linkage of type A, B, or C. The lack of an edge
between two KFG vertices means no A, B, or C type of linkage for the respective pair
of cycles in the KG; such pairs of cycles are assigned to class Z.

The modifications of our previously adopted classification and coding systems include
the type of reaction mechanism, which was previously denoted in the code by the serial
number introduced for each KFG. The computer elucidation of the linear mechanisms,
however, would require that standard tables of mechanisms be stored with the serial
numbers of all KFGs, whose number increases rapidly for more complex reactions. The
retrieval of the mechanisms coded is facilitated by the use of the new class Z introduced
in the foregoing, and the general prefix n, which is equal to the number of vertices in
the smallest homeomorphic image of all KGs of the class under consideration. The new
code does not contain any symbol for the mechanism type. Yet, preserving the
mechanism type makes sense from the viewpoint of classification. Types of KGs with
increased complexity may be denoted by L = 1, 2, 3, 4, ... , an integer indicating the
total number of pairwise cycle linkages of type A, B, or C in the KG (See Table 3.1,
vide infra). The upper limit of the L value is the number of edges in the complete KFG
(L = M(M-1)/2).

An example of the use of KGs and their coding is given below with the catalytic reaction
of methanol synthesis. One of the mechanisms proposed75 incorporates two reaction
routes with a total of five reaction steps and four intermediates. Hence, it is represented
by a KG containing two cycles, four vertices, and five edges (Fig. 3.6). The mechanism
code includes the class prefix n=2 (two vertices of degree higher than 2).

[Figure 3.6: kinetic graph KG with its linear code. Code: 2-4-2-C-2,4]

Fig. 3.6. The kinetic graph and linear code of the reaction, described by the
stoichiometric equations 3.18 and 3.19, and by mechanistic steps 3.13-3.17.

    (1)  Z·H2O + CO2 ⇌ Z·H2O·CO2              (3.13)
    (2)  Z·H2O·CO2 ⇌ Z·CO2 + H2O              (3.14)
    (3)  Z·CO2 + H2 ⇌ Z·CO2·H2                (3.15)
    (4)  Z·CO2·H2 + 2H2 ⇌ Z·H2O + CH3OH       (3.16)
    (5)  Z·CO2·H2 ⇌ Z·H2O + CO                (3.17)
    --------------------------------------------------
         CO2 + 3H2 ⇌ CH3OH + H2O              (3.18)
         CO + 2H2 ⇌ CH3OH                     (3.19)
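
The leading entries of the code 2-4-2-C-2,4 can be checked with a short sketch. It assumes that the first three entries are M (number of independent cycles), N (number of vertices), and the prefix n (number of vertices of degree three or higher), as listed in criteria (i)-(iii). The function `kg_invariants` and the short vertex names are hypothetical labels introduced for this illustration; the class letter and the cycle sizes are not computed here.

```python
from collections import Counter

def kg_invariants(edges):
    """Return (M, N, n) for a connected kinetic multigraph.

    edges is a list of (u, v) pairs; parallel edges are listed repeatedly.
    """
    vertices = {v for edge in edges for v in edge}
    degree = Counter(v for edge in edges for v in edge)
    M = len(edges) - len(vertices) + 1   # cyclomatic number, one component assumed
    N = len(vertices)
    n = sum(1 for v in vertices if degree[v] >= 3)
    return M, N, n

# Steps (3.13)-(3.17) as edges between intermediates.
methanol_kg = [
    ("ZH2O", "ZH2OCO2"),    # step 1
    ("ZH2OCO2", "ZCO2"),    # step 2
    ("ZCO2", "ZCO2H2"),     # step 3
    ("ZCO2H2", "ZH2O"),     # step 4
    ("ZCO2H2", "ZH2O"),     # step 5, parallel to step 4
]
print(kg_invariants(methanol_kg))  # (2, 4, 2)
```

Steps 4 and 5 connect the same two intermediates, so they form the two-vertex cycle, while steps 1-4 form the four-vertex cycle; the two cycles share edge 4, which is consistent with class C.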

For digraphs, which refer to mechanisms containing irreversible steps, the code is
supplemented by all edge types, E_j, listed according to their canonical numbering. There
are three possibilities for a step direction; it is either forward, reverse, or both. These
three types of mechanism steps are denoted by i, ī, and e, respectively. To make the
code unique, the priority order i < ī < e and the minimum code criterion are used.

In the case of KGs with pendant vertices (Section 3.2.3), the code also incorporates the
total number of such vertices, N_p, and, in increasing order, the numbers n_i of the base
vertices to which the pendant vertices P are connected. Hence, the code of a linear
mechanism containing irreversible steps and pendant vertices is

(3.20)

As an illustration of the code thus extended, we present the codes of the mechanism
with one irreversible step depicted by Fig. 3.1, 1-3-0-3 - e, e, i, and of the mechanism
with one irreversible step and two pendant vertices depicted by Fig. 3.3,
1-3-0-3 - e, e, i - 2: 1, 2. Another
illustration is Table 3.1, which contains the codes of all 111 classes of linear mechanisms
with four routes. Detailed tables with all linear mechanisms involving 1-4 routes and 2-6
intermediates are given elsewhere.67,70,71

The mechanism code described above can be converted into a convenient complexity
index by making use of the spanning trees of the KGs and some of their subgraphs (a
spanning tree is an acyclic subgraph that contains all the vertices of the initial graph). 6
The mechanism complexity thus evaluated parallels the mechanism hierarchical ordering
in types, classes, and subclasses. Topological patterns that increase or preserve
complexity were analyzed and presented in complexW flowcharts of potential use in the
computerized elucidation of reaction mechanisms.71
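
The text does not spell out the complexity index itself, so the sketch below only counts the spanning trees of a KG, which is the ingredient the index is said to use. It applies Kirchhoff's matrix-tree theorem (the determinant of the graph Laplacian with one row and column removed). The function name `count_spanning_trees` and the vertex labels are assumptions made for this example.

```python
from fractions import Fraction

def count_spanning_trees(vertices, edges):
    """Count spanning trees of a connected multigraph (matrix-tree theorem)."""
    index = {v: i for i, v in enumerate(vertices)}
    size = len(vertices)
    lap = [[Fraction(0)] * size for _ in range(size)]
    for u, v in edges:
        if u == v:
            continue  # loops never belong to a spanning tree
        i, j = index[u], index[v]
        lap[i][i] += 1
        lap[j][j] += 1
        lap[i][j] -= 1
        lap[j][i] -= 1
    # Delete the last row and column, then take the determinant by
    # exact Gaussian elimination.
    minor = [row[:-1] for row in lap[:-1]]
    det = Fraction(1)
    for col in range(len(minor)):
        pivot = next((r for r in range(col, len(minor))
                      if minor[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            minor[col], minor[pivot] = minor[pivot], minor[col]
            det = -det
        det *= minor[col][col]
        for r in range(col + 1, len(minor)):
            factor = minor[r][col] / minor[col][col]
            minor[r] = [a - factor * b for a, b in zip(minor[r], minor[col])]
    return int(det)

# Methanol-synthesis KG of Fig. 3.6: a four-cycle with one doubled edge.
vertices = ["ZH2O", "ZH2OCO2", "ZCO2", "ZCO2H2"]
edges = [("ZH2O", "ZH2OCO2"), ("ZH2OCO2", "ZCO2"), ("ZCO2", "ZCO2H2"),
         ("ZCO2H2", "ZH2O"), ("ZCO2H2", "ZH2O")]
print(count_spanning_trees(vertices, edges))  # 7
```

The value 7 can be confirmed by hand: dropping one of the three single edges leaves two choices for the doubled edge (6 trees), and dropping both parallel edges gives one more.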

Table 3.1. All Classes of Linear Mechanisms with Four Reaction Routes

 No. Code           No. Code           No. Code            No. Code
     L=3            29  4-B^2Z^2CZ     57  4-A^2C^2Z^2     85  4-B^2CZC^2
 1   4-A^3Z^3       30  4-B^2CZ^3      58  5-ABCZ^2C       86  4-BC^3ZB
 2   5-A^3Z^3       31  4-BCZ^3B       59  5-AC^2Z^2B      87  5-BC^3ZC
 3   6-A^3Z^3       32  5-BCZ^3C       60  4-AC^2Z^2C      88  6-C^5Z
 4   4-A^2Z^2AZ     33  5-BCZ^2CZ      61  6-AC^2Z^2C          L=6
 5   5-A^2Z^2AZ     34  5-BC^2Z^3      62  2-B^4Z^2        89  5-A^6
 6   6-A^2Z^2AZ     35  6-C^2Z^2CZ     63  3-B^3CZ^2       90  6-A^6
 7   4-A^2BZ^3      36  6-C^3Z^3       64  3-B^2CBZ^2      91  4-A^5B
 8   5-A^2BZ^3          L=4            65  3-B^2CZBZ       92  5-A^5C
 9   4-A^2Z^2BZ     37  5-A^4Z^2       66  4-B^2Z^2C^2     93  2-A^2B^2C^2
 10  5-A^2Z^2BZ     38  6-A^4Z^2       67  4-B^2CZCZ       94  2-A^3B^3
 11  5-ABZ^3A       39  3-A^3BZ^2      68  4-B^2C^2Z^2     95  3-A^2BCA^2
 12  4-AB^2Z^3      40  4-A^3BZ^2      69  4-BC^2Z^2B      96  4-A^2C^2A^2
 13  4-ABZ^2B       41  5-A^2BAZ^2     70  5-BCZCZC        97  3-A^3B^2C
 14  4-ABZ^3B       42  4-A^2BZAZ      71  3-BC^2Z^2C      98  4-A^3BC^2
 15  5-ABZ^3C       43  3-A^2B^2Z^2    72  5-BC^2Z^2C      99  3-A^3C^3
 16  5-ABZ^2CZ      44  3-A^2ZB^2Z     73  5-BC^3Z^2       100 1-B^6
 17  5-ABCZ^3       45  4-A^3CZ^2      74  4-C^4Z^2        101 2-B^5C
 18  5-ACZ^3B       46  5-A^3CZ^2      75  6-C^4Z^2        102 3-B^4C^2
 19  5-A^2Z^2CZ     47  6-A^2CAZ^2         L=5             103 3-B^2C^2B^2
 20  6-A^2Z^2CZ     48  5-A^2CZAZ      76  3-A^2BZA^2      104 3-B^3C^3
 21  5-A^2CZ^3      49  3-AB^2Z^2B     77  4-A^2CZA^2      105 3-B^2CBC^2
 22  6-A^2CZ^3      50  3-A^2BCZ^2     78  4-A^2CZ^2A^2    106 4-B^2C^2BC
 23  6-ACZ^3A       51  4-A^2ZBCZ      79  3-A^2ZCB^2      107 3-B^2C^4
 24  6-ACZ^3C       52  4-A^2ZCBZ      80  4-A^2ZCBC       108 4-B^2C^4
 25  6-ACZ^2CZ      53  4-A^2CBZ^2     81  5-A^2ZC^3       109 5-BC^4B
 26  6-AC^2Z^3      54  4-AB^2Z^2C     82  2-B^2CZB^2      110 2-C^6
 27  3-B^2Z^2BZ     55  4-ABCZ^2B      83  3-B^2CZBC       111 6-C^6
 28  3-B^3Z^3       56  5-A^2ZC^2Z     84  4-B^2ZC^3

3.4. ENUMERATION OF LINEAR MECHANISMS AND THEIR CLASSES

We recently performed the first large-scale enumeration of the theoretically possible
mechanisms of chemical reactions. The linear mechanisms were enumerated by an
original program, KING (KINetic Graphs), which generates exhaustively all
nonredundant KGs for a given number of cycles and vertices.71 The combinatorial
algorithm used for KG enumeration is similar to that used in the GENESIS program,78
and it employs an approach to graph enumeration developed by Faradzhev et al. 9

All mechanisms having up to six reaction routes and up to 12 vertices were enumerated,
except for M = 6 with N = 11 and N = 12, for which the computational time was
unreasonably high (Table 3.2). The number of classes was also enumerated (Table 3.3).
We found that, at a constant number of reaction routes and an increasing number of
intermediates, the number of classes passes through a maximum and follows a nearly
normal distribution. Both tables give evidence for the potential existence of a
tremendously large variety of topologically distinct linear mechanisms. This result is in
sharp contrast to some estimates based on mechanistic chemical, but not topological,
information. 0,81 Evidently, in order to be complete, any mechanism enumeration should
take into account all possible interrelations of reactants, elementary steps, and reaction
routes. Besides the incompleteness of the purely chemical approach, such comparisons
may also indicate that some mechanisms that are topologically allowed might be
chemically forbidden. The elucidation of this important question needs further study.

Table 3.2. Total Number of Kinetic Graphs for M = 2-6 and N = 2-12

M\N   2   3   4    5     6      7      8       9       10       11       12
2     1   2   4    7     10     14     19      24      30       37       44
3     1   3   12   27    65     129    245     422     710      1113     1710
4     1   5   23   85    276    764    1935    4466    9583     19291    36859
5     1   6   43   210   924    3403   11242   33156   89789    224621   526346
6     1   8   72   469   2652   12644  52727   194909  651008   CE       CE

CE - combinatorial explosion

Table 3.3. Total Number of Classes for M = 2-6 and N = 2-12*

M\N   2   3   4    5     6      7      8       9       10       11       12
2     1   1   1    0     0      0      0       0       0        0        0
3     1   2   6    3     2      1      0       0       0        0        0
4     1   4   14   24    33     19     11      4       1        0        0
5     1   5   30   85    192    249    250     153     77       26       7
6     1   7   55   239   798    1746   2800    3082    2576     CE       CE

*N for a class includes vertices with degree > 2, as well as all loops.

It should be mentioned that the enumeration reported in Table 3.2 is also incomplete. It
refers to mechanisms containing only reversible steps. Indeed, a specified number of
mechanisms with irreversible elementary reactions can be deduced for each of the
mechanisms counted in Table 3.2. Graph-theoretically, this is the problem of counting
the digraphs (graphs containing both arcs and edges, also called "mixed graphs") that
correspond to a certain nondirected graph, i.e., the edge coloring problem. However,
this problem is complicated by the fact that some digraphs do not correspond to any
mechanism. Another extension of the enumeration procedure may handle mechanisms
with reaction intermediates that are involved in equilibrium steps only. In terms of graph
theory, this problem can be reformulated as counting the number of graphs with pendant
vertices that correspond to each of the digraphs of interest. Finally, after the exhaustive
topological enumeration described above, one could search for procedures that would
produce an even larger number of theoretically possible mechanisms by accounting for
their chemical specificity. Different classes of chemical reactions or reactants may be
incorporated into our enumeration scheme by regarding graphs with weighted edges
and/or vertices. The results obtained by all these calculations will be a subject of a future
publication.82

4. Application of Bipartite Graphs and Stoichiometric Matrices to the Description
of Linear and Nonlinear Reaction Mechanisms

4.1. BACKGROUND

The major difficulty in the graph-theoretical description of reaction mechanisms is the
representation of nonlinear elementary steps. Kinetic graphs are no longer useful, since
there is no binary relation (one educt - one product) for reaction intermediates. Several
attempts were made to depict nonlinear mechanisms by making use of different
techniques. Temkin introduced additional or "secondary" edges in KG;73 however, this
method proved to produce ambiguity in the mechanism description. Horn83 and
Williamowski84 used graphs with vertices corresponding to complexes of reactants.
Vol'pert85 proposed a more universal technique, namely, directed bipartite graphs. (The
same ideas have been independently developed by Balandin86 and Clarke. ") These are
graphs whose vertices can be partitioned into two subsets such that no two vertices in the
same subset are adjacent. These graphs were discussed by Yablonskii and co-workers.66,88

In this section we develop another version of the latter approach in which we deal with
the set of intermediates instead of the complete set of reactants. Thus, Vol'pert graphs
are used in the space of intermediates only. Removal of reactants and products enables
the unified handling of both linear and nonlinear mechanisms.67 An additional advantage
of our approach is that the space of intermediates alone can indicate the mechanistic
topology. In dealing with the latter, we take into account the mechanism's basic structure
and omit the superfluous chemical details, which make the procedure concrete.

4.2. SOME DEFINITIONS

4.2.1. Bipartite Graphs

A bipartite graph (BG), corresponding to a reaction mechanism, contains a set of vertices
V = {v_1, v_2, ..., v_n}, |V| = n, divided into two subsets, V = V_X ∪ V_R. The first one
matches the set of intermediates, |V_X| = n_X; the other subset corresponds to the set of
elementary reactions, |V_R| = n_R, so that n_X + n_R = n. No vertices in the same subset
are adjacent. An intermediate vertex v_i is adjacent to a reaction vertex v_j if
intermediate i is an educt in reaction j. An intermediate vertex v_k is adjacent from a
reaction vertex v_j if intermediate k is a product in reaction j. The bipartite graph
is thus a directed graph.

[Figure 4.1: elementary reaction types with their bipartite graphs; surviving panel labels: (1) Xi → Xk, (5) Xi → 2Xk]

Fig. 4.1. The complete set of elementary reaction types and their bipartite graphs. Open
circles indicate intermediates; solid circles indicate elementary reactions.

In section 2, we have already stated that most elementary reactions are either
monomolecular or bimolecular, trimolecular reactions being rare exceptions. We may
therefore assume that no more than two intermediates (the same or distinct ones) could
react with each other or could result from an elementary reaction. Hence, the complete
list of all possible types of elementary reactions can be obtained. The latter are shown
in Fig. 4.1 with the bipartite graphs that match them.
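
The counting argument above can be sketched in a few lines, assuming that each side of an elementary reaction carries one intermediate, two identical intermediates, or two distinct intermediates. The names `SIDE_PATTERNS` and `elementary_reaction_types` are hypothetical, and the order of the generated list is not claimed to match the numbering used in Fig. 4.1.

```python
from itertools import product

# Possible sides of an elementary reaction in the space of intermediates
# when at most two intermediates (identical or distinct) appear per side.
SIDE_PATTERNS = {
    "educts": ["Xi", "2Xi", "Xi + Xj"],
    "products": ["Xk", "2Xk", "Xk + Xl"],
}

def elementary_reaction_types():
    """List every educt/product pattern combination as a reaction string."""
    return [f"{left} -> {right}"
            for left, right in product(SIDE_PATTERNS["educts"],
                                       SIDE_PATTERNS["products"])]

types = elementary_reaction_types()
print(len(types))  # 9
for reaction in types:
    print(reaction)
```

Three patterns on each side give 3 x 3 = 9 combinations, which agrees with the nine reaction types referred to later in Section 4.3.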

Consider as an illustration the simplified Benson mechanism89 of ethane pyrolysis
(C2H6 = C2H4 + H2) shown by eqs. 4.1-4.6 and depicted in Fig. 4.2.

[Figure 4.2: bipartite graph]

Fig. 4.2. The bipartite graph depicting the simplified Benson mechanism of ethane
pyrolysis,89 presented by eqs. 4.1-4.6.

    C2H6 → 2CH3·                     (4.1)
    CH3· + C2H6 → CH4 + C2H5·        (4.2)
    C2H5· ⇌ C2H4 + H·                (4.3)
    H· + C2H6 ⇌ C2H5· + H2           (4.4)
    2C2H5· → C2H4 + C2H6             (4.5)
    2C2H5· → C4H10                   (4.6)


It should be noted that Fig. 4.2 shows elementary reactions (with definite directions)
rather than elementary steps. As seen in Fig. 4.2, double edges appear in the
BG for any reaction that includes the same two products or educts. However, loops are
prohibited. One may therefore summarize that the BG of a reaction mechanism is a
directed graph with multiple edges and no loops.
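
A minimal sketch of such a directed bipartite multigraph follows. It assumes that only the radical intermediates CH3, C2H5, and H are kept as vertices, that eqs. 4.1-4.6 are read in their forward directions only, and that zero intermediates are omitted; the function `bipartite_arcs` and the dictionary layout are illustrative choices, not the authors' data structure.

```python
from collections import Counter

def bipartite_arcs(reactions):
    """Return a Counter mapping directed arcs to their multiplicities.

    reactions maps a reaction label to (educts, products), each a dict
    {intermediate: stoichiometric coefficient}. An arc (x, r) runs from an
    intermediate to a reaction, an arc (r, x) from a reaction to an
    intermediate, so no arc ever joins two vertices of the same kind.
    """
    arcs = Counter()
    for label, (educts, products) in reactions.items():
        for species, coeff in educts.items():
            arcs[(species, label)] += coeff
        for species, coeff in products.items():
            arcs[(label, species)] += coeff
    return arcs

# Forward directions of eqs. 4.1-4.6, intermediates only.
ethane = {
    "4.1": ({}, {"CH3": 2}),
    "4.2": ({"CH3": 1}, {"C2H5": 1}),
    "4.3": ({"C2H5": 1}, {"H": 1}),
    "4.4": ({"H": 1}, {"C2H5": 1}),
    "4.5": ({"C2H5": 2}, {}),
    "4.6": ({"C2H5": 2}, {}),
}
arcs = bipartite_arcs(ethane)
double_arcs = sorted(arc for arc, mult in arcs.items() if mult == 2)
print(double_arcs)
# [('4.1', 'CH3'), ('C2H5', '4.5'), ('C2H5', '4.6')]
```

The double arcs arise exactly for the reactions with two identical products (4.1) or two identical educts (4.5 and 4.6), in line with the remark about double edges.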

4.2.2. Simple Routes


A reaction mechanism can induce an infinite set of reaction routes. However, this set can
be limited by using linearly independent routes.66,74,90 The number M of the latter can be
obtained by the Horiuti rule74,90 from the stoichiometric matrix B_X and the number s of
elementary steps:

    M = s - rank B_X                              (4.7)

It was shown earlier for linear reaction mechanisms72 that some of the routes represent
cycles in Temkin's kinetic graphs. Hence, the number of independent routes p can be
obtained from the cyclomatic number of the graph by the equation

    p1(G) = |E(G)| - |V(G)| + p0(G)               (4.8)

Here, E(G) and V(G) are the graph edge and vertex sets, respectively; |E(G)| and
|V(G)| are the total numbers of edges and vertices, respectively; p0(G) is the number of
components of G, whereas p1(G) is the cyclomatic number of G. 9,91 Proceeding from
eqs. (4.7) and (4.8), and taking M = p = p1(G) and s = |E(G)|, one obtains for linear
mechanisms (described by connected KGs for which p0(G) = 1)

    rank B_X = |V(G)| - 1                         (4.9)

Keeping in mind that |V(G)| is the number of intermediates, one arrives at the chemical
interpretation of this fact: There is one material balance equation for intermediates, which
makes the number of linearly independent intermediates one less than their total number
|V(G)|. Eq. (4.9) can also be obtained from the theorem for the rank of the incidence
matrix B,59 after recognizing the B_X matrix as an incidence matrix.
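
Eqs. (4.7)-(4.9) can be checked numerically on a small linear mechanism. The sketch below assumes the two-route methanol-synthesis KG of Section 3.3 (four intermediates, five steps) and builds its stoichiometric matrix in the space of intermediates as an oriented incidence matrix. The helper names `rank` and `incidence_matrix` and the orientation chosen for each step are assumptions of this example.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a small matrix by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows))
                      if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def incidence_matrix(vertices, edges):
    """One row per step: -1 at the consumed intermediate, +1 at the formed one."""
    index = {v: j for j, v in enumerate(vertices)}
    matrix = []
    for tail, head in edges:
        row = [0] * len(vertices)
        row[index[tail]] -= 1
        row[index[head]] += 1
        matrix.append(row)
    return matrix

vertices = ["ZH2O", "ZH2OCO2", "ZCO2", "ZCO2H2"]
edges = [("ZH2O", "ZH2OCO2"), ("ZH2OCO2", "ZCO2"), ("ZCO2", "ZCO2H2"),
         ("ZCO2H2", "ZH2O"), ("ZCO2H2", "ZH2O")]
b_x = incidence_matrix(vertices, edges)
s = len(edges)
routes_horiuti = s - rank(b_x)               # eq. (4.7)
routes_cyclomatic = s - len(vertices) + 1    # eq. (4.8) with p0(G) = 1
print(rank(b_x), routes_horiuti, routes_cyclomatic)  # 3 2 2
```

The rank equals |V(G)| - 1 = 3, as eq. (4.9) states, and both counting rules give two independent routes.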

Unfortunately, nonlinear routes induce no ordinary cycles (circuits). The pattern of
multiroute reactions is much more complicated. However, if zero intermediates are
considered, elementary reactions that form a mechanistic route correspond to a cyclic BG
subgraph without terminal vertices, whose reaction-type vertices are of the same degree
as in the overall BG.

Some of the reaction routes are of special interest, because they cannot be subdivided
into simpler routes and form a finite set. This was pointed out by Milner,92 who called
them direct paths. These were later used by Happel and Sellers,81,93 and in our recent
work.94 In this chapter, we call them simple routes and discuss them in more detail.

Consider a mechanism, represented by a KG and a stoichiometric matrix, in which
different routes can be defined but only two of which are linearly independent:

                                   P1   P2   P3   P4   P5
    (1)  X1 + A → X2                1    0    1    2    3
    (2)  X2 + B ⇌ X1 + C            1   -1    0    1    2
    (3)  X2 + D → X1 + E            0    1    1    1    1

                  X1   X2
             1    -1    1
    B_X =    2     1   -1
             3     1   -1

It is seen that routes P1, P2, and P3 are simple and that they have overall route equations:

    (P1)  A + B → C
    (P2)  C + D → E + B
    (P3)  A + D → E

Each pair of the elementary reactions (1,2), (-2,3), and (1,3), if regarded as a
mechanism, builds one route and cannot be divided further into two or more other routes.
However, only two of these three routes are independent (M = s - rank B_X = 2).

On the other hand, routes P4 and P5 are not simple because they can be divided further
into simple ones. Thus, a part of the overall mechanism (called here "a submechanism")
could correspond either to a simple route, to a complex route, or to no route at all.94
Consider now the ith submechanism related to the stoichiometric matrix B_Xi. In order to
define this submechanism class we need to find M_i according to Horiuti's rule. If M_i >
1, then the submechanism could be decomposed into simpler ones. If M_i = 1, it is
regarded as simple (further along in this text simple mechanisms will be called simple
routes). The complete set of simple routes can be enumerated and generated for any
reaction mechanism.94 Such sets are called here trivial sets of reaction routes.
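
The criterion M_i = 1 can be turned into a brute-force search, sketched below for the three-step example above. This is not the enumeration method of ref. 94; it is an illustrative subset scan that assumes a small mechanism, ignores the signs of the stoichiometric numbers (so the pair written (-2,3) in the text appears as steps 2 and 3), and uses the hypothetical helper names `rank` and `simple_route_supports`.

```python
from fractions import Fraction
from itertools import combinations

def rank(matrix):
    """Rank of a small matrix by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows))
                      if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def simple_route_supports(b_x):
    """Return the step subsets (0-based) that carry exactly one route
    in which every chosen step takes part."""
    supports = []
    for size in range(1, len(b_x) + 1):
        for subset in combinations(range(len(b_x)), size):
            sub = [b_x[s] for s in subset]
            if len(sub) - rank(sub) != 1:
                continue  # Horiuti's rule: M_i must equal 1
            # A subset that contains a smaller simple route only adds idle steps.
            if any(set(found) < set(subset) for found in supports):
                continue
            supports.append(subset)
    return supports

# Rows are steps (1)-(3); columns are X1, X2.
b_x = [[-1, 1],
       [1, -1],
       [1, -1]]
print([tuple(s + 1 for s in subset) for subset in simple_route_supports(b_x)])
# [(1, 2), (1, 3), (2, 3)]
```

The three step pairs found correspond to the simple routes P1, P3, and P2, while the full set of three steps has M = 2 and is therefore rejected as decomposable.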

4.2.3. Reduced Adjacency Matrix. The Shrinking Procedure

For undirected BGs we defined a reduced adjacency matrix (RAM), A' = ||a'_ij||, which
is an n_X × n_R matrix (n_X intermediate vertices in subset V_X, n_R reaction vertices in
subset V_R). Its entries are a'_ij = 1 if vertex v_i ∈ V_X is adjacent to vertex v_j ∈ V_R,
and a'_ij = 0 otherwise. For directed BGs the RAM, A'' = ||a''_ij||, is defined so as to
account for the adjacency direction:

    a''_ij = -1, if v_i is adjacent to v_j, and
    a''_ij = +1, if v_i is adjacent from v_j.

Keeping in mind that subsets V_X and V_R represent the intermediates and steps, respectively,
whereas the absolute value of a RAM entry equals the multiplicity of the respective
directed edge (arc), one arrives at a one-to-one correspondence between the directed
bipartite multigraph and the stoichiometric matrix B_X. Hence, an operation on graphs can
be defined that transforms the graph of a route to a V_R-vertex graph. Clearly, this
operation includes contracting of V_R-type vertices. If now a sum of two equations is taken,
this operation will correspond to the removal of two V_R-type vertices and to the addition
of one new V_R-type vertex. The procedure, which we call shrinking, is illustrated by
eqs. 4.10-4.13 and Fig. 4.3:

[Figure 4.3: bipartite graphs (a)-(c) with vertex labels X1, X2, X3, and (1-2); stage label "Stage 1"]

Fig. 4.3. Two-step contracting procedure (shrinking) for bipartite graphs: a) graph of
reactions (1) and (2) before the contracting; b) graph of the overall reaction (1-2) before
the contracting; c) graph of the overall reaction with identical intermediates cancelled.

(1) (4.10)

(2) (4.11)

    (1-2)    2x1 + x2 → 2x2 + x3 + A        (4.12)

    (1-2)'   2x1 → x2 + x3 + A              (4.13)


In Section 4.3 we will show that the shrinking procedure opens the way to the
enumeration of both linear and nonlinear mechanisms.
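
The two stages of shrinking can be mimicked with multisets, as in the sketch below. It reads eq. (4.12) as 2x1 + x2 → 2x2 + x3 + A, which is an interpretation of the printed equation. Eqs. (4.10) and (4.11) are not legible in the source, so the two reactions used here are purely hypothetical and were chosen only so that their sum reproduces eq. (4.12); the function names are likewise assumptions.

```python
from collections import Counter

def add_reactions(first, second):
    """Stage 1: merge two reactions into one overall reaction, no cancelling."""
    educts = Counter(first[0]) + Counter(second[0])
    products = Counter(first[1]) + Counter(second[1])
    return educts, products

def cancel_common(educts, products):
    """Stage 2: remove species occurring on both sides, keeping the excess."""
    common = educts & products   # multiset intersection
    return educts - common, products - common

# Hypothetical reactions standing in for the illegible eqs. (4.10) and (4.11).
reaction_1 = ({"x1": 2}, {"x2": 2})
reaction_2 = ({"x2": 1}, {"x3": 1, "A": 1})

overall = add_reactions(reaction_1, reaction_2)   # compare eq. (4.12)
reduced = cancel_common(*overall)                 # compare eq. (4.13)
print(dict(overall[0]), dict(overall[1]))
print(dict(reduced[0]), dict(reduced[1]))
```

In graph terms, stage 1 replaces the two reaction vertices by a single one that inherits all their arcs, and stage 2 lowers the multiplicity of arcs to intermediates that appear on both sides.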

4.2.4. Kinetic Graphs Versus Bipartite Graphs

Table 4.1. Comparison of kinetic graphs and bipartite graphs

Features                     KG                          BG
Graphical characteristics    directed or nondirected     directed bipartite
                             multigraphs                 multigraphs
Application                  linear mechanisms           both linear and
                                                         nonlinear mechanisms
Stoichiometric matrix        incidence matrix            reduced adjacency matrix
Property induced by the      cyclicity                   id(v_j) = od(v_j),
stoichiometric numbers                                   j = 1, ..., n
Simple route                 circuit                     BG subgraph without
                                                         terminal vertices

4.3. TOWARD A COMMON CLASSIFICATION, CODING, AND ENUMERATION
OF LINEAR AND NONLINEAR MECHANISMS

Simple routes are a convenient basis for a common classification of both linear and
nonlinear mechanisms. The face graph (FG) derived from the BG of the mechanism
plays a central role in this procedure. FG vertices represent simple routes, whereas FG
edges stand for the simple route linkages. These FGs differ from those proposed earlier
(under the name supergraphs) for KGs,67,69,70 because generally simple routes are not
identical with KG routes. Yet, the general classification of reaction mechanisms follows
most of the principles of the classification of linear mechanisms. Thus, the first
classification criterion is the number of simple routes, instead of the number of linearly
independent routes, both being expressed by the number of vertices in the respective
face graph. A mechanism type is introduced according to the total number of edges in
the FG, E = 1, 2, 3, ... Then, the different nature of the FG edges determines the
mechanism class.

Consider in more detail the possible linkages between simple routes. Nine elementary
reaction types were shown in Fig. 4.1. They may be labeled by the consecutive letters
k, l, ..., t, but omitting o. Hence, each FG edge can be coded by the term <n|k, l,
...>, where n is the number of vertices in the common subgraph. If two simple routes
have no common steps, the right part of the edge label is
empty: <n|0>. For example, the term <6|k^2pl> means that two simple routes are
linked by a common subgraph with six vertices via the elementary reactions of types k,
k, p, and l.

Indeed, the classification follows the increasing number of n, and, for each n value, the
lexicographic priority order is used (k < l < m ...).

In order to complete the classification and coding procedure one also needs to classify
the simple routes (the FG vertices). This is related to the difficult task of enumerating
simple routes, which is not yet solved. On the other hand, the enumeration of the FGs
is actually the enumeration of simple connected graphs, which is discussed by Harary and
Palmer.95

Consider as an illustration the nonlinear reaction mechanism of butane dehydrogenation,73
which is depicted by the BG in Fig. 4.4.

[Figure 4.4: bipartite graph]

Fig. 4.4. The bipartite graph depicting the mechanism of butane dehydrogenation.

                                           N(1)  N(2)  N(3)  N(4)
    (1)   Z + C4H10 → ZC4H10                1     1     0     1
    (2)   ZC4H10 → ZC4H8 + H2               1     1     0     0
    (3)   ZC4H8 → ZC4H6 + H2                0     1     0     0
    (4)   ZC4H8 → Z + C4H8                  1     0     0     2
    (5)   ZC4H6 → Z + C4H6                  0     1     1     0
    (-5)  Z + C4H6 → ZC4H6                  0     0     1     1
    (6)   ZC4H6 + ZC4H10 → 2ZC4H8           0     0     0     1

The number of simple routes in this mechanism, determined by the method of Zeigarnik
and Temkin,94 was found to be equal to the number of linearly independent routes.
These four simple routes are the sets of stoichiometric numbers given in the columns N(1),
N(2), N(3), and N(4). Then, the FG of this mechanism can be built as follows:

[Figure: face graph of the mechanism; surviving vertex labels N(1), N(2), N(3)]

Hence, the following mechanism code results: 4-5-3,k^2-1,0-3,k^2-2,k-2,k-2,k, where 4 is
the number of intermediates, 5 is the number of nonempty simple route linkages, and the
six linkages (FG edges) are listed consecutively (12, 13, 14, 23, 24, and 34).
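
The route table of the butane example can be cross-checked with a short sketch. It assumes the seven elementary reactions as tabulated above, written in the space of the intermediates Z, ZC4H10, ZC4H8, and ZC4H6; the names `B_X`, `ROUTES`, `net_intermediates`, and `shared_steps` are introduced for this illustration only. The code verifies that each column N(1)-N(4) cancels all intermediates and counts the steps shared by each pair of routes.

```python
from itertools import combinations

INTERMEDIATES = ["Z", "ZC4H10", "ZC4H8", "ZC4H6"]

# Net change of each intermediate per elementary reaction, in table order:
# (1), (2), (3), (4), (5), (-5), (6).
B_X = [
    [-1, 1, 0, 0],
    [0, -1, 1, 0],
    [0, 0, -1, 1],
    [1, 0, -1, 0],
    [1, 0, 0, -1],
    [-1, 0, 0, 1],
    [0, -1, 2, -1],
]
ROUTES = {
    "N(1)": [1, 1, 0, 1, 0, 0, 0],
    "N(2)": [1, 1, 1, 0, 1, 0, 0],
    "N(3)": [0, 0, 0, 0, 1, 1, 0],
    "N(4)": [1, 0, 0, 2, 0, 1, 1],
}

def net_intermediates(route):
    """Net production of every intermediate along a route."""
    return [sum(nu * row[j] for nu, row in zip(route, B_X))
            for j in range(len(INTERMEDIATES))]

def shared_steps(a, b):
    """Indices of elementary reactions used by both routes."""
    return [s for s, (x, y) in enumerate(zip(ROUTES[a], ROUTES[b])) if x and y]

for name, route in ROUTES.items():
    print(name, net_intermediates(route))   # every entry is zero

linkages = {pair: len(shared_steps(*pair)) for pair in combinations(ROUTES, 2)}
print(linkages)
print(sum(1 for count in linkages.values() if count))  # 5 nonempty linkages
```

Only the pair N(1), N(3) shares no step, which matches the single empty entry (1,0) and the five nonempty linkages in the mechanism code.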
5. Topological Aspects of Complex Reaction Mechanisms
5.1. BACKGROUND

Chemical reaction mechanism, in its most general context, implies the interpretation of
all available experimental facts and theoretical estimates related to a certain complex
reaction. When the sequence of reaction steps (or, otherwise, the reaction scheme) is
depicted as a graph (KG, BG), one obtains a structure that mirrors the interrelations of
intermediates and the connectedness of reaction routes in the space of intermediates. In
the case of linear mechanisms, the graph representation provides the sequence of reactant
transformations92 without making use of any chemical information, and without any
discrimination of "chemical" hypotheses. The accumulated knowledge on reaction
mechanisms evidences in a convincing waf8 that there is no hierarchy of the different
stages of mechanism studies and description of the type "reaction scheme" - "reaction
mechanism". The two types of mechanism information are interrelated, and obtaining just
one of them for any class of reactions is simply impossible (the only exception is the
class of simple one-route reactions).

We developed a concept, according to which the notion of "mechanism" includes two
equally important types of information, topological information and physicochemical
information, which reflect two different aspects of this phenomenon. 2 .29

Topological information mirrors the mechanistic topological structure (MTS), i. e., the
reactant interrelations, the number and kinds of reaction routes, and their mutual
connectedness. The MTS identification can be done by making use of techniques such
as chemical kinetics, isotope studies, chemical modeling of steps and intermediates,
physicochemical analysis, etc. Taking into account (i) the use of FGs (both for KGs and
BGs) as a topological basis of the classification of mechanism types, (ii) the topological
invariants introduced for the mechanism classes, and (iii) the topological nature of the
mechanistic classification proposed by Christiansen63 (open, closed, and mixed sequence
of steps), we proposed to call such a mechanistic structure topological (MTS).
Physicochemical information reflects intermediate composition and structure at different
levels of elaboration, their reactivity (rate constants, reversibility, the presence of fast
and slow steps), and the structure of transition states.

Critical reviews of methodologies applied in mechanistic studies and kinetic
modeling,2I·29,97 have shown that the traditional research strategy

    Experimental kinetic data → kinetic law equations →
    reaction scheme → reaction mechanism

is neither unique nor rigorous and that it is generally inapplicable to multiroute
reactions.

Consequently, this section deals with those aspects of the mechanistic studies of complex
reactions that are most closely related to the concept of mechanistic topological structure
and, first of all, with the graph-theoretical analysis of the mechanistic types.

5.2. TOPOLOGICAL CHARACTERIZATION OF THE FOUR MAJOR CLASSES OF
COMPLEX REACTIONS (NONCATALYTIC, NONCATALYTIC CONJUGATED,
CHAIN, AND CATALYTIC REACTIONS)

A large number of chemical reaction classifications exist in the literature.98 One possible
classification system is based on the different types of mechanisms. We have already
discussed such types as linear and nonlinear mechanisms, as well as one-route and
multiroute reactions. Sinanoglu introduced two very general classes of mechanisms (or
reaction networks), laminar and turbulent ones, and presented methods for their
systematical and topological generation. 99

Depending on the presence or absence of a substance that speeds up the reaction but is
not included in the stoichiometric equation, one can distinguish between catalytic and
noncatalytic reactions. The latter can be divided into conjugated and nonconjugated
reactions depending on the presence or absence of route interdependence. In turn,
conjugated reactions can be chain or nonchain ones, the criterion being the type of the
sequences of intermediate transformations, or SIT (open, closed, and mixed ones,
after Christiansen63). Indeed, more complex combinations of mechanisms are possible,
though they will not be considered here.
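
As a rough illustration, the classification criteria just described can be written as a small decision procedure. The sketch below is in Python; the function name, its arguments, and the returned labels are hypothetical conveniences, not notation from this chapter, and the rule that separates chain from nonchain conjugated reactions (mixed versus open SIT) is an assumption read from Table 5.1.

```python
from enum import Enum


class SitType(Enum):
    """Christiansen's three kinds of sequences of intermediate transformations."""
    OPEN = "open"
    CLOSED = "closed"
    MIXED = "mixed"


def classify_reaction(has_catalyst: bool,
                      routes_interdependent: bool,
                      sit_type: SitType) -> str:
    """Return one of the four major classes of complex reactions.

    Hypothetical helper: argument names and labels are illustrative only.
    """
    # A substance that speeds up the reaction but is absent from the
    # stoichiometric equation makes the reaction catalytic.
    if has_catalyst:
        return "catalytic"
    # Noncatalytic reactions without route interdependence are nonconjugated.
    if not routes_interdependent:
        return "noncatalytic"
    # Conjugated reactions: the SIT type separates chain from nonchain ones
    # (assumed mapping: mixed -> chain, open -> nonchain).
    if sit_type is SitType.MIXED:
        return "chain"
    if sit_type is SitType.OPEN:
        return "noncatalytic conjugated"
    raise ValueError("closed SIT without a catalyst is not covered by this sketch")
```

For instance, a noncatalytic reaction with interdependent routes and a mixed SIT would be labeled a chain reaction by this helper. More complex combinations of mechanisms are deliberately left out, as in the text.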

It is worthwhile analyzing all these types of mechanisms in order to find whether
there are:

(i) any especially notable specifics in the topology of the different mechanisms, when
making use of cyclic kinetic graphs, and

(ii) any features of possible topological classes in the different SITs (by analogy with the
classes of topological identifiers for elementary reactions).

The detailed analysis of these questions will be published elsewhere.100 Here, however, we
will briefly comment on the preliminary results that are presented in Table 5.1 for
mechanisms with one or two linearly independent routes:

(1) Unlike nonconjugated noncatalytic processes and catalytic ones, conjugated and chain
processes have no fewer than two routes.

(2) Within the topology of Temkin's cyclic KGs, there is no difference between
noncatalytic, catalytic, and chain processes. A special class C (see section 3) is singled
out for conjugated reactions only.

(3) The major difference in topology of these four types of reactions is in the nature of
their SITs (i.e., in the nature of KG cycles).

(4) The SITs of noncatalytic and noncatalytic conjugated reactions belong to linear,
bilinear, or branched linear topological classes.

(5) The SITs of catalytic reactions are classified into cyclic or polycyclic topological
classes.

(6) The SITs of chain processes correspond to mixed branched-cyclic topological classes.

(7) Chain reactions are a specific type of conjugated reaction, one of the routes of which
is transformed into a cyclic SIT. Alternatively, chain reactions may be regarded as a
specific type of catalytic process (nonideal catalysis), one of the routes of which turns
into an open SIT (a route with linear topology).

It is interesting to note that, while they differ by many classification criteria, catalytic,
noncatalytic, and chain reactions belong to the same equivalence class when the criterion
used is the SIT nature (open or closed) or the mechanism topological class (linear, cyclic
or combined).
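
To make the topological classes named above more concrete, the following Python sketch assigns a coarse label to a SIT drawn as an undirected graph of intermediates. It is only an approximation under stated assumptions: the function name and labels are hypothetical, the graph is assumed to be simple (no loops), the decision uses the cyclomatic number and vertex degrees as a proxy for the classes in the text, and the "bilinear" class is not distinguished.

```python
from collections import defaultdict


def sit_topology(edges):
    """Coarse topological label for a SIT given as (u, v) pairs.

    Hypothetical helper; assumes a simple undirected graph without loops.
    """
    if not edges:
        raise ValueError("at least one edge is required")

    # Build adjacency sets; duplicate edges collapse automatically.
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    n_vertices = len(adjacency)
    n_edges = len({frozenset(edge) for edge in edges})

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    # Cyclomatic number: number of independent cycles in the graph.
    cyclomatic = n_edges - n_vertices + components
    degrees = [len(neighbours) for neighbours in adjacency.values()]

    if cyclomatic == 0:
        # Open sequence: a chain of steps, possibly with branching.
        return "linear" if max(degrees) <= 2 else "branched-linear"
    if min(degrees) >= 2:
        # Closed sequence: no pendant intermediates at all.
        return "cyclic" if cyclomatic == 1 else "polycyclic"
    # Cycles together with open (pendant) parts.
    return "mixed"
```

With this helper, a three-membered cycle of intermediates is labeled cyclic, while the same cycle with one pendant intermediate attached is labeled mixed, which mirrors the open-closed SITs attributed to chain processes.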

Table 5.1. Topological characteristics of the mechanisms of complex reactions. Examples with one-route and two-route mechanisms

No.  Reaction type            Number of   KG vertex type               KG class   SIT type63    SIT topology
                              routes M
1    Noncatalytic reactions   ≥ 1         zero vertex (solid circle)   0          open          linear
                                          zero vertex                  B          open          bilinear
                                          zero vertex                  C          open          branched-linear
2    Conjugated reactions     ≥ 2         zero vertex                  C          open          branched-linear
3    Catalytic reactions      ≥ 1         no zero vertex               0          closed        cyclic
                                                                       B          closed        cyclic
                                                                       C          closed        cyclic
4    Chain reactions          ≥ 2         zero vertex                  B          open-closed   mixed
                                                                       C          open-closed   mixed

[The KG and SIT depiction columns of Table 5.1 contained graph drawings that are not reproduced here.]


6. References

1. Dewar, M. J. S. (1969) The Molecular Orbital Theory of Organic Chemistry, McGraw-Hill, New York.
2. Simonetta, M. (1974) in G. Klopman (ed.), Chemical Reactivity and Reaction Pathways, Wiley, New York, Chapter 2.
3. Fukui, K. (1970) Topics Curr. Chem. 15, 1.
4. Fleming, I. (1976) Frontier Orbitals and Organic Reactions, Wiley, New York.
5. Dewar, M. J. S. (1989) Theochem 200, 301.
6. Woodward, R. B. and Hoffmann, R. (1965) J. Am. Chem. Soc. 87, 395; (1969) Angew. Chem. 81, 797.
7. Hoffmann, R. (1981) Science 211, 995.
8. Parr, R. G. and Pearson, R. G. (1983) J. Am. Chem. Soc. 105, 7512.
9. Pearson, R. G. (1986) Proc. Natl. Acad. Sci. 83, 8440.
10. Pearson, R. G. (1993) Acc. Chem. Res. 26, 250.
11. Hammett, L. P. (1970) Physical Organic Chemistry, 2nd ed., McGraw-Hill, New York.
12. Taft, R. W. (1956) in M. S. Newman (ed.), Steric Effects in Reactivity, Wiley, New York, Chapter 13.
13. Palm, V. A. (1967) Fundamentals of Quantitative Theory of Organic Reactions, Khimiya, Leningrad (in Russian).
14. Shorter, J. (1973) Correlation Analysis in Organic Chemistry: An Introduction to Linear Free-Energy Relationships, Clarendon Press, Oxford.
15. Basolo, F. and Pearson, R. G. (1967) Mechanisms of Inorganic Reactions, 2nd ed., Wiley, New York.
16. Taube, H. (1984) Science 226, 1028.
17. Bell, R. P. (1973) The Proton in Chemistry, Chapman & Hall, London.
18. Dogonadze, R. R. and Kuznetsov, A. M. (1973) Kinetics of Chemical Reactions in Polar Solvents, VINITI, Moscow (in Russian).
19. Denisov, E. T. (1978) Kinetics of Homogeneous Chemical Reactions, Vysshaya Shkola, Moscow, and ref. cited therein.
20. Ingold, C. K. (1953) Structure and Mechanism in Organic Chemistry, Cornell University Press, Ithaca, New York.
21. Masters, C. (1981) Homogeneous Transition-Metal Catalysis, Chapman & Hall, London.
22. Rudakov, E. C. (1985) Reactions of Alkanes with Oxidants, Metal Complexes, and Radicals in Solutions, Naukova Dumka, Kiev (in Russian).
23. Shulpin, G. B. (1988) Organic Reactions Catalyzed with Metal Complexes, Nauka, Moscow.
24. Temkin, O. N., Shestakov, G. K., and Treger, Y. A. (1991) Acetylene: Chemistry, Reaction Mechanisms, and Technology, Khimiya, Moscow (in Russian).
25. Brin, E. F. (1987) Usp. Khim. 54, 428.
26. Trinajstic, N. (1988) Stud. Phys. Theor. Chem. 63, 557.
27. Bazhenov, L. B. and Samorodnitskii, P. Kh. (1976) Voprosy Filosofii, 93.
28. Temkin, O. N., Bruk, L. G., and Zeigarnik, A. V. (1993) Kinetics and Catalysis (English Translation) 34, 387.
29. Temkin, O. N., Bruk, L. G., and Bonchev, D. (1988) Teor. Eksp. Khim. 24, 282.
30. Kamenski, D. I., Temkin, O. N., and Bonchev, D. G. (1992) Appl. Catal. (A) 88, 1.
31. Kolbe, H. (1850) Annal. Chem. Pharm. 75, 211; 76, 1.
32. Hueckel, W. (1934) Theoretische Grundlagen der Organischen Chemie, Akademische Verlagsgesellschaft, Leipzig.
33. Stevens, B. (1967) Collisional Activation in Gases, Pergamon Press, Oxford.
34. Jochum, C., Gasteiger, J., and Ugi, I. (1980) Angew. Chem. Int. Ed. Engl. 19, 495.
35. Jochum, C., Gasteiger, J., Ugi, I., and Dugundji, J. (1982) Z. Naturforsch. B 37, 1205.
36. Ugi, I. and Wochner, M. (1988) THEOCHEM 165, 229.
37. Koca, J., Kratochvil, M., Kvasnicka, V., Matyska, L., and Pospichal, J. (1989) Lect. Notes Chem. 51, Springer-Verlag, Berlin.
38. Dugundji, J. and Ugi, I. (1973) Topics Curr. Chem. 39, 19.
39. Maier, L. I. (1990) Dissertation, Novosibirsk (in Russian).
40. Zabolotnaya, L. G. (1989) Dissertation, Novosibirsk (in Russian).
41. Steingauer, L. G., Maier, L. I., Bulgakov, N. N., Fedotov, A. V., and Likhobolov, V. A. (1988) React. Kinet. Catal. Lett. 36, 139.
42. Shtokolo, L. I., Steingauer, L. G., Likhobolov, V. A., and Fedotov, A. V. React. Kinet. Catal. Lett. 26, 227.
43. Epiotis, N. D. (1982) Lect. Notes Chem. 29, Springer-Verlag, Berlin.
44. Epiotis, N. D. (1983) Lect. Notes Chem. 34, Springer-Verlag, Berlin.
45. Zefirov, N. S. and Trach, S. S. (1982) Zh. Org. Khim. 18, 1561.
46. Zefirov, N. S. and Trach, S. S. (1987) Acc. Chem. Res. 20, 237.
47. Zefirov, N. S. and Trach, S. S. (1975) Zh. Org. Khim. 11, 225; 11, 1785.
48. Zefirov, N. S. and Trach, S. S. (1976) Zh. Org. Khim. 12, 7; 12, 697.
49. Zefirov, N. S. and Trach, S. S. (1981) Zh. Org. Khim. 17, 2465.
50. Zefirov, N. S. and Trach, S. S. (1990) Anal. Chim. Acta 235, 115.
51. Gilchrist, T. L. and Storr, R. C. (1972) Organic Reactions and Orbital Symmetry, Cambridge University Press, London.
52. Epiotis, N. D. (1978) Theory of Organic Reactions, Springer-Verlag, Berlin.
53. Dewar, M. J. S. (1984) J. Am. Chem. Soc. 106, 209.
54. Evans, M. G. and Warhurst, E. (1938) Trans. Faraday Soc. 34, 614.
55. Evans, M. G. (1939) Trans. Faraday Soc. 35, 824.
56. Dewar, M. J. S. (1971) Angew. Chem. Int. Ed. Engl. 10, 761.
57. Zeigarnik, A. and Temkin, O. N. (1993) Match (work in submission).
58. Tolman, C. A. (1972) Chem. Soc. Rev. 1, 337.
59. Harary, F. (1969) Graph Theory, Addison-Wesley, Reading, MA.
60. Fujita, S. (1986) J. Chem. Inf. Comput. Sci. 26, 205, 212, 224, 231, 238.
61. Fujita, S. (1987) J. Chem. Inf. Comput. Sci. 27, 99, 104, 111, 120.
62. Jorgensen, K. A. (1989) Chem. Rev. 89, 431.
63. Christiansen, J. A. (1953) Adv. Catalysis 5, 311.
64. King, E. L. and Altman, C. (1956) J. Phys. Chem. 60, 1375.
65. Yatsimirski, K. B. (1975) Int. Chem. Eng. 15, 7.
66. Yablonskii, G. S., Bykov, V. I., and Gorban', A. N. (1983) Kinetic Models of Catalytic Reactions, Nauka, Novosibirsk.
67. Temkin, O. N. and Bonchev, D. (1992) in D. Bonchev and D. H. Rouvray (eds.), Mathematical Chemistry, Vol. II. Graph Theory. Introduction and Fundamentals, Gordon and Breach, Chichester, U.K., Chapter 2.
68. Temkin, O. N. and Bonchev, D. (1992) J. Chem. Educ. 69, 550.
69. Bonchev, D., Temkin, O. N., and Kamenski, D. (1980) React. Kinet. Catal. Lett. 15, 113.
70. Bonchev, D., Temkin, O. N., and Kamenski, D. (1982) J. Comput. Chem. 3, 95.
71. Gordeeva, K., Bonchev, D., Kamenski, D., and Temkin, O. N. (1993) J. Chem. Inf. Comput. Sci., in press.


72. Temkin, M. I. (1965) Dokl. Akad. Nauk SSSR 165, 615.
73. Temkin, M. I. (1970) in S. Z. Roginskii (ed.), Mechanism and Kinetics of Complicated Reactions, Nauka, Moscow, p. 57 (in Russian).
74. Horiuti, J. (1957) J. Res. Inst. Catal. Hokkaido Univ. 5, 1.
75. Rozovskii, A. Y. and Lin, G. I. (1990) Theoretical Foundations of the Methanol Synthesis Process, Khimiya, Moscow (in Russian).
76. Bonchev, D., Temkin, O. N., and Kamenski, D. (1980) React. Kinet. Catal. Lett. 15, 119.
77. Bonchev, D., Kamenski, D., and Temkin, O. N. (1987) J. Math. Chem. 1, 345.
78. Gordeeva, E. V., Molchanova, M. S., and Zefirov, N. S. (1990) Tetrahedron Comput. Methodol. 3, 389.
79. Faradzhev, I. A. (1978) in Algorithmic Investigations in Combinatorics, Nauka, Moscow, p. 11 (in Russian).
80. Sellers, P. H. (1971) Arch. Ration. Mech. Anal. 44, 23, 376.
81. Sellers, P. H. (1983) in R. B. King (ed.), Chemical Applications of Graph Theory, Elsevier, Amsterdam, p. 420.
82. Gordeeva, E., Bonchev, D., and Temkin, O. N., work in progress.
83. Horn, F. (1973) Proc. Roy. Soc. A334, 299; 313.
84. Williamowski, K.-D. (1978) Z. Naturforsch. 33a, 827; 983.
85. Vol'pert, A. I. (1972) Matematicheskii Sbornik 88, 578.
86. Balandin, A. A. (1970) Multiplet Theory of Catalysis, Part III, Moscow State University, Moscow.
87. Clarke, B. L. (1974) J. Chem. Phys. 60, 1493.
88. Yablonskii, G. S. (1979) Teor. Eksp. Khim. 15, 4.
89. Benson, S. (1960) The Fundamentals of Chemical Kinetics, McGraw-Hill, New York.
90. Horiuti, J. and Nakamura, T. (1957) Z. Phys. Chem., Neue Folge 11, 358.
91. Tutte, W. T. (1984) Graph Theory, Addison-Wesley, Reading, MA.
92. Milner, P. C. (1964) J. Electrochem. Soc. 111, 228.
93. Happel, J. and Sellers, P. H. (1982) Ind. Eng. Chem. Fundam. 21, 67.
94. Zeigarnik, A. V. and Temkin, O. N. (1993) Kinet. Catal., submitted.
95. Harary, F. and Palmer, E. M. (1973) Graphical Enumeration, Academic Press, New York.
96. Schmid, R. and Sapunov, V. N. (1982) Non-Formal Kinetics, Verlag Chemie, Weinheim.
97. Temkin, O. N. and Bruk, L. G. (1982) in Mechanism of Catalytic Reactions, Proceedings of 3rd All-Union Conference, vol. 2, Nauka, Novosibirsk, p. 101.
98. Bawden, D. (1991) J. Chem. Inf. Comput. Sci. 31, 212.
99. Sinanoglu, O. (1993) J. Math. Chem. 12, 319.
100. Temkin, O. N., Zeigarnik, A., and Bonchev, D., J. Chem. Educ., work in submission.

7. Acknowledgments. This work was partially supported by the Russian Foundation for
Fundamental Research Grant no. 93-03-18050. D. Bonchev gratefully acknowledges the
hospitality of Dr. W. A. Seitz (Galveston) and Dr. C. F. Mountain (Houston) during his
sabbatical, as well as the support of the Welch Foundation, Houston, Texas.
Understanding Chemical Reactivity

1. Z. Slanina: Contemporary Theory of Chemical Isomerism. 1986 ISBN 90-277-1707-9
2. G. Náray-Szabó, P. R. Surján, J. G. Ángyán: Applied Quantum Chemistry. 1987 ISBN 90-277-1901-2
3. V. I. Minkin, L. P. Olekhnovich and Yu. A. Zhdanov: Molecular Design of Tautomeric Compounds. 1988 ISBN 90-277-2478-4
4. E. S. Kryachko and E. V. Ludeña: Energy Density Functional Theory of Many-Electron Systems. 1990 ISBN 0-7923-0641-4
5. P. G. Mezey (ed.): New Developments in Molecular Chirality. 1991 ISBN 0-7923-1021-7
6. F. Ruette (ed.): Quantum Chemistry Approaches to Chemisorption and Heterogeneous Catalysis. 1992 ISBN 0-7923-1543-X
7. J. D. Simon (ed.): Ultrafast Dynamics of Chemical Systems. 1994 ISBN 0-7923-2489-7
8. R. Tycko (ed.): Nuclear Magnetic Resonance Probes of Molecular Dynamics. 1994 ISBN 0-7923-2795-0
9. D. Bonchev and O. Mekenyan (eds.): Graph Theoretical Approaches to Chemical Reactivity. 1994 ISBN 0-7923-2837-X

Kluwer Academic Publishers - Dordrecht / Boston / London
