TO BIO-GEOMETRY
Herbert Edelsbrunner
Departments of Computer Science and Mathematics
Duke University
Table of Contents
Prologue
I Bio-molecules
II Geometric Models
III Surface Meshing
IV Connectivity
V Shape Features
VI Density Maps
VII Match and Fit
VIII Deformation
IX Measures
X Derivatives
[Mention the pioneers who early on recognized the importance of geometry in structural molecular biology: Fred Richards, Michael Levitt, Michael Connolly]

[Mention that my book on the “Geometry and Topology for Mesh Generation” is complementary/a prerequisite to this book. In particular, it covers the construction of Delaunay triangulations in detail, and it describes the simulation of simplicity as a general idea to deal with non-generic situations.]

[This book is really about alpha shapes in a broad sense. It might be useful to describe the history of that research in short.
1981. Vancouver. Conception of idea with Kirkpatrick and Seidel.
1985-89. Graz and Urbana. SoS, Delaunay software, Alpha Shape software with Ernst Mücke, Harald Rosenberger, and Patrick Moran.
1990-93. Urbana and Berlin. Surface triangulations, Betti numbers, inclusion-exclusion, CAVE with Ping Fu, Ernst Mücke, Cecil Delfinado, Nataraj Akkiraju, and Jiang Qian.
1994-95. Hong Kong. Morphing, molecular skin, with Ping Fu, Siu-Wing Cheng, Ka-Po Lam, and Ho-Lun Cheng.
1995-98. Urbana. Flow and pockets, skin surfaces with Ho-Lun Cheng, Tamal Dey, Michael Facello, Jie Liang, Shankar Subramaniam, Claire Woodworth.
1999-2001. Duke. Skin triangulation, hierarchy, Morse complexes with Ho-Lun Cheng, Alper Üngör, Afra Zomorodian, David Letscher, John Harer, Vijay Natarajan.
2002-2003. Duke and Livermore. Docking, Reeb graphs, Jacobian manifolds with Johannes Rudolph, Sergei Bespamyatnikh, Vicky Choi, John Harer, Valerio Pascucci, Vijay Natarajan, Ajith Mascarenhas.
2000-2005. ITR Project. Derivatives, interfaces, software with Robert Bryant, Patrice Koehl, Michael Levitt, Andrew Ban, Johannes Rudolph, Lutz Kettner, Rachel Brady, and Daniel Filip.
]

[This book is based on notes developed during teaching the courses on “Sphere Geometry” in the Spring of 2000, and on “Bio-geometric Modeling” in the Spring of 2001 and the Fall of 2002, all at Duke University. These courses were either taken for credit or audited at least occasionally by Luis von Ahn, Tammy Bailey, Yih-En (Andrew) Ban, Robert Bryant, Ho-Lun Cheng, Vicky Choi, Anne Collins, Abhijit Guria, Tingting Jiang, Looren Looger, Ajith Mascarenhas, Gopi Meenakshisundaram, Nabil Mustafa, Vijay Natarajan, Xiuwen Ouyang, Anindya Patthak, Ken Roberts, Apratim Roy, Scott Schmidler, Xiaobai Sun, Yusu Wang, Shumin Wu, Alper Üngör, Peng Yin and Afra Zomorodian.]

Herbert Edelsbrunner
Durham, North Carolina, 2002
To do or think about (March 15, 2004).
Bio-molecules
This chapter discusses the three main classes of organic macromolecules involved in the hereditary and life maintenance mechanisms of living beings: DNA, RNA, and proteins. According to the central dogma of biology, proteins are created in two steps from DNA, which carries the genetic information:

DNA → (transcription) → RNA → (translation) → protein

We begin by describing the chemical structure of DNA and RNA in Section I.1. We then explain the translation from RNA to proteins in Section I.2 and talk about the structural organization of proteins in Section I.3. Finally, we present some of the fundamental premises and results of molecular mechanics in Section I.4.
I.1 DNA and RNA

DNA (or deoxyribonucleic acid) is the material that forms the genome, which is a complete set of the genetic material of a living organism. As discovered by Watson and Crick in 1953, DNA consists of two strands of nucleotides twisted into the shape of a double helix, as depicted in Figure I.1. We begin at the smallest scale and work our way up the multi-scale structure of DNA. Compared to standard genomics texts, the treatment of DNA in this section is coarse and lacks many important details.

Chemical structure of DNA. DNA has three chemical components: phosphate, deoxyribose sugar, and four nitrogenous bases, namely adenine, guanine, cytosine, and thymine. The first two bases are double-ring and the last two are single-ring structures. The chemical components are arranged in groups called nucleotides, each composed of a phosphate group, a deoxyribose sugar, and one of the four bases. A nucleotide is conveniently referred to by the first letter of its base. Figure I.2 sketches the chemical structure of the nucleotide A and shows the chemical structures of the remaining three bases. We obtain the nucleotides G, C and T by substituting the corresponding base for adenine in Figure I.2. We use boldface edges to connect atoms that are joined by two covalent bonds. The covalent bonding in the ring structures of the nitrogenous bases is more interesting. All atoms in the ring share electrons as a group and we draw some double bonds just to indicate the total number of extra shared electrons. For example, the hexagonal ring of cytosine has a total of eight covalent bonds, which we may think of as four thirds of a covalent bond between every contiguous pair.

[Figure I.2: the nucleotide A, with its phosphate, deoxyribose sugar and adenine parts labeled, together with the remaining three bases.]

Double helix. The two strands of DNA are held together by weak hydrogen bonds between complementary bases, forming the structure of a spiraling staircase. The backbone of each strand is a repeating phosphate-deoxyribose sugar polymer. The phosphate and the sugar groups in the backbone are connected by phosphodiester bonds. The attachment of these bonds to the sugar groups is illustrated in Figure I.3. The carbons of the sugar group are numbered from 1' to 5'. One part of the phosphodiester bond is between the phosphate and the 5'-carbon, and the other is between the phosphate and the 3'-carbon. We think of the backbone as oriented in the direction of the path that starts at the 5'-carbon, passes through the 4'-carbon, and ends at the 3'-carbon. In the double-stranded DNA molecule, the two backbones are in opposite, or anti-parallel, orientation.

The bases are attached to the 1'-carbons. Interactions between base pairs hold the two strands together. Adenine interacts with thymine and guanine with cytosine. The two bases of a pair are said to be complementary. This implies that the sequence of bases along one strand determines the sequence of bases along the other strand.
Replication is based on this simple rule of complementarity and makes essential use of the relatively weak bonds between the two strands. A protein machine builds new DNA strands by separating the two old strands and complementing each by a new anti-parallel strand.

Chromosomes. Each cell of an organism contains a copy of the entire genome. In the case of a human cell, this amounts to about two meters of DNA partitioned into twenty-three pairs of chromosomes per cell. The body has about […] cells, totaling about […] meters of DNA, which is more than a hundred times the distance between the earth and the sun. Since humans are small relative to that distance, this implies that the DNA must be thin and efficiently packed. Indeed, each chromosome is a long thread (a double-strand) that is densely folded around protein scaffolds.

How is a long thread of DNA converted into the relatively thick and worm-like structure visible through the electron microscope? On the lowest level, the DNA is wrapped twice around a configuration of eight histones (a […]

3. RNA nucleotides carry the bases adenine, guanine, and cytosine, but substitute uracil for the thymine found in DNA. Uracil forms hydrogen bonds with adenine just as thymine does.

Figure I.4 illustrates the chemical difference between RNA and DNA by showing a ribonucleotide containing uracil.

Figure I.4: Chemical structure of the RNA nucleotide with uracil as the nitrogenous base. The labeled parts are the phosphate, the ribose sugar, and uracil.
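The complementarity rule behind replication (adenine pairs with thymine, guanine with cytosine, and the two strands run anti-parallel) can be made concrete with a short sketch. The function name and the convention of writing every strand in its 5' to 3' direction are assumptions made for this illustration; they are not notation from the text.

```python
# Watson-Crick pairing as stated in the text: A with T, G with C.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand):
    """Return the partner strand of `strand`, written 5' to 3'.

    `strand` is a string over A, C, G, T read in the 5' to 3' direction.
    Because the two backbones are anti-parallel, the partner strand is
    the base-wise complement read in reverse order.
    """
    return "".join(DNA_PAIR[base] for base in reversed(strand))


if __name__ == "__main__":
    old = "ATTGCC"
    new = complementary_strand(old)
    # Complementing twice recovers the original strand, which is why
    # each old strand suffices as a template for a new double helix.
    print(old, new, complementary_strand(new))
```

Since one strand determines the other, a double-stranded piece of DNA can be stored as a single string, which is the usual convention in sequence databases.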
RNA is classified into different types depending on its function. The vast majority is messenger RNA (or mRNA), which acts as an intermediary structure in the synthesis of proteins. There is also functional RNA produced by a small number of genes, which is not translated into protein. Examples are transfer RNA (or tRNA), which brings amino acids to the mRNA during the translation process, and ribosomal RNA (or rRNA), which helps coordinate the assembly of amino acids into proteins.

Transcription. The transcription process, which makes RNA, is similar to the replication process of DNA. During the transcription of a gene, the two strands of DNA are separated locally, and one strand acts as a template for RNA synthesis. Free ribonucleotides align along the DNA template. The process is catalyzed by another protein machine, the RNA polymerase complex, which moves along the DNA adding ribonucleotides to the growing RNA, as sketched in Figure I.5. The resulting RNA sequence is the same as the non-template sequence of the gene, except that U replaces T. Electron microscope pictures show that the transcription of DNA to RNA is a highly parallel process in which a row of RNA polymerase complexes follow each other along the gene and produce RNA concurrently. Each individual transcription works in three steps.

Figure I.5: The RNA grows in the 5' to 3' direction, in this case by adding a nucleotide carrying uracil to the chain.

[…] A gene is thus not only marked but indeed defined by the promoter segment preceding and the terminating sequence succeeding it.

Bibliographic notes. The idea that traits are hereditary is old, but the detailed mechanism by which it comes about started to unfold only recently. The groundwork for our current understanding was laid in the nineteenth century by Gregor Mendel, when he discovered the basic rules of the hereditary mechanism [2]. An English translation of this work can be found in [3]. It was long known that DNA is critically involved in that mechanism, but it took until the work of Watson and Crick in 1953 to discover the chemical structure of DNA [5, 6]. The book by Watson [4] is an enjoyable personal account of the years preceding the discovery of that structure. Today there are many books on the subject, and most of the material in this section is taken from [1, Chapters 2 and 3].

[1] A. J. F. Griffiths, W. M. Gelbart, J. H. Miller and R. C. Lewontin. Modern Genetic Analysis. Freeman, New York, 1999.
[2] G. Mendel. Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines, Abhandlungen, Brünn 4 (1866), 3–47.
[3] C. Stern and E. R. Sherwood. The Origin of Genetics: A Mendel Source Book. Freeman, 1966.
[4] J. D. Watson. The Double Helix. Atheneum, New York, 1981.
[5] J. D. Watson and F. H. C. Crick. Molecular structure of nucleic acids. A structure for deoxyribose nucleic acid. Nature 171 (1953), 737–738.
[6] J. D. Watson and F. H. C. Crick. Genetic implications of the structure of deoxyribonucleic acid. Nature 171 (1953), 964–967.
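As a small illustration of the transcription rule stated in this section (the RNA equals the non-template strand with U in place of T), the following sketch computes the RNA from the template strand. The function names and the convention of writing every sequence in the 5' to 3' direction are assumptions of this example.

```python
# Pairing of a DNA template base with the ribonucleotide placed
# opposite to it; uracil takes the role of thymine in RNA.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template):
    """RNA (5' to 3') made from a DNA template strand (5' to 3').

    The RNA is anti-parallel to the template, so the paired bases
    are read in reverse order.
    """
    return "".join(TEMPLATE_TO_RNA[b] for b in reversed(template))


def non_template(template):
    """The other DNA strand of the gene, written 5' to 3'."""
    return "".join(DNA_PAIR[b] for b in reversed(template))


if __name__ == "__main__":
    template = "ATCG"
    rna = transcribe(template)
    # Same sequence as the non-template strand, with U replacing T.
    print(rna, non_template(template).replace("T", "U"))
```

With the template ATCG, the strand used in this example, both expressions give the same RNA, which is the statement made in the text.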
I.2 Proteins and Amino Acids

Proteins are polypeptide chains obtained by translation from strands of messenger RNA. In this section, we sketch the translation process and discuss the chemical structure of proteins.

Chemical structure. A protein is a linear sequence of amino acids connected to each other by peptide bonds. Each amino acid consists of a central carbon atom, the α-carbon, linked to an amino group, a carboxyl group, one hydrogen atom, and a side-chain. Amino acids that are linked into a polypeptide chain are referred to as residues. Different residues are distinguished by their side-chains. As shown in Figure I.6, two amino acids are linked by a peptide bond whose creation releases water. The resulting repeating sequence of nitrogen, α-carbon and carbon atoms is the backbone of the protein.

Figure I.6: Two amino acid residues joined by a peptide bond.

[Figure I.7: two panels labeled L and D, each showing an α-carbon with NH2, COOH, H and R attached.]

Amino acids. Among a much larger variety of amino acids, nature uses only twenty to build proteins. We list their names together with their three-letter codes and single-letter abbreviations in Table I.1.

Alanine        Ala  A      Methionine   Met  M
Cysteine       Cys  C      Asparagine   Asn  N
Aspartate      Asp  D      Proline      Pro  P
Glutamate      Glu  E      Glutamine    Gln  Q
Phenylalanine  Phe  F      Arginine     Arg  R
Glycine        Gly  G      Serine       Ser  S
Histidine      His  H      Threonine    Thr  T
Isoleucine     Ile  I      Valine       Val  V
Lysine         Lys  K      Tryptophan   Trp  W
Leucine        Leu  L      Tyrosine     Tyr  Y

Table I.1: Names, codes and abbreviations of the twenty amino acids that occur as building blocks of natural proteins.

As can be seen in Figures I.8 and I.9, residues differ widely in size and structure. The fifteen amino acids sketched in Figure I.8 may be viewed as trees rooted at the α-carbon, which is part of the backbone. Most of the internal nodes are carbon atoms, with rare occurrences of oxygen, nitrogen and sulfur atoms. As before, we mark double and partially double bonds by boldface edges. Four of the five amino acids […] structures. The fifth amino acid is proline, which forms a cycle by having its chain connect back to the nitrogen next to the α-carbon along the backbone. This unique feature locally restricts the flexibility of the backbone, as will be discussed in Section I.3.

Figure I.8: The fifteen amino acids without cycle in their chemical structure. The shaded circle is the α-carbon on the backbone.

Figure I.9: The five amino acids with cyclic chemical structure.

Genetic code. The translation process is more involved than transcription because it converts information between two languages that use different alphabets. The sequence of nucleotides is read consecutively in groups of three, called codons. Since there are four different types of nucleotides, we have 4^3 = 64 codons. There are only twenty amino acid residues, which implies that the map is not injective but uses redundancy to reduce the number of outcomes. The complete map is shown in Table I.2. The codon XYZ is mapped to one of the residues in the row of X and the column of Y. The four positions inside that slot correspond to A, G in the first row and C, U in the second row.

Table I.2:
         A            G            C            U
A    Lys  Lys     Arg  Arg     Thr  Thr     Ile  Met
     Asn  Asn     Ser  Ser     Thr  Thr     Ile  Ile
G    Glu  Glu     Gly  Gly     Ala  Ala     Val  Val
     Asp  Asp     Gly  Gly     Ala  Ala     Val  Val
C    Gln  Gln     Arg  Arg     Pro  Pro     Leu  Leu
     His  His     Arg  Arg     Pro  Pro     Leu  Leu
U    Stop Stop    Stop Trp     Ser  Ser     Leu  Leu
     Tyr  Tyr     Cys  Cys     Ser  Ser     Phe  Phe

The translation is accomplished by transfer RNA molecules that recognize codons through the same binding mechanism used for replication and transcription. Some residues correspond to more codons than others. The redundancy is in part due to multiple tRNA molecules carrying the same residue and in part because there is flexibility in how the tRNA reads the codons. In many cases, an accurate match at the first two positions suffices and a mismatch at the third position can be tolerated. This explains the relative uniformity among the four residues in any one slot of Table I.2.

Since codons are triplets of nucleotides, there are apparently three possible reading frames, each producing an entirely different residue sequence. The correct reading frame is identified by starting the translation always at a start codon, AUG. The initiator tRNA is a specific transfer RNA that recognizes this sequence and binds to methionine. Incidentally, it differs from the tRNA that binds to the AUG codon in the middle of the sequence, although that one also binds to methionine.

Translation. As mentioned above, the tRNA molecules are instrumental in translating codons into residues. Each tRNA is a short sequence of about 80 nucleotides. Complementary subsequences form double-helix substructures that further fold up to characteristic ‘clover leaf’ formations, one of which is sketched in Figure I.10. A tRNA molecule matches the exposed codon of the mRNA with its anti-codon and contributes its residue to the polypeptide chain that grows at the other end. The codon and anti-codon are matched in anti-parallel orientation, as always.

Figure I.10: Transfer RNA with anti-codon at the bottom, covalently attached amino acid at the top, and complementary substrings shown.

The translation process is facilitated by the ribosome, which is a large complex made from more than 50 different proteins and several RNA molecules. It consists of a small subunit and a large subunit, which come together around an mRNA strand with the help of the initiator tRNA that contributes the first residue. The ribosome scans through the strand like a tape reader. For each codon, it finds a tRNA with matching anti-codon and appends its amino acid as a residue to the carboxyl end of the growing polypeptide chain. The orientation of the mRNA strand from the 5'- to the 3'-end is thus preserved by the orientation of the polypeptide chain from the amino group of the first to the carboxyl group of the last residue. The translation process ends when a stop codon is read. The protein chain and the mRNA are released and the ribosome dissociates into its two subunits.

Similar to transcription, the translation of an mRNA strand into a protein happens in parallel, with several ribosomes working concurrently and in sequence along the strand. In some cases, the translation even starts during transcription, before the mRNA strand is complete.

[4] N. J. Darby and T. E. Creighton. Protein Structure. Oxford Univ. Press, England, 1993.
[5] P. C. E. Moody and A. J. Wilkinson. Protein Engineering. Oxford Univ. Press, England, 1990.
[6] L. Stryer. Biochemistry. Third edition, Freeman, New York, 1988.
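The reading of codons described in this section can be sketched as a small program: build the map from the 64 codons to residues, locate the start codon AUG to fix the reading frame, and append residues until a stop codon is read. The sketch uses the standard genetic code in the compact single-letter form of Table I.1, with `*` standing for a stop codon; the names `CODON_TABLE` and `translate` and the string encoding are assumptions of this example and ignore all of the biochemical machinery (tRNA, ribosome).

```python
from itertools import product

# Standard genetic code, listed for codons XYZ with X, Y, Z running
# over U, C, A, G in that order (Z fastest). '*' marks a stop codon.
BASES = "UCAG"
RESIDUES = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), RESIDUES)
}


def translate(mrna):
    """Translate an mRNA string (5' to 3') into single-letter residues.

    Translation starts at the first AUG, which fixes the reading
    frame, and ends at the first stop codon or at the end of the
    strand. Returns the empty string if there is no start codon.
    """
    start = mrna.find("AUG")
    if start < 0:
        return ""
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


if __name__ == "__main__":
    print(len(CODON_TABLE))               # number of codons
    print(translate("GGAUGAAAGCUUAAGG"))  # reads AUG AAA GCU UAA
```

Shifting the start position by one or two nucleotides changes every codon that is read, which is the sense in which the three reading frames produce entirely different residue sequences.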
I.3 Structural Organization

Figure I.11: The planarity of a peptide bond is caused by its partial double-bond character. The φ and ψ angles measure rotations around the bonds preceding and succeeding every α-carbon atom.

[…] bond character, there is no freedom to rotate around the peptide bond, which is the link between the carbon and the nitrogen atoms. There are however two possible planar configurations: the trans form, in which Cα-C-N-Cα is a zig-zag, and the cis form, in which it curves in one direction (zig-zig). The two forms are distinguished by the rotation angle along the C-N bond, ω, which by convention is […] for the trans and […] for the cis form. In contrast, the links between the α-carbon and the carbon and nitrogen atoms are single bonds with one-dimensional rotational degrees of freedom. As shown in Figure I.11, φ measures the rotation around the N-Cα bond and ψ measures the rotation around the Cα-C bond. Again by convention, φ = […] and ψ = […] for the two coplanar trans forms.

Ramachandran plot. The conformation of the backbone is completely determined when φ, ψ, and ω are specified for each residue in the chain. A given residue prohibits some angles because of steric hindrances, which […] glycine is only H, which is the reason that a relatively large portion of the square of angle pairs is realizable. An interesting residue in this respect is proline, which differs from all others because it binds back to the backbone, and in this way restricts the rotational degree of freedom to a small region.

Figure I.12: The square represents all angle pairs (φ, ψ) and the shading indicates the region of disallowed pairs for glycine.

Two common motifs. A motif that is commonly observed in proteins is the α-helix, whose backbone forms a right-handed helix. Contiguous α-carbons are separated by about […] in the rotation direction and […] Å rise, which is measured along the axis. A rotation takes about […] residues and produces an axial separation of about […] Å. The structure is stabilized by hydrogen bonds between every CO group and the NH group four residues later. All side-chains lie outside the helix structure. The characteristic dihedral angles for a right-handed α-helix are roughly φ = […] and ψ = […]. Cartoon representations of protein structures usually draw α-helices as […] of the ribbon.

Another recurring motif is the β-sheet, which is flat and made up of several strands. A strand can be obtained by stretching the α-helix until the axial distance between two contiguous α-carbons reaches about […] Å. The stabilizing hydrogen bonds are between neighboring strands, which can run in the same direction (parallel) or in opposite directions (anti-parallel). They combine strands to sheets.
I.4 Molecular Mechanics

After a protein has been created by translation, it folds into a shape, or conformation, that is determined by its sequence of residues. The folding process is a reaction to a multitude of forces that simultaneously act on every part of the protein. This section presents some of the current knowledge and efforts to model these forces. We begin by studying atoms and discuss covalent and non-covalent forces.

Atoms. Each atom has a positively charged massive nucleus, which is surrounded by a cloud of negatively charged electrons. The nucleus consists of protons, each contributing a unit positive charge, and of electrically neutral neutrons. The electrons are held in orbit by electrostatic attraction to the nucleus. Each electron has one unit of negative charge, which exactly neutralizes the positive charge of one proton. In total, we have the same number of protons and electrons and thus an electrically neutral atom, as illustrated in Figure I.16. Different elements consist of atoms with different numbers of protons. The atomic number is by definition the number of protons, which is also the number of electrons. The number of neutrons is usually about the same because too few or too many neutrons destabilize the nucleus. The atomic weight is the ratio of the mass of the atom over the mass of a single hydrogen atom. Because the mass of an electron is negligible, the atomic weight is almost exactly the number of protons plus the number of neutrons.

Figure I.16: A schematic picture of a hydrogen atom to the left and a carbon atom to the right.

Avogadro’s number is useful in translating from the minuscule world of single atoms into a humanly more accessible scale. It is the number of hydrogen atoms in one gram of hydrogen, which is roughly $6.02 \times 10^{23}$. The mass of one hydrogen atom is therefore roughly $1.66 \times 10^{-24}$ gram which, by definition, is one dalton. One mole of an element is Avogadro’s number of its atoms. In other words, if the mass of one atom of that element is $m$ daltons then the mass of one mole is $m$ grams. Table I.4 lists properties of elements that are commonly found in organic matter.

element      symbol  #p  #n  electron shells
Hydrogen     H        1   0  1
Carbon       C        6   6  2, 4
Nitrogen     N        7   7  2, 5
Oxygen       O        8   8  2, 6
Sodium       Na      11  12  2, 8, 1
Magnesium    Mg      12  12  2, 8, 2
Phosphorus   P       15  16  2, 8, 5
Sulfur       S       16  16  2, 8, 6
Chlorine     Cl      17  18  2, 8, 7
Potassium    K       19  20  2, 8, 8, 1
Calcium      Ca      20  20  2, 8, 8, 2

Table I.4: Some elements together with their numbers of protons, neutrons and electrons distributed in the shells around the nucleus.

Covalent bonds. According to the Bohr model, electrons live in shells around the nucleus and populate inner shells before using outer ones. The first three shells from inside out can hold up to 2, 8 and 8 electrons, as indicated in Table I.4. The chemical properties of an atom are defined by the tendency to either empty or complete its partially incomplete shell, if any. One way of doing that is by sharing electrons. The shared electrons complete the outermost non-empty shells of both atoms involved. According to Table I.4, carbon, nitrogen and oxygen need four, three and two electrons to fill their outer shells. As illustrated in Figure I.17, this can for example be done by covalently binding to the same number of hydrogen atoms. We can now define a molecule as a connected component of the graph whose vertices are the atoms and whose edges are the covalent bonds. When an atom covalently bonds to more than one other atom, then there is a preferred angle between pairs of bonds. For example for carbon, this angle is what we get by connecting the centroid of a regular tetrahedron with two of the vertices. Using elementary geometry we find this angle is $\arccos(-1/3) \approx 109.5^\circ$. Two atoms can […] nuclei closer together and is stronger than the corresponding single bond. It also prevents any torsional rotation around that bond, which is possible for single bonds. We need a sequence of four atoms and three covalent bonds to define the torsional angle of the middle bond. It is generally parametrized such that […] corresponds to the trans (zig-zag) coplanar configuration. For example for H3C-CH3, we have three bonds on each side of the middle bond. There is an energetic preference for staggering the covalent bonds on the two sides, which corresponds to torsional angles of […], […], and […].

Figure I.17: The geometry of covalent bonding for carbon, nitrogen, and oxygen.

When two atoms that covalently bond are of different type then they generally attract the shared electrons to different degrees. The shared electrons will therefore have a bias towards one end of the structure or another. We then have a polar structure in which the positive charge is concentrated on one end and the negative charge on the other. Examples of polar covalent bonds are between hydrogen and oxygen and between hydrogen and nitrogen, as illustrated in Figure I.17. In contrast, the bond between hydrogen and carbon has the electrons attracted much more equally and is relatively non-polar.

[…] van der Waals interaction. Experimental observations point to a potential energy function roughly as graphed in Figure I.18. The corresponding force is the negative derivative, which is interpreted as a balance between an attractive and a repulsive force. The attraction is due to a dispersive force that can be explained using quantum mechanics. The repulsion also has a quantum mechanical explanation in terms of the Pauli principle, which prohibits any two electrons from having the same set of quantum numbers.

Figure I.18: The van der Waals force is obtained by adding the attractive force (derivative of dashed curve) and the repulsive force (derivative of the dotted curve). The axes are distance (horizontal) and energy (vertical).

It is useful to keep the relative strengths of the various forces in mind. Table I.5 gives estimates of the amount of energy necessary to break one mole of bonds.
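The preferred bond angle for carbon mentioned above, the angle at the centroid of a regular tetrahedron between two of its vertices, is easy to check numerically. The coordinates below are one convenient choice of a regular tetrahedron centered at the origin; they and the function name are assumptions of this sketch.

```python
import math
from itertools import combinations

# Four alternating corners of the cube [-1, 1]^3 form a regular
# tetrahedron whose centroid is the origin.
VERTICES = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def angle_at_centroid(u, v):
    """Angle in degrees between the vectors from the origin to u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))


if __name__ == "__main__":
    for u, v in combinations(VERTICES, 2):
        # Every pair gives the same angle, arccos(-1/3).
        print(u, v, round(angle_at_centroid(u, v), 4))
```

Each pair of vertex vectors has dot product -1 and squared length 3, so the cosine of the angle is -1/3 for all six pairs.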
[…] simplest such model sums five contributions to the potential energy, three accounting for covalent bonds and two for non-covalent bonds. We use a vector $x \in \mathbb{R}^{3n}$ to describe the state of a system of $n$ atoms and define the potential energy as a function $V: \mathbb{R}^{3n} \to \mathbb{R}$. In its simplest form, that energy is written as

$$V(x) = \sum_{\text{bonds}} \frac{k_l}{2}(l - l_0)^2 + \sum_{\text{angles}} \frac{k_\theta}{2}(\theta - \theta_0)^2 + \sum_{\text{torsions}} \frac{k_\omega}{2}\bigl(1 + \cos(p\,\omega - \gamma)\bigr) + \sum_{\text{atoms}} \frac{q_i q_j}{4 \pi \varepsilon_0 r_{ij}} + \sum_{\text{atoms}} 4 \varepsilon_{ij} \left[ \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right].$$

This formula contains various constants that depend on the type of atom or interaction involved. We briefly look at each one of the five terms.

[…]

Bond angle. The second sum approximates the energy penalty for differing from the reference angle, $\theta_0$, again by a quadratic function. The strength, $k_\theta$, is considerably less than for bond length, namely about one one-hundredth or even less.

Torsional rotation. The third sum approximates the energy for different torsional angles around a bond. Angles that lead to staggered arrangements of bonds at both sides are energetically preferred. This preference is modeled by a cosine function with $p$ minima and the same number of maxima.

Electrostatic interaction. The fourth sum adds the electrostatic potential between every pair of atoms in the system. The constants $q_i$ and $q_j$ are the charges, $\varepsilon_0$ is the dielectric constant of the medium, and $r_{ij}$ is the distance between the two atoms.

Van der Waals interaction. The fifth sum approximates the van der Waals potential by the Lennard-Jones 12-6 function. The collision constant, $\sigma_{ij}$, marks where the function crosses the zero line, and $-\varepsilon_{ij}$ is the value at the unique minimum. As before, $r_{ij}$ is the distance between the two atoms.

It is clear that $V$ as defined is only a rough approximation of the real potential energy that drives the behavior of the system. Whether or not that approximation suffices depends on what we use it for.

Molecular dynamics. One of the applications of force fields is the simulation of molecular motion. Let $x: \mathbb{R} \to \mathbb{R}^3$ be the trajectory of a point with mass $m$. Its location at time $t$ is $x(t)$, its velocity is $v(t) = \dot{x}(t)$, and its momentum is $p(t) = m\,v(t)$. Recall Newton’s three laws of motion:

1. A body continues to move in a straight line at constant velocity unless a force acts upon it.
2. The rate of change of the momentum equals the force.
3. To every action there is an equal and opposing reaction.

The rate of change of the velocity is also referred to as the acceleration, $a(t) = \dot{v}(t) = \ddot{x}(t)$. Newton’s second law can now be written as $F(x(t)) = m\,a(t)$, where $F(x(t))$ is the force acting upon $x(t)$. Suppose we […] be computed analytically. For example, if the potential is stationary and, up to sign, equal to one over the norm, $V(x) = -1/\|x\|$, then $F(x) = -\nabla V(x) = -x/\|x\|^3$. In this case, the generic trajectory is an ellipse with one focus at the origin, as illustrated in Figure I.19. Both the gravitational and the electrostatic potentials have this form.

Figure I.19: A generic trajectory when the magnitude of the attraction to the origin decreases with the square distance.

The problem in molecular dynamics is significantly more involved. We have $n$ bodies (atoms) and the energy potential and force depend on the momentary locations of
all bodies. As before, we represent the collection of $n$ atoms by a point $x \in \mathbb{R}^{3n}$. The energy potential is the function $V: \mathbb{R}^{3n} \to \mathbb{R}$ defined earlier, and the force acting on $x$ is $F(x) = -\nabla V(x)$. Newton’s second law of motion can now be written as

$$m\,\ddot{x} = -\nabla V(x),$$

where the mass vector $m$ multiplies each component of the acceleration vector with the mass of the corresponding atom. The classic two-body problem is the special case in which $n = 2$ and $V$ is the sum of the two corresponding gravitational potentials. In this case, the generic trajectories are again ellipses. Already for three bodies, there is no analytic solution and one has to resort to numerical methods to approximate the trajectories. The problem in molecular dynamics is even more difficult because the potential function is considerably more complicated than a sum of gravitational potentials. The currently available numerical solutions are inadequate to simulate the entire folding process even for small proteins. One of the difficulties in the simulation is the near cancellation of large forces so that relatively weak residuals gain a decisive influence. Even small inaccuracies in the model or the computation can lead to false decisions and possibly spoil the entire remainder of the simulation.

[…] computational biology. Numerical algorithms for molecular dynamics can be found in Leach [4] and Schlick [6].

[1] N. W. Ashcroft and N. D. Mermin. Solid State Physics. Harcourt Brace, Orlando, Florida, 1976.
[2] A. Bondi. Molecular Crystals, Liquids and Glasses. Wiley, New York, 1968.
[3] W. L. Jorgensen and J. Tirado-Rives. The OPLS potential functions for proteins. Energy minimization for crystals of cyclic peptides and crambin. J. Amer. Chem. Soc. 110 (1988), 1657–1666.
[4] A. R. Leach. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[5] F. London. Zur Theorie und Systematik der Molekularkräfte. Zeitschrift für Physik 63 (1930), 245–279.
[6] T. Schlick. Molecular Modeling and Simulation. Springer-Verlag, New York, 2002.
[7] J. Tsai, R. Taylor, C. Chothia and M. Gerstein. The packing density in proteins: standard radii and volumes. J. Mol. Biol. 290 (1999), 253–266.
(i) How many different linear pieces of double-stranded DNA of length $n$ are there?
(ii) How many different cyclic pieces of double-stranded DNA of length $n$ are there?
[Beware of palindromic sequences.]

(ii) Determine the solid angle formed by three faces meeting at a common vertex.
[By convention, the full dihedral angle is $2\pi$, which is the length of the unit circle, and the full solid angle is $4\pi$, which is the area of the unit sphere.]
3. Amino Acids. Draw the graph whose nodes are the acyclic amino acids and that has an arc connecting two nodes iff one amino acid can be obtained from the other by the replacement or addition of a single atom.
(i) Is the graph connected?
(ii) Does every connected component have a path that passes through every node exactly once?

8. Elliptic Trajectory. Let the energy potential $V$ be defined by $V(x) = \|x\|^2$. The force it exerts on a point $x$ is $F(x) = -\nabla V(x) = -2x$. Prove that the generic trajectory in this force field is an ellipse centered at the origin.
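(i) The face-centered cube (or FCC) lattice consisting of all points with integer coordinates whose sum is even: $(x_1, x_2, x_3) \in \mathbb{Z}^3$ such that $x_1 + x_2 + x_3 \equiv 0 \pmod{2}$.
(ii) The body-centered cube (or BCC) lattice consisting of all points with all even or all odd integer coordinates: $(x_1, x_2, x_3) \in \mathbb{Z}^3$ such that $x_1 \equiv x_2 \equiv x_3 \equiv 0 \pmod{2}$ or $x_1 \equiv x_2 \equiv x_3 \equiv 1 \pmod{2}$.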
Geometric Models

A surprising finding in the research on proteins is the importance of geometric shape in their functioning. By and large, the shape seems to determine how proteins interact with each other and with other molecules. This finding is usually expressed as a causal chain of responsibilities:

SEQUENCE → SHAPE → FUNCTION

A protein is a peptide chain of amino acids that folds up and forms a shape. In a natural environment, like proteins fold up to the same shape, but this might be a result of evolutionary selection. The details of that shape in terms of its cavities, protrusions, dynamics, and energetics determine how it interacts with other molecules.

At the current stage of our biological knowledge, there is an overwhelming accumulation of sequence information, which is due, in part, to the near completion of several large-scale genome projects. Although the number of proteins for which the three-dimensional structure has been resolved and is stored in the Protein Data Bank is in the thousands, this is only a small fraction of the wealth of available sequence information. The goal of studying the geometry of proteins is therefore two-fold: the development of new computational tools to help determine or refine structure information, and the understanding of the relationship between shape and function.

In this chapter, we introduce some of the basic geometric models useful in representing molecular shape. In Chapter I, we have seen the view of the bio-chemist, who aims at pruning the immense variety by limiting attention to physically or chemically likely configurations. The rest of this book takes a complementary view by concentrating on mathematical models and computational data structures that arise in the study of proteins. In Section II.1, we introduce space-filling diagrams as the primary geometric model of molecules. In Section II.2, we use Voronoi diagrams to decompose space-filling diagrams, and in doing so, we develop a language suitable for studying details of our models. In Section II.3, we introduce alpha shapes, which are dual to space-filling diagrams and are our preferred computational representation. Finally, in Section II.4, we talk about the Alpha Shape software and discuss how it can be used.

II.1 Space-filling Diagrams
II.2 Power Diagrams
II.3 Alpha Shapes
II.4 Alpha Shape Software
Exercises
Union of disks. Let $B$ be a finite set of disks in the Euclidean plane, which we denote as $\mathbb{R}^2$. We specify each disk $b_i$ by its center $z_i$ and its radius $r_i$. An example is shown in Figure II.1. The union

and angles in the Delaunay triangulation, which will be explained in Section II.2.

Rolling circle. We can make the boundary of the disk union smoother by substituting blending curves for the vertices where the circular arcs meet. To this end we roll a circle of radius $\varrho$ on the outside about the boundary. At any moment during the motion, the circle touches the boundary but never intersects the interior. The center

we denote as $\mathbb{R}^3$. Similar to the two-dimensional case, we specify each ball $b_i$ by its center $z_i$ and its radius $r_i$. Figure II.3 shows the union of balls that represent gramicidin, which is a small protein of barely more than 300 atoms. To understand the structure of the boundary of the union, $\partial \bigcup B$, we study the portion contributed by a single sphere. The sphere bounding $b_i$ intersects the other balls in a finite collection of caps. The interior of each cap lies in the interior of the union, and the portion of the sphere not covered by any cap is the
II.1 Space-filling Diagrams 19
union of balls in $\mathbb{R}^3$ can be quite a bit higher than the same numbers for a union of disks in $\mathbb{R}^2$. To count the faces, arcs and vertices, we first note that a single sphere intersects the other balls in fewer than $n$ caps. By analogy to disks in the plane, the number of arcs in the boundary of the union of caps is less than $6n$. Since each arc has at most two endpoints (if it is a full circle then it has no endpoints) and each endpoint belongs to two arcs, we also have no more than $6n$ vertices. To count the faces contributed by our sphere, we recall that these are the connected components of the complement of the union of caps. We will see that these components are related to the triangles of the Delaunay triangulation, which implies that there are fewer than $2n$ faces on this one sphere. To get bounds on the total number of faces, arcs and vertices, we multiply by $n$ and note that each arc belongs to at least two and each vertex belongs to at least three spheres. We conclude that there are fewer than $2n^2$ faces, fewer than $3n^2$ arcs, and fewer than $2n^2$ vertices. It can be shown that for each value of $n$, there are configurations of $n$ balls with at least some constant times $n^2$ faces, edges and vertices. This shows that the upper bounds are asymptotically

spheres, which are common for proteins, are much smaller and typically only a constant times $n$.

Rolling sphere. We can again get a smoother boundary by rolling a sphere of radius $\varrho$ about $\bigcup B$. The center of that sphere moves along the boundary of the union of grown balls, $\bigcup B_\varrho$, and its front sweeps out blending surfaces that cover cusps and crevices of the original boundary. Figure II.4 shows such a rounded surface

Figure II.4: A molecular surface representation of the gramicidin protein.

In the application of space-filling diagrams to biology, the radii of the balls are usually the van der Waals radii of the atoms, and the boundary of $\bigcup B$ is referred to as the van der Waals surface. The radius $\varrho$ is chosen so that the rolling sphere approximates a water molecule, and the boundary of $\bigcup B_\varrho$ is referred to as the solvent accessible
surface. The rounded surface is usually referred to as the molecular surface.

distance of a point $x$ from $b_i$ equal to the Euclidean distance minus the weight: $\|x - z_i\| - r_i$. The cell of $b_i$ is the set of points at least as close to $b_i$ as to any other weighted point,

$$ \{ x \in \mathbb{R}^3 \mid \|x - z_i\| - r_i \le \|x - z_j\| - r_j \text{ for all } j \}. $$

Figure II.5 illustrates the definition in two dimensions. Consider the case of two weighted points, $b_i$ and $b_j$, and the set of points $x$ with $\|x - z_i\| - r_i \le \|x - z_j\| - r_j$. If one

is empty. Otherwise, we have two non-empty cells separated by a two-dimensional membrane. The points of this membrane satisfy

$$ \|x - z_i\| - r_i = \|x - z_j\| - r_j. $$

Figure II.5: Two-dimensional Voronoi diagram generated by uniformly growing the disks.

star-shaped and that lies in its kernel. Since is the common intersection of the , , this implies that is also star-shaped and that lies also in its

balls, the boundary of the union sweeps out the Voronoi diagram, and we get a structural re-arrangement whenever we sweep over a vertex of the Voronoi diagram.

Bibliographic notes. Space-filling diagrams have a long tradition in biochemistry and are similar to the CPK mechanical models named after Corey, Pauling and Koltun [5, chapter 1]. The variations of these models discussed in this section were introduced by Lee and Richards [6, 7]. The molecular surface is sometimes referred to as the Connolly surface, named after Michael Connolly, who wrote early software constructing this surface [3]. The solvent accessible surface in Figure II.3 and the molecular surface in Figure II.4 are computed using the software described in [1].

Increasing all radii of a set of circles or spheres continuously and at the same rate is referred to as the Johnson-Mehl model of growth [4]. It leads to the Voronoi diagram of this section, which is sometimes referred to as the additively weighted Voronoi diagram. We refer to Aurenhammer [2] for a survey of Voronoi diagrams, their algorithms and applications. An algorithm that computes cells of the additively weighted Voronoi diagram in $\mathbb{R}^3$ has been developed and implemented by Will [8].

[1] N. Akkiraju, H. Edelsbrunner, P. Fu and J. Qian. Viewing geometric protein structures from inside a CAVE. IEEE Comput. Graphics Appl. 16 (1996), 58–61.
[2] F. Aurenhammer. Voronoi diagrams - a study of a fundamental geometric data structure. ACM Comput. Surveys
Growing square radii. As in Section II.1, we let $B$ be a finite set of balls $b_i = (z_i, r_i)$. The square of the radius, $r_i^2$, is the weight of $b_i$. We grow each ball to radius $\sqrt{r_i^2 + t}$ at time $t$. The set of balls at time $t$ is denoted as $B_t$. The Taylor series expansion of the radius as a function of time is

$$ \sqrt{r_i^2 + t} = r_i + \frac{t}{2 r_i} - \frac{t^2}{8 r_i^3} + \ldots $$

We are interested in the surface swept out by the intersection of the spheres bounding $b_i$ and $b_j$ and claim it is a plane. The points $x$ that belong to both spheres at time $t$ satisfy

$$ \|x - z_i\|^2 - r_i^2 - t = 0, \qquad \|x - z_j\|^2 - r_j^2 - t = 0. $$

Varying $t$ has the same effect as dropping the requirement that the two expressions vanish. Instead we just require that they both be equal, so we get

$$ \|x - z_i\|^2 - r_i^2 = \|x - z_j\|^2 - r_j^2. $$

Expanding the squares, the quadratic term $\|x\|^2$ cancels and the equation becomes linear in $x$,

$$ 2 \langle x, z_j - z_i \rangle = \|z_j\|^2 - \|z_i\|^2 - r_j^2 + r_i^2. $$

We see that the circle at which the two spheres intersect sweeps out a plane. It follows that the membranes swept out by the arcs of the boundary are pieces of planes.

Power distance. We can describe the decomposition of space implied by the square radius growth model as a Voronoi diagram for yet another weighted distance function. The appropriate function in this case is the power distance of a point $x$ from a ball $b_i$ defined as the square distance from the center minus the weight,

$$ \pi_i(x) = \|x - z_i\|^2 - r_i^2. $$

tance from two balls form a plane. The two planes are , may separate the two bounding spheres, intersect both, or lie on the same side of both. Think of the three configurations as snap-shots in an animation in which the center of the small circle moves towards the center of the large circle. At first, the line moves in the same direction but then comes to a halt and reverses its direction moving away from the center of the large circle.

Power diagram. The power or (weighted) Voronoi cell of a ball $b_i$ under the power distance is the set of points at least as close to $b_i$ as to any other ball,

$$ V_i = \{ x \in \mathbb{R}^3 \mid \pi_i(x) \le \pi_j(x) \text{ for all } j \}. $$

If we denote by $H_{ij}$ the set of points whose power distance from $b_i$ is at most as large as the power distance from $b_j$ then $V_i = \bigcap_j H_{ij}$. In words, $V_i$ is the intersection of a finite number of half-spaces and thus a convex polyhedron. This polyhedron may be bounded or unbounded, and it is even possible that it is empty. The power or (weighted) Voronoi diagram of $B$ is the collection of cells $V_i$ together with the polygons, edges, and vertices shared by the cells. Every polygon is shared by two cells, and in the generic case every edge is shared by exactly three and every vertex is shared by exactly four cells. Figure II.7 illustrates the definitions in two dimensions by showing the Voronoi diagram of the same eight disks used in earlier figures.
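The power distance and the assignment of a point to its power cell can be illustrated with a minimal Python sketch. The function names `power_distance` and `power_cell_index`, the representation of a ball as a `(center, radius)` pair, and the two example balls are assumptions made for this illustration; they are not taken from the Alpha Shape software.

```python
def power_distance(x, center, radius):
    """Square distance from x to the center minus the weight (square radius)."""
    return sum((a - b) ** 2 for a, b in zip(x, center)) - radius ** 2


def power_cell_index(x, balls):
    """Index of the ball with the smallest power distance from x.

    Ties (points on a separating plane) go to the lowest index.
    """
    return min(range(len(balls)), key=lambda i: power_distance(x, *balls[i]))


# Two hypothetical balls, each given as (center, radius).
balls = [((0, 0, 0), 2), ((3, 0, 0), 1)]

# A point inside the first ball has negative power distance from it.
print(power_distance((1, 0, 0), *balls[0]))
# The point (2, 0, 0) has equal power distance from both balls,
# so it lies on the plane separating the two cells.
print(power_distance((2, 0, 0), *balls[0]), power_distance((2, 0, 0), *balls[1]))
print(power_cell_index((5, 2, 0), balls))
```

A brute-force scan over all balls is used here only for clarity; it says nothing about how a power diagram would be computed efficiently.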
II.2 Power Diagrams 23
Figure II.7: Power or weighted Voronoi diagram of eight disks in the plane.

Delaunay triangulation. The (weighted) Delaunay triangulation of $B$ is dual to the (weighted) Voronoi diagram. It is obtained by connecting $z_i$ and $z_j$ by an edge if the cells $V_i$ and $V_j$ share a common polygon. Similarly, $z_i$, $z_j$ and $z_k$ are connected by a triangle if $V_i$, $V_j$ and $V_k$ share a common edge, and $z_i$, $z_j$, $z_k$ and $z_\ell$ are connected by a tetrahedron if $V_i$, $V_j$, $V_k$ and $V_\ell$ share a common vertex. Assuming the balls in $B$ are in general position, this exhausts all possible types of overlap among the Voronoi cells. Since complexes of tetrahedra are difficult to draw, we illustrate the definitions by showing a two-dimensional Delaunay triangulation in Figure II.8. If the balls are not in general position, we can perturb them ever so slightly to move them into general position.

Observe that we reverse dimensions when we go from the Voronoi diagram to the Delaunay triangulation: cells become vertices, polygons become edges, edges become

nating sum of simplices is always equal to 1. Writing $v$, $e$, $f$ and $t$ for the numbers of vertices, edges, triangles and tetrahedra, we have

$$ v - e + f - t = 1. $$

Before counting the simplices in three dimensions, let us warm up to the challenge by counting the simplices of a two-dimensional Delaunay triangulation. The Euler relation here is $v - e + f = 1$. Observe that every triangle has three edges and every edge belongs to at most two triangles, hence $3f \le 2e$. Combining this inequality with the Euler relation implies $e \le 3v - 3$ and $f \le 2v - 2$. The number of vertices is at most the number of disks, hence $v \le n$, $e \le 3n - 3$, and $f \le 2n - 2$.

In three dimensions, we note that each tetrahedron has four triangles and each triangle belongs to at most two tetrahedra, hence $2t \le f$. Combining this with the Euler relation implies $f \le 2(e - v + 1)$ and $t \le e - v + 1$. The number of vertices is at most the number of balls, $v \le n$, and the number of edges is at most the number of pairs of vertices, $e \le \binom{n}{2}$. Hence the number of edges in the Delaunay triangulation is at most some constant times $n^2$, and as a consequence, also the numbers of triangles and tetrahedra are at most some constant times $n^2$.
all have zero radius. Then each Voronoi vertex is equally far from four points and coincides with the center of the circumsphere of these points. We will use the concept of orthogonality to generalize this property to the case where the $b_i$ have not necessarily zero and not necessarily equal radii. Two spheres or balls $b_i = (z_i, r_i)$ and $b_j = (z_j, r_j)$ are orthogonal if

$$ \|z_i - z_j\|^2 = r_i^2 + r_j^2. $$

The name is justified because the two tangent planes defined at any point common to the bounding spheres of $b_i$ and $b_j$ form a right angle between them.

Let now $x$ be a vertex of the Voronoi diagram of $B$. Assuming the generic case, $x$ has equal power distance from four balls, $b_i$, $b_j$, $b_k$ and $b_\ell$, and larger power distance from all others. Consider the sphere with center $x$ whose weight is this common power distance. Algebraically, there is no difficulty at all if the weight is negative and the radius is therefore imaginary. That sphere is orthogonal to $b_i$, $b_j$, $b_k$ and $b_\ell$, and we refer to it as the orthosphere of the four balls. If the four balls had zero radius, it would be their circumsphere. Note that the orthosphere is further than orthogonal from all other balls, that is, $\|x - z_m\|^2 - r_m^2$ exceeds the weight of the orthosphere for every other ball $b_m$. This property can be used to characterize Delaunay tetrahedra for a generic set of balls. Specifically, a tetrahedron connecting points $z_i$, $z_j$, $z_k$ and $z_\ell$ belongs to the Delaunay triangulation of $B$ iff the orthosphere of $b_i$, $b_j$, $b_k$ and $b_\ell$ is further than orthogonal from all other balls in $B$.

Acyclicity. Given a fixed viewpoint, we can order two tetrahedra if one lies in front of the other one, as seen from the viewpoint. We call this the visibility ordering with respect to the given viewpoint. It turns out that this relation can in general have cycles but is acyclic for Delaunay triangulations. We need some notation. Let $x \in \mathbb{R}^3$ be the viewpoint and write $\sigma \prec \tau$ if there is a half-line that emanates from $x$ and passes through the interior of the Delaunay tetrahedron $\sigma$ before it passes through the interior of the Delaunay tetrahedron $\tau$. We use orthospheres to prove that the relation is acyclic.

ACYCLICITY LEMMA. The visibility ordering of the Delaunay tetrahedra with respect to any fixed viewpoint is acyclic.

PROOF. Consider a half-line that emanates from $x$ and passes through the interiors of $\sigma$ and $\tau$. We may assume

lation. The half-line passes through a sequence of Delaunay tetrahedra, $\sigma_0, \sigma_1, \ldots, \sigma_k$, and we have $\sigma = \sigma_i$ and $\tau = \sigma_j$ for some $i < j$. Any two consecutive tetrahedra share a triangle. It follows that the orthospheres of two consecutive tetrahedra are orthogonal to the three balls whose centers span that triangle. The plane of points with equal power distance from the two orthospheres thus contains the shared triangle. The viewpoint is on the first tetrahedron's side of that plane, which implies that the power distance of $x$ from the orthosphere of the first tetrahedron is less than that from the orthosphere of the second. By transitivity, the power distance of $x$ from the orthosphere of $\sigma$ is less than its power distance from the orthosphere of $\tau$, whenever $\sigma \prec \tau$. In other words, the power distance increases along chains of the relation $\prec$. Since real numbers are totally ordered, we conclude that $\prec$ is acyclic.

Bibliographic notes. Power diagrams of discrete sets of weighted points were studied by Carl Friedrich Gauss more than 150 years ago in the context of quadratic forms [6]. In reference to subsequent work by Dirichlet [3] and Voronoi [8], these diagrams are often referred to as Dirichlet tessellations or Voronoi diagrams. The dual triangulations were introduced considerably later by Boris Delaunay (also Delone) [2]. It is common to reserve the name Delaunay triangulation for unweighted points and to refer to the duals of power diagrams as regular triangulations [1] or coherent triangulations [7]. We prefer to be economical with terms and refer to them as (weighted) Delaunay triangulations. Algorithms for constructing weighted Delaunay triangulations in $\mathbb{R}^2$ and $\mathbb{R}^3$ are discussed in [4, Chapters I and V]. That reference also explains how to computationally cope with ambiguities in the construction caused by non-generic input sets. Upper bounds on the number of Delaunay simplices for "well-spaced" points in $\mathbb{R}^3$ can be found in [5].

[1] L. J. Billera and B. Sturmfels. Fiber polytopes. Ann. Math. 135 (1992), 527–549.
[2] B. Delaunay. Sur la sphère vide. Izv. Akad. Nauk SSSR, Otdelenie Matematicheskii i Estestvennyka Nauk 7 (1934), 793–800.
[3] P. G. L. Dirichlet. Über die Reduktion der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. J. Reine Angew. Math. 40 (1850), 209–227.
[4] H. Edelsbrunner. Geometry and Topology for Mesh Generation. Cambridge Univ. Press, England, 2001.
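The orthogonality condition used throughout this section lends itself to a small numerical illustration. The Python sketch below evaluates the square distance between two centers minus the sum of the two square radii: the value is zero for orthogonal balls, positive if they are further than orthogonal, and negative if they are closer than orthogonal. The function name `orthogonality` and the `(center, radius)` representation are assumptions of this sketch.

```python
def orthogonality(ball_a, ball_b):
    """Square distance between the centers minus the sum of the square radii.

    Zero: orthogonal. Positive: further than orthogonal.
    Negative: closer than orthogonal.
    """
    (za, ra), (zb, rb) = ball_a, ball_b
    return sum((p - q) ** 2 for p, q in zip(za, zb)) - ra ** 2 - rb ** 2


# Hypothetical balls given as (center, radius).
a = ((0, 0, 0), 3)
b = ((5, 0, 0), 4)   # 25 = 9 + 16, so a and b are orthogonal
c = ((6, 0, 0), 4)   # further than orthogonal from a
d = ((4, 0, 0), 4)   # closer than orthogonal to a

print(orthogonality(a, b), orthogonality(a, c), orthogonality(a, d))
```

With integer input the computation is exact, which matters because only the sign of the result is used in the characterization of Delaunay tetrahedra.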
Dual complex. Observe that the Voronoi cells decompose the union of balls into convex cells $b_i \cap V_i$. Let $X$ be a subset of the index set. The dual complex records the non-empty common intersections among these cells,

$$ K = \{ \sigma_X \mid \bigcap_{i \in X} (b_i \cap V_i) \neq \emptyset \}, $$

where $\sigma_X$ is the convex hull of the centers of the balls with index in $X$. Equivalently, $\sigma_X \in K$ iff the common intersection of Voronoi cells has a non-empty intersection with the union of balls: $\bigcup B \cap \bigcap_{i \in X} V_i \neq \emptyset$. Note that this is just a more formal way of explaining the duality transformation we used in the last section to construct the Delaunay triangulation from the Voronoi diagram. The underlying space is the set of points contained in simplices of $K$. In this context, we refer to it as the dual shape of $B$. Figure II.9 illustrates the definition for the set of disks used in many of the previous figures.

looks like the ball-and-stick diagram common in chemistry and biology. There, each stick represents a covalent bond, while here, it represents the geometric overlap between two balls.

tion pattern. We first discuss this pattern for general sets that are not necessarily balls. Call a collection $C$ of sets independent if for every subcollection $A \subseteq C$ there is a point inside every set in $A$ and outside every set not in $A$:

$$ \bigcap A \setminus \bigcup (C \setminus A) \neq \emptyset. $$

A collection of size $k$ has $2^k$ subcollections. For this collection to be independent, there must be $2^k$ points whose patterns of inclusion in the $k$ sets are pairwise different. We use the pigeonhole principle to show that the maximum number of independent disks in the plane is three. Let $r(k)$ be the maximum number of regions we can get by drawing $k$ circles in the plane. We have $r(1) = 2$ and $r(k+1) \le r(k) + 2k$ because the $(k+1)$-st circle intersects the other $k$ circles in at most two points each. These points cut the $(k+1)$-st circle into at most $2k$ arcs, and each arc cuts at most one region into two. The number of regions is therefore

$$ r(k) \le 2 + 2 \sum_{i=1}^{k-1} i = k^2 - k + 2. $$

Hence, $r(4) \le 14 < 16 = 2^4$, which implies that at most three disks can be independent. For each $k \le 3$ there is a (combinatorially) unique independent configuration shown in Figure II.10. The same argument also works in three dimensions, where it can be used to show that the maximum number of independent balls is four. Again, there is only one possible intersection pattern for four independent balls.
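The pigeonhole argument for independent disks can be replayed numerically. The Python sketch below iterates the recurrence for the maximum number of regions cut out by circles in the plane (two regions for one circle, and at most two new regions per earlier circle for each additional circle) and compares the result with the number of subcollections. The function name `max_regions` is an assumption of this sketch.

```python
def max_regions(k):
    """Upper bound on the number of regions formed by k circles in the plane:
    one circle gives 2 regions, and the (j+1)-st circle adds at most 2*j."""
    regions = 2
    for j in range(1, k):
        regions += 2 * j
    return regions


for k in range(1, 6):
    regions = max_regions(k)
    patterns = 2 ** k            # number of subcollections of k sets
    print(k, regions, patterns, regions >= patterns)
```

The comparison succeeds for one, two and three circles and fails from four circles on, which is the content of the counting argument; the sketch does not by itself show that three independent disks exist.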
II.3 Alpha Shapes 27
Independent simplices. Recall that each simplex in the Delaunay triangulation is spanned by the centers of a small collection of balls, four for a tetrahedron, three for a triangle, and so on. In discussions of combinatorial properties, we sometimes forget the difference and think of the simplex as this collection of balls. In this spirit, we call the simplex independent if the collection of balls is independent. We will prove shortly that all simplices in the dual complex are independent. This is a fairly strong statement since it limits the balls to a single intersection pattern. The following lemma is the key to proving that all simplices in the dual complex are independent. The lemma holds in any dimension, and can be proved by induction over the dimension. To avoid the complications of a discussion for general dimensions, we assume the lemma for disks (or rather for caps on a sphere) and prove it for balls in $\mathbb{R}^3$.

INDEPENDENCE LEMMA. A collection $B$ of four balls in $\mathbb{R}^3$ is independent iff the (unique) vertex $x$ of the corresponding Voronoi diagram is contained in the union: $x \in \bigcup B$.

PROOF. Assume first that $x \notin \bigcup B$, for example $x \notin b_i$. The sphere bounding $b_i$ intersects the other balls in three caps. The circles bounding these caps lie in the three planes bounding the Voronoi cell of $b_i$, and because $x$ lies outside $b_i$, the three caps are not independent. A

lies outside the sphere, the three caps are not independent.

any point on the sphere, that is, . It can still be that there is a point outside contained in , but then . In other words, $B$ is not independent.

To prove the reverse, we assume that $B$ is not independent. Then $b_i$ intersects the other three balls in three non-independent caps. But this implies that the Voronoi vertex lies outside the sphere: $x \notin b_i$.

As mentioned above, the Independence Lemma also holds for three disks in the plane. Given three balls, we get three disks of maximum size by intersecting them with the plane that passes through the centers. This plane intersects the Voronoi diagram of the balls in the Voronoi diagram of the disks. But this implies that three balls are independent iff the (unique) line in the corresponding Voronoi diagram has a non-empty intersection with the union of the three balls. Similarly, two balls are independent iff the (unique) plane in the corresponding Voronoi diagram has a non-empty intersection with their union. But this is exactly the criterion for a simplex to belong to the dual complex. It follows that each simplex in $K$ is independent, as claimed.

Filtration. We return to the idea of growing the balls continuously and watch how the union changes. We let time go from $-\infty$ to $+\infty$ and grow the weight of each ball $b_i$ to $r_i^2 + t$ at time $t$. Each $b_i$ has zero weight at time $t = -r_i^2$ and negative weight and therefore imaginary radius before that time. By construction, the Voronoi cells of the balls are unchanged at all times. It follows that the dual complexes that arise throughout time are subcomplexes of one and the same Delaunay triangulation. Furthermore,

finitely many simplices and therefore only finitely many subcomplexes of $D$ that arise as dual complexes during the growth process. We refer to this sequence as a filtration of the Delaunay triangulation,

$$ \emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = D. $$

Figure II.12 illustrates the construction by showing three complexes in the filtration generated by eight disks in the plane. To translate between continuous time and discrete
Figure II.12: Three unions of disks and the corresponding dual complexes. The first complex contains all vertices but only two edges and no triangles. From the first to the third complex, the edges become thinner and the triangles become lighter.

rank, we define a function such that if

Ordering simplices. We can sort the Delaunay simplices in the order in which they enter the dual complex. Define the birth-time of a simplex $\sigma$ as the minimum time $t$ such that $\sigma$ belongs to the dual complex at all times $s \ge t$. The difference between two contiguous complexes in the filtration consists of all simplices whose birth-time coincides with the creation of the second complex.

Often two contiguous complexes $K_j$ and $K_{j+1}$ differ by only one simplex, $\sigma$. In this case, the birth-time of $\sigma$ coincides with the time it becomes independent. Let the orthosphere of $\sigma$ be the smallest sphere orthogonal to all balls whose centers are vertices of $\sigma$. The time $\sigma$ becomes independent is also the time the orthosphere of $\sigma$ dies or shrinks to a point. Geometrically, this case is characterized by a non-empty common intersection between the affine hull of $\sigma$ and the Voronoi cells of its vertices. Sometimes, however, the difference between $K_j$ and $K_{j+1}$ consists of two or more simplices. In the generic case, all these simplices are faces of a single simplex, $\tau$, that also belongs to the difference. All these simplices are born at the same time. In the absence of any degeneracy, their orthospheres die at different times, with the orthosphere of $\tau$ dying last, at the common birth-time. Figure II.13 illustrates this case. The triangle connecting all three centers and the edge connecting the centers of the two larger disks are born at the same time, namely when all three disks reach

Figure II.13: The two larger disks are independent, but the dual edge does not belong to the dual complex because their common intersection is disjoint from the corresponding Voronoi edge.

We represent the filtration by sorting the Delaunay simplices by birth-time, and in case of a tie by dimension. Remaining ties are broken arbitrarily. Every dual complex is a prefix of this ordering, and because of the tie breaking rule, every prefix is a complex, even if it does not coincide with a dual complex. This property of the ordering will be crucial for the algorithm in Chapter IV that computes the connectivity of the $K_j$.

Bibliographic notes. Alpha shapes and alpha complexes were introduced by Edelsbrunner, Kirkpatrick and Seidel [3] in 1983 for finite sets of points in the plane. About a decade later, the concept was generalized to three dimensions and made available as a software package with graphical user interface [4]. The unexpected popularity of that software in structural biology triggered the development of further geometric concepts useful in structural biology, some of which are explained in this book. The main reason for the popularity is the duality between space-filling diagrams and alpha shapes as explained in this and the two preceding sections. To fully develop that duality, alpha shapes had to be extended to take into account weights, and this has been described in complete generality in [2]. That generalization benefitted from adopting the language of simplicial complexes, which had been developed decades earlier in the area of combinatorial topology [1, 5].

[1] P. S. Alexandrov. Combinatorial Topology. Dover, New York, 1998 (republication of translation of the original Russian edition from 1947).
[2] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
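The representation of the filtration described in this section, sorting simplices by birth-time and breaking ties by dimension, can be sketched in a few lines of Python. The data layout (a simplex as a tuple of vertex indices paired with a birth-time), the function names, and the example birth-times are all hypothetical and chosen only to exhibit the tie between a triangle and one of its edges.

```python
from itertools import combinations


def filtration_order(simplices):
    """Sort (vertices, birth_time) pairs by birth-time, ties by dimension."""
    return sorted(simplices, key=lambda s: (s[1], len(s[0])))


def every_prefix_is_complex(ordered):
    """Check that each simplex appears after all of its facets, which by
    induction implies that every prefix is closed under taking faces."""
    seen = set()
    for vertices, _ in ordered:
        facets = combinations(vertices, len(vertices) - 1)
        if len(vertices) > 1 and any(frozenset(f) not in seen for f in facets):
            return False
        seen.add(frozenset(vertices))
    return True


# Hypothetical birth-times; the triangle and the edge (0, 1) are born together.
simplices = [
    ((0, 1, 2), 2.0), ((0, 1), 2.0),
    ((0,), -1.0), ((1,), -1.0), ((2,), -0.5),
    ((1, 2), 0.5), ((0, 2), 1.0),
]

ordered = filtration_order(simplices)
print([s[0] for s in ordered])
print(every_prefix_is_complex(ordered))

# Sorting by birth-time alone can place the triangle before its edge.
by_time_only = sorted(simplices, key=lambda s: s[1])
print(every_prefix_is_complex(by_time_only))
```

The second check fails only because Python's sort is stable and the example lists the triangle before the edge; it illustrates why the tie-breaking rule by dimension is needed.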
II.4 Alpha Shape Software

This section introduces the basic Alpha Shape software and explains how to go from a standard description of protein structures to the visualization of their alpha shapes. The discussion is more descriptive and less analytical than in the previous three sections. Given a pdb-file, name.pdb, we take four steps to construct and visualize alpha shapes in an interactive graphical user interface:

> pdb2alf name.pdb name
> delcx name
> mkalf name
> alvis name

The details of the discussion apply to Version 4.1 of the Alpha Shape software executed on an SGI workstation running under the UNIX operating system and may differ for other versions and platforms.

Data format. The main public source for structural protein data is the Protein Data Bank (pdb) mentioned in Section I.3. Only a fraction of the information is needed to construct alpha shapes. Specifically, for each atom we only need its coordinates in three-dimensional space and its radius. The coordinates are explicitly given in the file, but the radius must be inferred from the atom type. This is done according to published translation tables that map atoms to van der Waals radii. Unfortunately, there is no universally agreed upon table. Some differences are due to different methods used to derive radii, including measurements of closest approach, molecular mechanics calculations, etc. One of the most problematic elements is hydrogen (H), which accounts for almost 50% of the number of atoms found in organic matter. Hydrogen atoms sometimes donate their electrons to complete the shells of other atoms and thus can exist without any shell and radius to speak of. Hydrogen atoms are generally not represented in pdb-files, but can be inferred to some accuracy from the types and relative positions of the other atoms in the protein. In the common unified atom model, the van der Waals radii of larger atoms are adjusted to include the bonded hydrogen atoms.

We can extract the coordinates and the radii using software that is part of the Alpha Shapes distribution. Specifically, we call

> pdb2alf -r 1.4 name.pdb name

to read name.pdb and create a new file name that contains a line for each atom listing its three coordinates and the van der Waals radius. The -r option allows for the specification of a radius increment that is applied to every atom in the file. In our example, this radius increment is 1.4 Å, which is the most common approximation used for the size of water molecules. The resulting set of balls thus defines the solvent accessible diagram representing the interaction with the surrounding water; see Section II.1.

Delaunay triangulation. The first step towards computing alpha shapes is to construct the Delaunay triangulation of the set of balls. This is accomplished by the command

> delcx name

The Delaunay complex program, delcx, creates a file name.dt that represents the Delaunay triangulation. The efficient and robust construction of the Delaunay triangulation in $\mathbb{R}^3$ is not entirely straightforward. We briefly mention the algorithmic ingredients used. The basic strategy is incremental, adding one ball at a time to the triangulation. Using an arbitrary ordering of the balls, we write $B_i$ for the set of the first $i$ balls and $D_i$ for the Delaunay triangulation of $B_i$, for $0 \le i \le n$. With this notation, the algorithm can be written as follows.

    $D_0 = \emptyset$;
    for $i = 1$ to $n$ do
        $D_i = \mathrm{INSERT}(b_i, D_{i-1})$
    endfor.

The $i$-th ball is inserted through a sequence of flip operations. The flips are performed depending on the outcomes of only two types of primitive tests needed in the construction of the Delaunay triangulation:

ORTHOGONALITY: decide whether a ball is closer or further than orthogonal to the orthosphere of four other balls.
ORIENTATION: decide whether a ball center is on the positive or negative side of the oriented plane spanned by three other ball centers.

Both tests reduce to the sign of the determinant of a small matrix and can be decided without computing intermediate geometric information. The operations are ambiguous if the balls are in non-generic position, and so is the Delaunay triangulation. To cope with the related robustness problem, we use exact arithmetic and simulated perturbation. Exact arithmetic guarantees the correct execution of flips in all generic and therefore unambiguous cases, and
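As an illustration of the remark that the primitive tests reduce to the sign of a small determinant, the following Python sketch implements an orientation test for four points with exact rational arithmetic from the standard library. It is not the code of the Alpha Shape software: the function name `orientation`, the sign convention for the positive side, and the use of `fractions.Fraction` are assumptions of this sketch, and the simulated perturbation for non-generic input is not modeled (a result of 0 simply reports a degenerate configuration).

```python
from fractions import Fraction


def orientation(a, b, c, d):
    """Sign (+1, -1 or 0) of the determinant with rows b-a, c-a, d-a.

    The sign tells on which side of the oriented plane through a, b, c
    the point d lies; 0 means the four points are coplanar.
    """
    rows = [[Fraction(p[k]) - Fraction(a[k]) for k in range(3)] for p in (b, c, d)]
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = rows
    det = (x1 * (y2 * z3 - z2 * y3)
           - y1 * (x2 * z3 - z2 * x3)
           + z1 * (x2 * y3 - y2 * x3))
    return (det > 0) - (det < 0)


a, b, c = (0, 0, 0), (1, 0, 0), (0, 1, 0)
print(orientation(a, b, c, (0, 0, 1)))    # one side of the plane z = 0
print(orientation(a, b, c, (0, 0, -1)))   # the other side
print(orientation(a, b, c, (1, 1, 0)))    # coplanar, degenerate case
```

Because every coordinate is converted to an exact rational number, the sign is computed without rounding error, at the price of slower arithmetic than with floating point numbers.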
exact arithmetic if the error is too large to guarantee a correct decision.

Another challenge to the efficiency of the code is the inherent size of the Delaunay triangulation. As mentioned in Section II.2, the Delaunay triangulation in $\mathbb{R}^3$ can have a number of simplices that is quadratic in $n$. For example, if the centers of the balls lie on the moment curve $(t, t^2, t^3)$ and all radii are equal, then every pair of vertices forms an edge in the Delaunay triangulation, as shown in Figure II.14. Fortunately, the balls of organic molecules are usually well packed and have Delaunay triangulations of size at most proportional to $n$. The danger remains that one of the intermediate triangulations is large. Then we

Figure II.14: Edge-skeleton of the Delaunay triangulation of twenty one points on the moment curve in $\mathbb{R}^3$.

The software refers to the sorted sequence of simplices as the 'masterlist'. It stores each simplex $\sigma$ several times, marking when $\sigma$ is born, when $\sigma$ becomes a face of another simplex, and when $\sigma$ becomes interior to the alpha complex. Suppose the three events happen at times $\alpha_1 \le \alpha_2 \le \alpha_3$. Then

$$ \sigma \text{ is } \begin{cases} \text{not in } K_\alpha & \text{if } \alpha < \alpha_1, \\ \text{singular} & \text{if } \alpha_1 \le \alpha < \alpha_2, \\ \text{regular} & \text{if } \alpha_2 \le \alpha < \alpha_3, \\ \text{interior} & \text{if } \alpha_3 \le \alpha. \end{cases} $$

The combinatorial topology term for being singular is principal and means that $\sigma$ is not a face of any other simplex. The simplex is regular if it belongs to the boundary but is not principal, and it is interior if it is completely surrounded by other simplices. Some of the three events may coincide. For example, a tetrahedron is interior as soon as it is born, so $\alpha_1 = \alpha_2 = \alpha_3$. A simplex in the boundary of the Delaunay triangulation can never become interior, so $\alpha_3 = \infty$. Finally, a simplex whose orthosphere dies strictly before the simplex is born is never singular, so $\alpha_1 = \alpha_2$. The main reason for recording all this information is to determine how to draw $K_\alpha$ in the graphical interface, but there are others. Figure II.15 shows four alpha complexes of the relatively small gramicidin protein. In each case, we only show the singular simplices together with the regular triangles. Given a value of $\alpha$, we need quick access to the simplices of the various types in $K_\alpha$. For this purpose, we
, ,
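The classification of a simplex by its three event times can be sketched as a small lookup. This is a minimal illustration, not the masterlist data structure of the software: the function name `simplex_type`, the parameter names, the label `"absent"`, and the choice of half-open intervals (closed on the left) are assumptions made for this example.

```python
import math


def simplex_type(t, t_born, t_face, t_interior=math.inf):
    """Classify a simplex at time t from its three event times.

    t_born     : time at which the simplex enters the alpha complex
    t_face     : time at which it becomes a face of another simplex
    t_interior : time at which it becomes interior (infinity if never)
    The names and the left-closed intervals are assumptions of this sketch.
    """
    if not (t_born <= t_face <= t_interior):
        raise ValueError("event times must satisfy t_born <= t_face <= t_interior")
    if t < t_born:
        return "absent"
    if t < t_face:
        return "singular"
    if t < t_interior:
        return "regular"
    return "interior"
```

Coinciding events need no special handling: a tetrahedron with all three times equal is reported as interior from its birth, and a simplex with `t_born == t_face` is never reported as singular. The software retrieves all simplices of a given type with an interval tree, which this sketch does not attempt to reproduce.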
All signatures that count rather than measure are displayed in log-scale. Instead of mapping the time to a property of the alpha complex at that time, the signatures map the index to the property of the corresponding alpha complex. To facilitate the reconstruction of the map from time, the panel contains a signature that maps the index to time. Specifically, it shows the log-scale graph of that map. A particular index is selected by the position of a vertical bar in the signature panel and by clicking the Alpha Shape button in the scene panel, as shown in Figure II.17. The buttons in the middle of the scene panel provide control over how simplices are drawn: colored, shaded, in wireframe, seamless, or with gaps created through a slow explosion. The matrix on the right hand side can be used to select the types of displayed simplices. By default, only the singular vertices, edges, triangles and the regular triangles are shown. Different settings can be used to highlight different aspects of an alpha complex. For example, the

Bibliographic notes. The Alpha Shape software was created by Ernst Mücke as part of his doctoral work at Urbana-Champaign. The best documentation of the algorithm and data structures used in the software is still his thesis [6] and the original paper on the topic [4]. After a period of rapid development directed by Ping Fu at the National Center for Supercomputing Applications, the software reached version 4.1 in 1996, which is still the most recent version distributed on the web [7]. The Delaunay triangulation software in the Alpha Shapes distribution is based on a variety of algorithmic techniques described in a recent text by Edelsbrunner [3]. The interval tree used for fast retrieval of simplices is explained in [2].

As mentioned earlier, the largest resource for structural protein data is the Protein Data Bank [1], which can be accessed via the web [8]. A survey of geometric measure-
Exercises

1. Tree-like sequences. Given an alphabet of $n$ letters, form a sequence but refrain from placing any letter twice in a row. The sequence is tree-like if there are no two letters that alternate more than twice. In other words, subsequences of the form $\ldots a \ldots b \ldots a \ldots b \ldots$ and $\ldots b \ldots a \ldots b \ldots a \ldots$ are prohibited. Examples of tree-like sequences of four letters are and .
   (i) Prove that a tree-like sequence over an alphabet of $n$ letters has length at most . Is this bound tight?
   (ii) Define a tree-like cyclic sequence by prohibiting cyclic subsequences of the form $\ldots a \ldots b \ldots a \ldots b \ldots$. Prove that a tree-like cyclic sequence over an alphabet of $n$ letters has length at most . Is this bound tight?

the plane. The boundary of the union of the disks consists of circular arcs contributed by the circles.
   (i) Assuming the boundary of the union is a single closed curve, use tree-like cyclic sequences to prove that it consists of at most (maximal) circular arcs. Is this bound tight?
   (ii) Prove that in general the number of (maximal) circular arcs in the boundary of the union is at most . Is this bound tight?

3. Empty Voronoi cell. Call a disk in a finite collection of disks redundant if its Voronoi cell is empty.
   (i) Prove that if there are disks , and in the collection such that
       (a) for the orthocenter of , and
       (b) lies in the triangle
       then is redundant.
   (ii) Prove that the sufficient conditions given in (i) are also necessary. In other words, prove that if is redundant then there exist disks , and that satisfy Conditions (a) and (b).

the number of ways we can choose elements from a collection of elements. Recall also that
   (i) Show that .
   (ii) Show that .
   [We note that the relation in (ii) neatly generalizes the formula . The generalization is not quite as neat if we sum powers rather than binomial coefficients.]

5. Sphere arrangements. Let be the maximum number of cells we get by drawing spheres in .
   (i) Show that unless .
   (ii) Give a formula for that works for all positive .
   [You might consider answering question (ii) before question (i).]

6. Independent half-spaces. A half-plane is the set of points on or on one side of a line in $\mathbb{R}^2$. Similarly, a half-space is the set of points on or on one side of a plane in $\mathbb{R}^3$, and a cap is the intersection of a sphere with a half-space. What is the maximum number of independent
   (i) half-planes in $\mathbb{R}^2$,
   (ii) half-spaces in $\mathbb{R}^3$,
   (iii) caps on a sphere in $\mathbb{R}^3$?

7. The filtration of water. A water molecule consists of one oxygen and two hydrogens: H$_2$O.
   (i) Look up the standard geometric model (determined by radii, bond length and bond angle).
   (ii) Describe the Voronoi diagram and the sequence of alpha complexes of the model.

8. Barycentric subdivision. The barycentric subdivision of a simplex is obtained by adding the barycenter of the simplex (also known as the centroid or center of mass) as a new vertex and connecting it to the simplices in the barycentric subdivisions of the faces.
   (i) How many vertices, edges, triangles and tetrahedra are in the barycentric subdivision of a tetrahedron?
   [You will need to use weights to make the barycentric subdivision of the tetrahedron the Delaunay triangulation of the points.]
Chapter III
Surface Meshing
Recall the different types of space-filling diagrams we discussed in Chapter II. The van der Waals and the solvent accessible models are both unions of finitely many balls in three-dimensional space and differ only in the radii. We have also discussed the molecular surface model that is obtained by rolling a sphere about the van der Waals model. Corners and crevices are filled up and the surface consists of spheres connected by blending torus patches and inverted sphere patches.

In this chapter, we introduce a model that is similar to the molecular surface. Its surface consists of spheres connected by blending hyperboloid patches and inverted sphere patches. We call this the molecular skin model. The surface is piecewise quadratic and has a number of attractive properties not shared by the other space-filling models. One is the continuity of the normal direction, another the continuity of the maximum principal curvature. Both properties are crucial for the construction of good quality meshes, which may be used to support numerical computations over the surface. Another interesting property is an inside-outside symmetry that implies the existence of locally perfectly complementary molecular skin models. In other words, for each cavity we may construct a molecular skin representation whose boundary matches that of the molecule. The molecular skin also lends itself to representing deformations, and some of the possibilities along these lines will be discussed in Chapter VIII.

This chapter is organized in four sections. In Section III.1, we give the geometric definition of the molecular skin and show how it can be decomposed into quadratic patches. In Section III.2, we discuss various notions of curvature of a surface, and we show that the maximum principal curvature is a continuous map over the molecular skin. In Section III.3, we describe the algorithm that constructs a molecular skin in terms of a triangle mesh. Finally, in Section III.4, we present software for constructing molecular skin in two- and three-dimensional space, and we use that software to illustrate some of the properties of these curves and surfaces.

III.1 Molecular Skin
III.2 Curvature
III.3 Adaptive Meshing
III.4 Skin Software
Exercises
though the case of spheres in $\mathbb{R}^3$ is most relevant for the study of molecules, there is sufficient pedagogical advantage to first talk about circles in $\mathbb{R}^2$.

Circles and paraboloids. Recall that the weighted square distance function of a circle with center $z$ and radius $r$ is the map $\pi: \mathbb{R}^2 \to \mathbb{R}$ defined by $\pi(x) = \|x - z\|^2 - r^2$. As illustrated in Figure III.1, its graph is a paraboloid of revolution in $\mathbb{R}^3$ that intersects $\mathbb{R}^2$ in the circle. In other words, the circle is the zero-set of the weighted square distance function, $\pi^{-1}(0)$. All paraboloids that arise as weighted square distance functions have the form $\pi(x) = \|x\|^2 - 2\langle x, z \rangle + \|z\|^2 - r^2$. The three parameters correspond to the three degrees of freedom represented by the center and the radius.

of zero-sets of convex combinations,

section points. Indeed, if both weighted square distance functions vanish at a point then so does every combination of the two, for all coefficients. We call the resulting family a pencil of circles. If the two circles are disjoint then the affine hull is again a pencil but this time of pairwise disjoint circles, like the vertical family sketched in Figure III.2. We compute the center and radius
which is and thus vanishes as required.

Suppose we are now given two circles $a$ and $b$ and two more circles $c$ and $d$ both orthogonal to $a$ and $b$. Then every circle in the affine hull of $a$ and $b$ is orthogonal to both $c$ and $d$ and thus to every circle in the affine hull of $c$ and $d$. In other words, we have two pencils in which each circle in the first pencil is orthogonal to each circle in the second pencil. Such a configuration is illustrated in Figure III.2 and is referred to as a coaxal system.

Envelopes. The convex hull of two circles is an infinite family of circles, but the union of their disks is just the union of the two original disks. We introduce a shrinking operation that reduces small circles less than big ones and this way generates a smooth envelope. Specifically, we define the reduced version of a circle as the circle with the same center and with the radius scaled by $1/\sqrt{2}$, and we define the reduced version of a family of circles as the family of its reduced circles.

Figure III.3: The dotted circles belong to the affine hull and the solid circles are reduced.

We are interested in the envelope of a shrunken pencil. Suppose we have a pencil all of whose circles pass through two common points. We parametrize the pencil by one coordinate of the circle centers, which also determines the radius. With the same parametrization, each reduced circle is the zero-set of a function of the point for a fixed value of the parameter. The collection of all reduced circles is the projection of the entire zero-set of that function, taken over points and parameter values together. It can be visualized as a leaning hour-glass of circles, as in Figure III.4. The envelope of the reduced pencil is the projection of the silhouette of this zero-set as viewed along the direction of the parameter axis. It is the set of points for which the partial derivative with respect to the parameter vanishes. Eliminating the parameter shows that the envelope is the zero-set of a quadratic function, which is a hyperbola.

Figure III.4: Sections of the zero-set viewed from the positive direction of the parameter axis.

Skin and body. More general curves than just hyperbolas can be constructed by taking the convex hull of a finite collection of circles, then shrinking every circle in the family, and finally taking the envelope. Formally, the skin of the collection of circles is the envelope of the reduced convex hull. The body is the union of the disks bounded by the circles in that reduced family.

Figure III.5: The skin of two intersecting circles is the envelope of a reduced line segment of circles.

The skin of three circles is already more difficult to understand, at least directly. We thus take an indirect approach and first study what happens when orthogonal circles shrink.

Orthogonality and complementarity. Let $a$ and $b$ be two orthogonal circles with centers $z_a$, $z_b$ and radii $r_a$, $r_b$. We thus have

$$\|z_a - z_b\|^2 = r_a^2 + r_b^2 \geq \tfrac{1}{2}(r_a + r_b)^2.$$

Taking roots left and right implies that the radii of the two reduced circles add up to at most the distance between the two centers. Furthermore, we have equality iff $r_a = r_b$. In other words, the reduced versions of any two orthogonal circles
touch if they are of the same size and they are disjoint in all other cases.

We apply this result to the coaxal system consisting of orthogonal pencils $A$ and $B$. Suppose $A$ contains only circles with real radii, or equivalently, $A$ is the affine hull of two intersecting circles. As shown earlier, the envelope of the reduced pencil $A$ is a hyperbola. We claim that the envelope of the reduced pencil $B$ is the exact same hyperbola. To see this, we first note that a reduced circle of $B$ can at most touch the hyperbola, for if it crossed, we would have two crossing reduced circles contradicting the orthogonality of the two corresponding original circles. Furthermore, every reduced circle of $B$ for which there is an equally large reduced circle of $A$ touches the hyperbola because it touches that circle. The two envelopes are therefore the same hyperbola. As shown in Figure III.6, the two asymptotic lines of the hyperbola intersect at a right angle. The smallest separating circle that touches both branches belongs to the reduced pencil $A$ and has the same size as the two osculating circles that both belong to the reduced pencil $B$. These circles touch the hyperbola and have the same curvature as the hyperbola at that point.

The complementarity of the bodies extends from the case of two orthogonal pencils to the case in which $A$ consists of a single circle and $B$ contains all circles orthogonal to it. The set $B$ is a two-parameter family spanned by three circles. The skin of $A$ is trivially a circle, which implies that the skin of $B$ is the same circle.

Decomposition. The skin of any finite set of circles can be decomposed into simple pieces, each defined by at most three of the circles. A single circle defines a (smaller) circle, a pair of circles defines a hyperbola, and a triplet of circles defines an inverted circle. We thus claim that the skin consists of circles, connected to each other by blending hyperbola and inverted circle arcs. We will not prove this claim and instead give an explicit construction of the decomposition, which is facilitated by a complex assembled from Voronoi and Delaunay polyhedra.

As usual, we let $X$ be an index set and use it to denote the Voronoi polyhedron $\nu_X$. The corresponding Delaunay simplex is $\delta_X$. The corresponding mixed cell is the Minkowski sum of shrunken copies of both, $\mu_X = \frac{1}{2}(\nu_X + \delta_X)$. If $X$ has cardinality one then the mixed cell is the shrunken and translated copy of a two-dimensional Voronoi cell. If $X$ has cardinality two then $\mu_X$ is the Minkowski sum of two orthogonal edges and therefore a rectangle. If $X$ has cardinality three then $\mu_X$ is a shrunken and translated copy of a Delaunay triangle. The mixed complex consists of all mixed cells and their faces. Figure III.7 illustrates the construction by showing the mixed complex decomposing the skin into circle and hyperbola arcs. A

Figure III.7: The mixed complex and the skin of four circles.

then obtained by intersecting the pyramids and tetrahedra with the plane parallel to and halfway between the other

Symmetry. Note that the construction of the mixed complex is symmetric in the Voronoi diagram and the Delaunay triangulation. In other words, the mixed complex of the given collection of circles is the same as the mixed complex of the collection of circles introduced in Section V.1. [The order of the chapters on skin and pockets has changed now, which requires a local rewrite here and in Section III.4.] As explained there, that collection contains a circle centered at each
Voronoi vertex (including those at infinity) with the radius chosen so that the circle is orthogonal to the circles that define the vertex. The Voronoi diagram of the original collection is then the Delaunay triangulation of the new collection, the Delaunay triangulation of the original collection is the Voronoi diagram of the new collection, and the mixed complexes of the two collections are the same. We have seen that the skins of two orthogonal pencils are the same hyperbola. Similarly, the skins of one circle and the affine hull of three orthogonal circles are the same circle. Since the mixed complex decomposes the entire skin of the original collection into such cases, it follows that the skin of the original collection is the same as that of the new collection. Note however that the two bodies are not the same but rather complementary.

Figure III.8: The top, middle, and bottom planes carry the Delaunay triangulation, the mixed complex, and the Voronoi diagram.

identifies each circle in $\mathbb{R}^2$ with a point in $\mathbb{R}^3$. Under this interpretation, the convex hull of a set of circles corresponds to the usual convex hull of points in $\mathbb{R}^3$, and the symmetry between the two collections can be explained as a polarity between two convex polyhedra. This interpretation is prominently used in the geometry text by Pedoe [5]. It was discovered in the nineteenth century and published at more or less the same time in three different languages by Clifford [1], Darboux [2], and Frobenius [4].

The material of this section is taken from [3], where skin surfaces are introduced as orientable 2-manifolds in $\mathbb{R}^3$. That paper also proves that the body of a finite collection of spheres has the same homotopy type as the dual complex.
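Two claims of this section lend themselves to a quick numerical check: that the reduced versions of two orthogonal circles touch exactly when the circles have equal radii, and that the envelope of a reduced pencil is a hyperbola. The sketch below is illustrative only. The function names are hypothetical, reduction is taken to scale radii by $1/\sqrt{2}$, and the pencil is normalized, as an assumption of this example, to the circles through the points (0, 1) and (0, -1), for which the circle centered at (t, 0) has squared radius 1 + t² and the envelope works out to x2² - x1² = 1/2.

```python
import math

SQRT2 = math.sqrt(2.0)


def is_orthogonal(za, ra, zb, rb, tol=1e-9):
    """Two circles are orthogonal iff |za - zb|^2 = ra^2 + rb^2."""
    d2 = (za[0] - zb[0]) ** 2 + (za[1] - zb[1]) ** 2
    return abs(d2 - (ra * ra + rb * rb)) <= tol


def reduced_gap(za, ra, zb, rb):
    """Distance between the centers minus the sum of the reduced radii r / sqrt(2)."""
    return math.dist(za, zb) - (ra + rb) / SQRT2


def envelope_point(t):
    """Envelope point (upper branch) contributed by the reduced circle centered at (t, 0).

    Assumed normalization: the pencil of circles through (0, 1) and (0, -1).
    The reduced circle is the zero-set of (x1 - t)^2 + x2^2 - (1 + t^2) / 2,
    and setting its derivative with respect to t to zero gives x1 = t / 2.
    """
    x1 = t / 2.0
    x2 = math.sqrt(0.5 + t * t / 4.0)
    return x1, x2
```

For the orthogonal circles with centers (0, 0) and (5, 0) and radii 3 and 4, `reduced_gap` is positive, so the reduced circles are disjoint. For two unit circles at distance √2 it vanishes up to rounding, so they touch. Every point returned by `envelope_point` lies on its reduced circle and on the hyperbola x2² - x1² = 1/2, whose asymptotes meet at a right angle.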
III.2 Curvature
a small number of derivatives, and the assumption of the existence of infinitely many is convenient but not necessary. Note that a curve has a parametrization and the counter-clockwise orientation of the circle gives a sense

Figure III.9: A closed space curve to the left and its Gauss map to the right.

The normal vector, which is the normalized second derivative, is defined as long as the curvature is non-zero. Geometrically, the curvature is one over the radius of the osculating circle at that point, which is the circle in the plane spanned by the tangent vector and the normal vector.

Surfaces. Let $\mathbb{M}$ be a smooth surface or 2-manifold in $\mathbb{R}^3$. For a point of $\mathbb{M}$, we consider a neighborhood of the point, an open set in $\mathbb{R}^2$, and a parametrization of the neighborhood over that open set.

we define the curvature at the point in sections. For each curve in the plane we consider the corresponding space curve. It is a geodesic at normal at . The curvature of

EULER'S THEOREM. The directions of maximum and minimum normal curvature are orthogonal, and if a tangent direction makes an angle $\varphi$ with the direction of the maximum normal curvature, $\kappa_1$, then its normal curvature is $\kappa_1 \cos^2\varphi + \kappa_2 \sin^2\varphi$, where $\kappa_2$ is the minimum normal curvature.

If the maximum and minimum normal curvatures are equal then all other normal curvatures are the same and the point is an umbilic point of the surface. Two other common notions of curvature are the mean curvature and the Gaussian curvature. In contrast to the other notions, the Gaussian curvature is intrinsic. In other words, it is preserved by isometries, which are transformations that preserve the distance between points measured as lengths of connecting paths. This is a famous result of Gauss.

THEOREMA EGREGIUM. The Gaussian curvature is an isometric invariant.
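The relation between principal, normal, mean and Gaussian curvature can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: the function names are hypothetical, `kappa1` and `kappa2` stand for the maximum and minimum normal curvature at a point, the angle `phi` is measured from the direction of `kappa1`, and the mean curvature is taken to be the average of the two (some texts use the sum instead).

```python
import math


def normal_curvature(kappa1, kappa2, phi):
    """Euler's formula: normal curvature in the tangent direction at angle phi."""
    return kappa1 * math.cos(phi) ** 2 + kappa2 * math.sin(phi) ** 2


def mean_curvature(kappa1, kappa2):
    """Average of the principal curvatures (convention assumed for this sketch)."""
    return 0.5 * (kappa1 + kappa2)


def gaussian_curvature(kappa1, kappa2):
    """Product of the principal curvatures."""
    return kappa1 * kappa2
```

On a sphere of radius 2 both principal curvatures equal 1/2, so every direction has normal curvature 1/2 and every point is umbilic. On a cylinder of radius 1 the principal curvatures are 1 and 0, so the Gaussian curvature vanishes, in agreement with the fact that a cylinder can be unrolled isometrically into the plane.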
Figure III.11: Typical mixed cells. From left to right we have index sets of cardinality 1, 2, 3 and 4.

Table III.1: The cardinality of the index set listed in the first column determines the dimensions of the corresponding Voronoi polyhedron and Delaunay simplex as well as the type of the mixed cell and of the skin patch.

The two sphere cases are symmetric and differ from each other by the surface orientation: in the case of cardinality one, the body lies locally inside, and in the case of cardinality four, it lies locally outside the sphere. Similarly, the two hyperboloid cases are symmetric and differ from each other by the surface orientation. In the case of cardinality two, the symmetry axis of the hyperboloid is the affine hull of the Delaunay edge and the (orthogonal) symmetry plane is the affine hull of the Voronoi polygon. We have a one-sheeted hyperboloid if the two spheres intersect in a circle and a two-sheeted one if they are disjoint. The common limiting case is a double-cone defined by two touching spheres. Either way, the body is on the side of the infinite ends of the symmetry axis. In the case of cardinality three, the symmetry plane is the affine hull of the Delaunay triangle and the symmetry axis is the affine hull of the Voronoi edge. Whether the hyper-

Figure III.13: Every point of the hyperbola is sandwiched between two equally large circles.

The second equation defines a hyperboloid with the apex at the origin, a coordinate axis as symmetry axis, and the orthogonal coordinate plane as symmetry plane. We have a one-sheeted hyperboloid in one case and a two-sheeted one in the other, as illustrated in Figure III.12. For the sphere, the normal curvature at every point is one over the radius in every tangent direction. The situation is more complicated for the hyperboloid. Consider the hyperbola in standard form in $\mathbb{R}^2$, as shown in Figure III.13, and note that both the one-sheeted and the two-sheeted hyperboloid can be obtained by rotating the hyperbola about a symmetry axis. In either case, the maximum normal curvature
the largest sphere that passes through the point and touches but does not cross the hyperboloid. As shown in Figure III.13, this radius is the same as the distance of the point from the origin. In short, $\kappa(x) = 1/\|x\|$ for every point $x$ of a sphere or hyperboloid in standard form.

By applying this to the pieces of the line segment from $x$ to $y$ contained in different mixed cells, we obtain the result.

We note that the extension to a function on $\mathbb{R}^3$ describes the maximum normal curvature function of all skin surfaces in the family defined by the power growth model of the spheres, as introduced in Section II.2.

[3] H. EDELSBRUNNER. Deformable smooth surface design. Discrete Comput. Geom. 21 (1999), 87–115.
[4] B. O'NEILL. Elementary Differential Geometry. Second edition, Academic Press, San Diego, 1997.

sample points and to compute their maximum normal curvature values. Specifically, for each Delaunay simplex we construct the mixed cell. The center of this cell is the point at which the affine hull of the Voronoi polyhedron intersects the affine hull of the Delaunay simplex. It is also the center of the corresponding sphere or the apex of the corresponding hyperboloid. Next, we translate the mixed cell so its center moves to the origin. Furthermore, if the Voronoi polyhedron or the Delaunay simplex is an edge then we rotate it into vertical position. The sphere or hyperboloid defined by the mixed cell is then in standard form, which can be sampled. For each sampled point we compute the maximum normal curvature from its distance to the origin and we obtain the corresponding point on the skin surface by the inverse rigid motion.

Let $P$ be the set of points sampled on the skin surface. We use it as the vertex set of the triangulation, which we construct as the dual of a decomposition of the surface. Specifically, for each

Delaunay triangulation is the collection of simplices with non-empty common intersection of the corresponding restricted Voronoi cells,

it follows that the simplicial complex is the closure of its triangle set, every edge belongs to exactly two triangles, and the star of every vertex forms a disk. Note that the last

Delaunay triangulation is that it may not be homeomorphic to the surface and thus not triangulate the surface. Indeed, it is easy to come up with cases where the restricted Delaunay triangulation is not even a 2-manifold. A sufficient condition for the restricted Delaunay triangulation to triangulate the surface is what we call the closed ball property. It requires that each common intersection of restricted Voronoi cells is topologically a closed ball of the appropriate dimension. We formulate this condition in terms of the three-dimensional Voronoi polyhedra defined by $P$. Assuming general position, the Voronoi polyhedron $\nu_X$ has dimension $4 - |X|$, and we require that its intersection with the surface is either empty or homeomorphic to a closed ball of dimension $3 - |X|$. Depending on the cardinality of $X$ we have a closed disk, a closed interval, or a single point.

Proving that the closed ball property implies that the restricted Delaunay triangulation triangulates the surface is not difficult. Decompose the restricted Voronoi diagram by adding a point in the middle of each
arc and inside each cell and connect each point to the points on the boundary. The star of every point inside a restricted cell is a triangular decomposition of that cell. The star of every restricted Voronoi vertex consists of six triangular regions that can be homeomorphically mapped to the six triangles in the barycentric subdivision of the dual restricted Delaunay triangle. By construction, the triangles in the two barycentric subdivisions are connected the same way so we have a homeomorphism between the surface and the restricted Delaunay triangulation.

$\varepsilon$-sampling. The question remains how we sample the points such that the restricted Voronoi diagram has the closed ball property. Since the surface is smooth, small neighborhoods are fairly flat and the restricted Voronoi diagram behaves locally similar to the (unrestricted) Voronoi diagram in a plane, which has the closed ball property. This intuition can be made precise by formalizing the concept of density. Recall that $\kappa(x)$ is the maximum normal curvature at a point $x$ of the surface $F$. Around $x$ we would spread points at distance roughly proportional to $1/\kappa(x)$. We therefore define $\varrho(x) = 1/\kappa(x)$ and call it the length scale at $x$. The Curvature Variation Lemma of Section III.2 states that for any two points $x, y \in F$, the difference in length scale is at most the distance between them in $\mathbb{R}^3$, $|\varrho(x) - \varrho(y)| \leq \|x - y\|$.

An $\varepsilon$-sampling is a subset $P \subseteq F$ such that for each point $x \in F$ there exists a point $p \in P$ at distance $\|x - p\| \leq \varepsilon \varrho(x)$. Showing that a sufficiently small $\varepsilon$ implies the closed ball property for the restricted Voronoi diagram is rather tedious and we omit the proof.

HOMEOMORPHISM THEOREM. If $P$ is an $\varepsilon$-sampling of $F$ with sufficiently small $\varepsilon$, then the restricted Delaunay triangulation of $P$ is homeomorphic to $F$.

The precise upper bound for $\varepsilon$ is a root of a function which arises in the proof of the Homeomorphism Theorem.

Even sampling. The points of an $\varepsilon$-sampling can locally not be too far apart, but they can be arbitrarily close together. In other words, on a microscopic scale, the points can be placed every way one likes and the mesh can be arbitrarily ugly. To improve the mesh, we impose conditions on the size of edges and triangles that imply both upper and lower bounds on the spacing between sampled points.

Let the size of an edge be half its length, $R_{ab} = \|a - b\|/2$, and the size of a triangle be the radius of its circumcircle, $R_{abc}$. For edges we worry about them getting too short, so we compare size with the larger length scale at the endpoints, $\varrho_{ab} = \max\{\varrho(a), \varrho(b)\}$. For triangles we worry about them getting too large, so we compare size with the smaller length scale at the vertices, $\varrho_{abc} = \min\{\varrho(a), \varrho(b), \varrho(c)\}$. We use two constants, $C$ and $Q$, to express the conditions on the size. The constant $C$ controls how closely the triangulation approximates $F$, and $Q$ controls the quality of the triangles. We refer to the two conditions as the Lower and Upper Size Bounds,

[L] $R_{ab}/\varrho_{ab} > C/Q$ for every edge $ab$,
[U] $R_{abc}/\varrho_{abc} < CQ$ for every triangle $abc$.

It is not necessary to bound the edge lengths from above because an edge that is too long would belong to two triangles that both violate [U]. Symmetrically, we do not need to bound the triangle sizes from below because a triangle that is too small would have three edges that violate [L].

Mesh quality. The constants $C$ and $Q$ have to be chosen judiciously. For example, $Q < 1$ would immediately lead to irreconcilable requirements on edge and triangle sizes. Furthermore, $C$ cannot be too large, else we would contradict the $\varepsilon$-sampling condition stated in the Homeomorphism Theorem. Without going into details, we state that feasible choices exist. In particular, these constants imply that $P$ is an $\varepsilon$-sampling for a sufficiently small value of $\varepsilon$. More precisely, they imply that $P$ is either an $\varepsilon$-sampling or it grossly violates the condition for $\varepsilon$-sampling. An example of such a gross violation is four points close together on a sphere. The points form a tetrahedron whose edges and triangles may very well satisfy the Size Bounds, but the boundary of the tetrahedron is a miserable approximation of the much larger sphere. Fortunately, such a gross violation of the condition cannot be created from an $\varepsilon$-sampling without the intermediate generation of triangles that grossly violate [U]. The algorithm discussed below is unable to generate such triangles.

The two Size Bounds together imply a reasonably large lower bound on the angles inside triangles of the restricted Delaunay triangulation.
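The two Size Bounds can be checked mechanically for a given edge or triangle. The sketch below is a hedged illustration, not the meshing code: it assumes that [L] compares half the edge length with the larger length scale at the endpoints, that [U] compares the circumradius with the smallest length scale at the vertices, and that the thresholds take the form C/Q and C·Q. The function names, the callable `rho`, and the constant values used in the usage note are hypothetical.

```python
import math


def edge_size(a, b):
    """Size of an edge: half its length."""
    return 0.5 * math.dist(a, b)


def triangle_size(a, b, c):
    """Size of a triangle in 3D: the radius of its circumcircle."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    twice_area = math.sqrt(sum(x * x for x in cross))
    if twice_area == 0.0:
        raise ValueError("degenerate triangle")
    # circumradius = (product of edge lengths) / (4 * area)
    return math.dist(a, b) * math.dist(b, c) * math.dist(c, a) / (2.0 * twice_area)


def satisfies_lower(a, b, rho, C, Q):
    """Assumed form of [L]: edge size over the larger endpoint length scale exceeds C / Q."""
    return edge_size(a, b) / max(rho(a), rho(b)) > C / Q


def satisfies_upper(a, b, c, rho, C, Q):
    """Assumed form of [U]: circumradius over the smallest vertex length scale is below C * Q."""
    return triangle_size(a, b, c) / min(rho(a), rho(b), rho(c)) < C * Q
```

On a unit sphere the length scale is 1 everywhere, so `rho = lambda p: 1.0`. For the triangle with vertices (1, 0, 0), (0, 1, 0), (0, 0, 1) the edge size is √2/2 and the circumradius is √(2/3). With the purely illustrative values C = 0.5 and Q = 2 both bounds hold, while with C = 0.1 and Q = 1.5 the triangle is far too large and violates [U], which would trigger a vertex insertion.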
MINIMUM ANGLE LEMMA. A triangle that satisfies [U] and whose edges satisfy [L] has minimum angle larger than $\arcsin(1/Q^2)$.

PROOF. Let $abc$ be the triangle and $R_{abc}$ its circumradius. Assuming the angle $\gamma$ at $c$ is the smallest angle, we have $ab$ of length $2R_{abc}\sin\gamma$ as the shortest edge. We have $\varrho_{ab} \geq \varrho_{abc}$ by definition of length scale. Using [L] in the form $R_{ab} > (C/Q)\,\varrho_{ab}$ and [U] in the form $R_{abc} < CQ\,\varrho_{abc}$ we thus get

$$\sin\gamma = \frac{R_{ab}}{R_{abc}} > \frac{(C/Q)\,\varrho_{ab}}{CQ\,\varrho_{abc}} \geq \frac{1}{Q^2}.$$

Hence $\gamma > \arcsin(1/Q^2)$. The minimum angle is thus larger than $\arcsin(1/Q^2)$, and the maximum angle is smaller than $\pi - 2\arcsin(1/Q^2)$.

Density modification. Given an $\varepsilon$-sampling, we can enforce the Size Bounds by contracting short edges and in-

tion may cause new violations of [U] and thus trigger new point insertions.

void VERTEX INSERTION:
  while there is a triangle violating [U] do
    add its dual restricted Voronoi vertex
  endwhile.

The details of the algorithm that modifies the restricted Delaunay triangulation to reflect the addition of the new vertex are omitted. A vertex insertion may cause other vertex insertions, but this cannot go on forever because we will eventually violate the Lower Size Bound. Given an edge that violates [L], we contract it by removing one of its endpoints. We are not able to exclude the possibility that the removal creates new violations of [L], and it certainly can create new violations of [U].

void EDGE CONTRACTION:
  while there is an edge violating [L] do
    if then endif;
    ; VERTEX INSERTION
  endwhile.

The details of the algorithm are again omitted. An edge contraction may perhaps cause other edge contractions, but this cannot go on forever because we will eventually violate the Upper Size Bound. It is possible that an edge contraction causes a vertex insertion, but a vertex insertion cannot create edges of size below the allowed threshold. This is what prevents infinite loops in spite of the algorithm's partially conflicting efforts to simultaneously avoid short edges and large triangles. To prove this claim, we consider a triangle that causes the addition of its dual restricted Voronoi vertex.

NO-SHORT-EDGE LEMMA. Every edge created during the addition of the dual restricted Voronoi vertex satisfies [L].

Every new edge has therefore length . Assume without loss of generality that . We use the Curvature Variation Lemma to derive upper bounds for the length scales at and : Hence For and we have and therefore , as claimed.

Scheduling. [Summarize the results on scheduling edge contractions and vertex insertions described in [5].]

Bibliographic notes. The restricted Delaunay triangulation is a generalization of the dual complex of a ball union. It can be used to triangulate surfaces and other spaces embedded in a Euclidean space. Besides the dual complex literature, there are several other partially dependent roots of the idea, namely the surface meshing method by Chew [3], the neural net work by Martinetz and Schulten [6], the formulation of the closed ball property by Edelsbrunner and Shah [4], and the surface reconstruction algorithm by Amenta and Bern [1]. The last of the four papers also introduces $\varepsilon$-samplings of surfaces, although in a slightly different formulation in which the distance to the medial axis replaces the length scale.

All results that are specific to skin surfaces are taken from [2]. The algorithm in that paper is more general than
Figure III.16: Voronoi decomposition of disk union with superimposed skin, body, and dual complex.

union contains the body and the body contains the dual complex. Furthermore, the disk union, the body, and the dual complex all have the same homotopy type. This is always true. The skin shrinks the arcs in the boundary of the disk union and smoothly blends between the shrunken arcs using pieces of hyperbolas and inverted circles. Most striking is the blending for the quadrangular hole roughly in the middle of the figure, which is converted into an almost entirely circular hole in the body.

Mixed complex. Using the Morfi software, we can visualize concepts that are difficult if not impossible to show in $\mathbb{R}^3$. An example is the mixed complex illustrated in Figure III.17. It decomposes the skin into circular and hy-

Simulated smoothing. We return to an issue left open in Section V.1, where we considered the minimum weighted square distance function of a collection of circles. The zero-set of this function is the envelope of the circles, and the preimage of any real value is the envelope of the circles with squared radii increased by that value. Following the notation in Section II.3, we think of this value as time and consider the collection of circles at each moment in time. In Section V.1 we claimed that there is an infinite family of smooth approximations of the minimum weighted square distance function that all have the same critical points, namely the points where dually corresponding Voronoi and Delaunay polyhedra intersect. We choose a parameter and construct the family such that the approximation approaches the non-smooth function as the parameter goes to 1. One function in this family is the trajectory of the skin curve, which maps each point to the moment in time at which the point belongs to the skin of the collection of circles at that time. We generalize this construction to other values of the parameter. Note that the parameter value one half gives the skin as defined in Section III.1, and the value one gives the envelope of the original disks. Figure III.18 illustrates the construction by showing the modified skins for several values of the parameter. Observe that the bod-
Figure III.18: From inside out the sequence of skins for Figure III.19: Cut-away view of the mesh of a small molecule
.
of about forty atoms. Only the edges of the mesh and the cut
boundary are shown.
with , the height function that highly curved areas detectable in Figure III.20 corre-
is differentiable and assuming non-degeneracy of the spond to high density regions in Figure III.19.
input circles, it is twice differentiable at the critical points.
This is sufficient to justify the Morse theoretic reasoning
about the non-smooth function used in Section V.1 to Growing the mesh. As mentioned earlier, the mesh is
define pockets. constructed by maintaining it while growing the spheres.
The algorithm thus reduces to executing a sequence of ele-
mentary operations. We classify the operations according
to the adaptation purpose they serve.
Meshed skin surfaces. In , we compute triangulated
skin surfaces using the Skin Meshing software. It takes as
input a set of spheres and constructs a mesh by main- Shape adaptation. The growth of the spheres im-
plies a deformation of the surface, which is facilitated
,
taining a triangulation of the set of spheres , with the
time continuously increasing from minus infin- by a motion of the mesh vertices in . The algo-
ity to zero. At the beginning, all spheres are imaginary, rithm moves vertices normal to the surface, along the
the skin is the empty surface, and the mesh is the empty
integral
lines of the skin trajectory, which is
. We use edge flips to maintain the mesh as
,
complex. As time increases, the surface moves and the
software updates the mesh accordingly. At time , the restricted Delaunay triangulation of the moving
we have the mesh of the skin of . Figure III.19 shows vertices.
a portion of this mesh for a small molecule. The image Curvature adaptation. Recall that the conditions
is created by slicing the surface with a plane and remov- [L] and [U] given in Section III.3 guarantee that the
ing the front portion of the surface. The complete surface mesh adapts its local density to the maximum nor-
has genus one, and the slicing plane is chosen to cut right mal curvature. We use edge contractions to eliminate
through the narrow part of the tunnel. The image of the edges that violate [L] and vertex insertions to elimi-
mesh in Figure III.19 should be compared with the ren- nate triangles that violate [U].
III.4 Skin Software 49
are , , and , which control how the metamorphoses
are performed. The correctness of the algorithm is guar-
anteed only if the inequalities referred to as Conditions (I)
to (V) are all satisfied. The software permits other param-
eter settings since a violation of the inequalities does not
necessarily imply a failure of the algorithm. In our ex-
perience, the software works fine for small violations but
breaks down for moderate ones.
Topology adaptation. There are four types of Figure III.22: The quantification panel of the Skin Meshing soft-
topological changes that occur, and they correspond ware. The quality measures do not include the special edges and
to the four types of generic critical points of three- triangles that facilitate topological changes and purposely violate
dimensional Morse functions. A component is born some of the properties required for the rest of the mesh. [This
at a minimum, a handle is created at an index-1 sad- panel needs to be updated to fit the text.]
dle, a tunnel is closed at an index-2 saddle, and a void
is filled at a maximum. We use metamorphoses to
Figure III.22 shows the panel after the construction of a
change the mesh connectivity accordingly.
mesh. It displays measurements of mesh quality, includ-
ing size versus length scale ratios of edges and triangles
Two of the four types of metamorphoses can be seen at and the angles inside and between triangles. Note that in
work in Figure III.21. From the first snapshot to the sec- Figure III.22, the ratios all lie inside the allowed interval,
ond, we see two new handles appear. Each handle creates
which is . As proved in Section III.3, the algo-
8
a tunnel in the complement. From the second snapshot to rithm guarantees that the smallest angle inside any (non-
the third, we see both tunnels disappear again. By closing
a tunnel we also remove the handle that forms it. Observe the standard setting of
special) triangle in the mesh is larger than
. For
, this is roughly ,
that the surface around a handle is the same as that around and the smallest angle observed in the mesh is indeed
a tunnel, namely a two-sheeted hyperboloid that flips over .
to a one-sheeted hyperboloid, or vice versa. The only dif-
ference is the reversal of inside and outside.
Figure III.21: Three snap-shots of the deforming triangulation of a molecular skin defined by continuously growing spheres. From left
to center, we note two metamorphoses that each add a handle in the front. From center to right, we note a metamorphosis that closes a
tunnel on the left.
and
for the heights of and . Prove that the radius
1. Pencils of spheres. Let us extend the concept of a of the circumcircle satisfies
coaxal system of circles to three dimensions. For this
F
F F C
F
purpose assume and are two spheres that are both orthogonal to the spheres , and .
F
F F
EF
F
EF
F
CF
(i) Prove that every affine combination of and
is orthogonal to , and .
(ii) Prove that every affine combination of ,
and is orthogonal to and .
(iii) In the light of (i) and (ii), what is the analog of
a coaxal system in ?
2. Curvature in the plane. Note that the curvature
of a molecular skin curve in is not
continuous.
(i) Give an example illustrating that is not con-
tinuous.
(ii) Introduce a new function (perhaps similar to )
that is continuous over .
3. Total curvature. Define the total curvature of a sur-
face as the integral of the maximum principal cur-
vature:
/
* *
(i) Calculate for a sphere .
(ii) Calculate for the portion of a double-cone
within a unit-sphere around its apex.
/
* *
(i) Calculate for a sphere .
(ii) Let be the portion of a hyperboloid of rev-
olution within a unit sphere around the apex.
Show that goes to infinity as the hyperboloid
approaches its asymptotic double-cone.
(iii) Prove that the number of points in a minimal
-sampling of (as defined in Section III.3) is
proportional to .
5. Something about triangles. Let be a triangle
in the plane. We write
for the height of defined
as the distance of from the closest point on the line
Chapter IV
Connectivity
Given a shape or a space, we can ask whether or how it is connected. It might not be immediately obvious what this question means, but we can draw from precise definitions developed in topology to answer it. However, we need to be aware that there are perfectly well-defined and reasonable but different precise notions that correspond to the intuitive idea of connectivity. For example, for two spaces $X$ and $Y$ to be “connected the same way” could mean they are topologically equivalent ($X \approx Y$), they are homotopy equivalent ($X \simeq Y$), or they have isomorphic homology groups ($H_k(X) \cong H_k(Y)$ for every $k$). The three notions are progressively weaker:

$$X \approx Y \;\Longrightarrow\; X \simeq Y \;\Longrightarrow\; H_k(X) \cong H_k(Y) \text{ for every } k.$$

In words, the classification of spaces by homology groups is coarser than that by homotopy equivalence, which in turn is coarser than that defined by topological equivalence. [We should stress that homology in this topological context has a precise algebraic meaning, which is in sharp contrast to how the term is used in biology (e.g. homology modeling of proteins), where it indicates a vague notion of similarity.]

Given two triangulated spaces, there is a polynomial-time algorithm that computes and compares their homology groups. If the groups are not isomorphic then we know that the two spaces are different, meaning they are neither homotopy equivalent nor topologically equivalent. However, if their homology groups are isomorphic then we still do not know whether the two spaces are the same also under the two stricter definitions of sameness. In spite of the apparent weakness, homology is the most important tool to study connectivity. In this chapter, we focus on algorithms computing the homology groups of molecules represented by space-filling diagrams. In Section IV.1, we prove that space-filling diagrams are homotopy equivalent to their dual alpha shapes, which implies the two have isomorphic homology groups. In Section IV.2, we formally define homology groups and their ranks, the Betti numbers. In Section IV.3, we describe an incremental algorithm for Betti numbers, which is fast but limited to complexes in three dimensions. In Section IV.4, we present the classic matrix algorithm for Betti numbers, which is significantly slower but not limited to three-dimensional space.

IV.1 Equivalence of Spaces
IV.2 Homology Groups
IV.3 Incremental Algorithm
IV.4 Matrix Algorithm
Exercises
topological space is, we can define when two are the same.
A homeomorphism is a bijective map
that is con-
The space-filling diagram of a molecule is a subset of ,
and with induced subspace topology it is a topological tinuous and whose inverse is continuous. We write
if a homeomorphism exists and say that and are home-
space. We study the connectivity of this space by con-
omorphic, topologically equivalent, and that they have the
sidering equivalence classes defined by continuous maps
between spaces. same topological type. Note that the identity is a homeo-
morphism, the inverse of a homeomorphism is a homeo-
morphism, and the composition of two homeomorphisms
is a homeomorphism. In other words, being homeomor-
Topological spaces. Recall that a map
is phic is reflexive, symmetric and transitive, so is indeed
there is a such
* *
continuous if for every an equivalence relation for topological spaces.
that if
have distance less than then the points
have distance less than . To check As suggested by Figure IV.1, there are spaces that have
whether or not is continuous, we thus have to be able to the same topological type and look vastly different, and
measure the distance between points in both sets. Accord- there are spaces that look quite similar and do not have the
same topological type. An interesting example of a pair of
ing to a more general definition, is continuous if the
preimage of every open set in is open in . Here we
only need to distinguish between open and non-open sets.
This distinction is the motivation for the following defini-
tion. A topological space is a set together with a system
of subsets of such that
(i)
and ,
(ii)
for every subsystem , and Figure IV.1: The circle on the left is topologically equivalent
(iii)
for every finite subsystem . to the trefoil knot in the middle, but both are not topologically
equivalent to the annulus on the right.
are the open sets of . If , we can induce the
The system is called the topology of and the sets in
non-homeomorphic spaces are the sphere and the plane.
many open sets. For example, the common intersection
of the open balls of points at distance less than
from the origin, for
, is just the origin itself,
which is not an open set. We thus see that the restriction
*
to finite subsystems in condition (iii) is necessary. The
two-dimensional sphere,
FH*KF
Figure IV.2: The stereographic projection maps the sphere (mi-
,
nus the north-pole) to the plane. The lower hemisphere maps to
is a subset of , and if we choose its intersections with
the shaded disk and the upper hemisphere to the complement of
open sets in
as the open sets in its topology, then it is a that disk.
topological subspace of . Another topological subspace
of is the two-dimensional Euclidean plane, .
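The stereographic projection of Figure IV.2 can be written down explicitly. The following Python sketch is an illustration and not part of the original text. It assumes the unit sphere centered at the origin, the north-pole at (0, 0, 1), and projection onto the plane z = 0; the function names are hypothetical. Under these assumptions the map has a continuous inverse, and the lower hemisphere lands inside the unit disk while the upper hemisphere lands outside of it.

```python
import math

def project(p):
    """Map a point of the unit sphere (not the north-pole) to the plane z = 0."""
    x, y, z = p
    if math.isclose(z, 1.0):
        raise ValueError("the north-pole has no image in the plane")
    return (x / (1.0 - z), y / (1.0 - z))

def unproject(q):
    """Inverse map: send a point of the plane back to the unit sphere."""
    u, v = q
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))

def sphere_point(theta, phi):
    """Point of the unit sphere with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

# One point on the lower hemisphere and one on the upper hemisphere.
low = project(sphere_point(2.5, 0.3))
high = project(sphere_point(0.5, 0.3))
print(math.hypot(*low) < 1.0, math.hypot(*high) > 1.0)
```

Composing `project` with `unproject` returns the starting point up to rounding, which is a numerical check and not a proof of continuity.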
IV.1 Equivalence of Spaces 55
Homotopy equivalence. Next we introduce an equiva- (iii) , , for all and all ,& .
lence relation that is less sensitive to the local dimension
Note that is a homotopy between , which is the iden-
of spaces than topological equivalence. We begin by com-
tity on , and , which maps to . As illustrated in
paring maps between the same spaces. Two continuous
maps
are homotopic if there is a continu-
* *
Figure IV.4, there is a deformation retraction from the dou-
ous map
with
*
* *+
and ble annulus to the figure-8 curve, but there is no deforma-
and call
, for all . We write tion retraction to the circle. (Why not?)
,'
a homotopy between and . This definition is illustrated
in Figure IV.3. We may think of the parameter as
im k
im H
Figure IV.3: In this example, and both map the circle into
three-dimensional space, and maps the circle times to
. Then is homotopic to the identity on
which is the same as saying that the image of
may be
because
is a homotopy between the two maps. Fur-
self-intersecting. is equal to the identity on
thermore, and therefore
Two spaces and are homotopy equivalent if there certainly homotopic to it.
are
continuous maps and
such that
is homo-
The simplest homotopy type is that of a point. A space
is homotopic to the identity on and is contractible if it is homotopy equivalent to a point. For
topic to the identity on . We write and say that example, a disk is contractible but a circle is not. Simi-
the two spaces have the same homotopy type. Note that
larly, a ball is contractible but a sphere is not.
is reflexive, symmetric, and transitive and is therefore
indeed an equivalence relation for topological spaces. It is
easy to show that two topologically equivalent spaces are Decomposition into joins. We construct a deformation
also homotopy equivalent. To see that the reverse is not retraction between a union of balls and its dual complex
true we note that the annulus in Figure IV.1 is homotopy
using a decomposition into joins. In general, a join be-
equivalent to the circle, but the two are not topologically
tween two sets and in some Euclidean space is the
equivalent.
union of closed line segments that connect points in
with points in ,
/ & *
Deformation retraction. If is a topological subspace
of then we may prove that the two spaces are homotopy
equivalent by constructing a map that retracts to . A
deformation retraction
from to is a continuous and it is defined iff any two such line segments are either
map with disjoint or meet at a common endpoint. Figure IV.5 uses
boundary of . We shrink * by defining
,
, ,
,
turns into a trapezium whose height decreases and reaches
zero at time . A disk sector shrinks from its outer arc
towards its center, which is a vertex of the dual complex.
It maintains its shape while getting smaller until it reaches
the size of a point. The deformation retraction is obtained
Figure IV.5: The union of disks is decomposed into the underly- by shrinking all joins simultaneously. It is illustrated in
ing space of the dual complex and two types of joins connecting
,
Figure IV.6, which shows the image of the retraction at
that complex to the boundary of the union.
time . Figure IV.7 shows an entire sequence of
shapes during the deformation retraction visualized for the
model of gramicidin also shown in Figure II.3.
point and an edge and a sector is the join between a circu-
lar arc and a vertex.
Let be a finite collection of closed balls in . We as-
sume general position and construct a deformation retrac-
tion from the union, , to the underlying space
of the dual complex, . Recall that the bound-
ary of consists of sphere patches separated by circular
arcs connecting corners. To be specific, we define a patch
as the contribution of the sphere bounding to the
boundary of . It does not have to be connected or sim-
ply connected. Similarly, we define an arc and a corner
as the contribution of the intersection of two and of three
spheres to the boundary of . An arc may be a full cir-
Figure IV.6: The decomposition after shrinking the joins half
cle, or any number of intervals along the circle. A corner
way to zero.
may be empty, a point, or a pair of points. The decom-
position is constructed by forming the join between every
patch, arc, and corner and its dual vertex, edge, and trian- There is a technical problem at the very beginning of
gle. Figure IV.5 illustrates the construction in the plane. the shrinking process that arises already in two dimen-
There are four corners that are point pairs, and they corre- sions. Specifically, the outer vertex of each triangle join
spond to the four principal edges of the dual complex. (As belongs to more than one line segment and thus retracts
defined in Section II.4, an edge is principal if it is not face towards more than one point of the dual complex. To fi-
of any other simplex in the complex. In the Alpha Shape nesse this difficulty, we choose and move the points
software, such an edge is referred to as singular.) There differently in the time interval . In the assumed case
are also four arcs that consist of more than one component in which is in general position, this initial motion needs
,
each, and they correspond to the vertices on the boundary to bridge the non-zero gap between the boundary of and
of the dual complex that are exposed to the outside in more the boundary of the image of at time . By choosing
than one interval of directions. small, we can make the gap arbitrarily small and easy to
bridge.
Shrinking
joins. We get a deformation retraction Bibliographic notes. Homeomorphisms, homotopies,
from to by shrink- and deformation retractions are covered in most texts of
* *
ing joins from outside in. Each join is the union of line algebraic topology, including Seifert and Threlfall [6] and
segments with on the boundary of and on the Munkres [5]. Subtleties of the definitions of a topology
Figure IV.7: Six snap-shots of the deformation retraction from the union of balls representation of gramicidin to the dual complex.
and of a topological space are discussed in texts on general topology, including Kelley [2] and Munkres [4].

The particular deformation retraction used to prove the homotopy equivalence between a union of balls and its dual complex is taken from Edelsbrunner [1]. That equivalence can also be derived from general theorems about coverings. The Nerve Lemma says that a space is homotopy equivalent to the nerve of a finite open cover whose sets have either empty or contractible common intersections. We can turn the Voronoi cells of a union of balls into such a cover and get the homotopy equivalence result from that lemma. The history of the Nerve Lemma is complicated because different versions have been discovered independently by different people. Maybe the paper by Leray [3] is the first publication on that topic.

les points fixes des représentations. J. Math. Pures Appl. 24 (1945), 95–167.

[4] J. R. MUNKRES. Topology. A First Course. Prentice Hall, Englewood Cliffs, New Jersey, 1975.

[5] J. R. MUNKRES. Elements of Algebraic Topology. Addison-Wesley, Redwood City, 1984.

[6] H. SEIFERT AND W. THRELFALL. A Textbook of Topology. Academic Press, San Diego, California, 1980.
*
. The quotient divided by , denoted as
IV.2 Homology Groups , is the collection of cosets. Addition in the quotient
group is defined by *
*
This section introduces homology groups as an algebraic
means to characterize the connectivity of a topological . We note that it does not matter which representatives
space. To keep the discussion reasonably elementary, we we choose in computing the sum of the two cosets. The
*
restrict it to triangulated spaces and to addition modulo 2. resulting coset is always the same, so addition is indeed
well defined. Observe that implies
2
now develop. A simplex is the convex hull of an affinely
independent point set,
. If has cardinality
then has dimension and is also referred x+ H y+ H
to as a -simplex. A face of is the convex hull of a
H
subset , and we write . Since has sub-
sets, has the same number of faces, including the empty
0
set and as its two improper faces. A simplicial complex
is a finite collection of simplices with pairwise proper
intersections that is closed under the face relation, that is, Figure IV.8: Partition of into cosets defined by for the case
in which contains a quarter of the elements.
(i) if and
then
, and
then is either empty or a face of * * . So if * and then
(ii) if * . In words, two cosets are either disjoint or
both.
same cardinality and
that
the same. If is finite this implies
.
all cosets have the
Recall that the underlying space of
is the union of all
A homomorphism between groups and is a function
simplices, . A simplicial complex can be
that commutes with addition, *
*
used to represent a topological space, and we have seen
. The kernel of is the subset of whose
elements map to , and the image is the subset of
an example in Section II.3, where the dual complex of a
space-filling diagram was used to represent a molecule.
We proved in Section IV.1 that the underlying space of whose elements have preimages in :
the dual complex is homotopy equivalent to the space-
*
*
filling diagram. A topologically more accurate represen-
$*+
with *
tation would have a homeomorphic underlying space. We
thus define a triangulation of a topological space as a
simplicial complex whose underlying space is topolog-
ically equivalent, . The remainder of this section
An isomorphism is a bijective homomorphism. Its kernel
is the zero element of and its image is the entire .
introduces the algebraic concepts we will use to define ho-
mology groups of triangulated spaces. Chain complex. Let
be a simplicial complex. We
construct groups by defining what it means to add sets of
Abelian groups. A group is a set together with an as-
sociative operation for which there is a
simplices. Call a set of -simplices a -chain. By defini-
tion, the sum of two -chains is the symmetric difference
of the two sets,
zero and an inverse for every group element. The group is
abelian if the operation is commutative. Examples are the
infinite group of integers with addition,
, and the fi-
nite cyclic group of elements, mod . A subset
forms a subgroup if is a group.
This is like adding modulo 2 where , since a
is a subgroup. We chain belongs to
iff it belongs to neither or to both
* ,
Suppose chains. is the set of -chains and
have + , and because implies *
is abelian and is the group
there is a bijection between and each coset *
of -chains. The zero of this chain group is the empty
set. We connect chain groups of different dimensions by
. This assumes of course that and have the k+2 k+1 k k−1
same dimension, else would not be defined. We thus
,
have a boundary homomorphism 0 0 0
for every . The sequence of chain groups connected by
boundary homomorphisms is the chain complex of ,
Figure IV.9: The chain complex and the groups of cycles and
8
boundaries contained in the chain groups.
. If then
group, is the trivial
Figure IV.9 illustrates the sequence but contains informa-
group consisting only of one element. The size of is a
tion about subgroups that will be introduced shortly.
measure of how many -cycles are not -boundaries. The
cosets are the elements of and are referred to as homol-
Cycles and boundaries. There are two types of chains ogy classes.
that are particularly important to us: the ones without As an example consider a triangulated torus, as
boundary and the ones that bound. A -cycle is a -chain
sketched in Figure IV.10. All 0-chains are 0-cycles and
with . The set of -cycles is the kernel of the -
. Two -cycles half of them are 0-boundaries,
namely the ones with even
th boundary homomorphism,
cardinality. Hence
4
. The two non-
add up to another -cycle, which implies that
is
bounding 1-cycles labeled and generate a first homol-
a subgroup of
. A -boundary is a -chain for
ogy group of four elements, as shown in Figure IV.10. It
which there exists a -chain with . The set is isomorphic to , which is the group of elements
of -boundaries is the image of the
-st boundary
with component-wise addition
homomorphism,
. Two -boundaries add
modulo 2. There is only one non-empty 2-cycle, ,
up to another -boundary, which implies that
is a
and no non-empty 2-boundary, . Hence
subgroup of . We prove that is a subgroup
.
of
. Equivalently, the boundary of every boundary
is empty.
0 a b a+b
F UNDAMENTAL L EMMA OF H OMOLOGY. . 0 0 a b a+b
b
a a 0 a+b b
P ROOF. Note that for every
-simplex . b b a+b 0 a
This is because every -simplex belongs to exactly a
a+b a+b b a 0
two -simplices. The rest follows because taking bound-
ary commutes with adding:
Figure IV.10: The curves and represent the homology classes
and , which generate the homology group .
An important property of homology groups is that they
are the same for triangulations of homeomorphic and of
homotopy equivalent spaces. In particular, we get the
which is the empty set, as required. same homology groups for different triangulations of a
We can therefore draw the relationship between the sets topological space. Similarly, the homology groups of (any
of chains, cycles, and boundaries as sketched in Figure triangulation of) a union of balls are the same as the ho-
IV.9. mology groups of the dual complex. In other words, the
homology groups are properties of the space and not arti-
facts of the complexes used to represent that space. Prov-
Homology groups. The -th homology group is the quo- ing that this is really the case is beyond the scope of this
tient of the -th cycle group divided by the -th boundary book.
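The definitions of this section can be tried out on small complexes. The following Python sketch is an illustration and not an algorithm from the text; all function names are hypothetical. It assumes addition modulo 2 and represents a chain as a set of simplices, so that the sum of two chains is their symmetric difference. It also uses the fact that, with coefficients modulo 2, the rank of the quotient of the cycle group by the boundary group is the difference of their ranks. The ranks are found by plain Gaussian elimination, which is adequate only for small inputs.

```python
from itertools import combinations

def closure(top_simplices):
    """Group all non-empty faces of the given simplices by dimension."""
    by_dim = {}
    for simplex in top_simplices:
        for size in range(1, len(simplex) + 1):
            for face in combinations(sorted(simplex), size):
                by_dim.setdefault(size - 1, set()).add(frozenset(face))
    return by_dim

def boundary(chain):
    """Boundary of a chain, i.e. of a set of k-simplices, modulo 2."""
    result = set()
    for simplex in chain:
        for vertex in simplex:
            face = simplex - {vertex}
            if face:                  # a vertex has empty boundary
                result ^= {face}      # symmetric difference adds modulo 2
    return result

def rank_mod2(chains):
    """Rank, modulo 2, of the group generated by the given chains."""
    basis = {}                        # smallest simplex of a chain -> chain
    for chain in chains:
        chain = set(chain)
        while chain:
            pivot = min(chain, key=lambda s: tuple(sorted(s)))
            if pivot not in basis:
                basis[pivot] = chain
                break
            chain = chain ^ basis[pivot]   # cancel the pivot
    return len(basis)

def betti_numbers(top_simplices):
    """Betti numbers beta_0, beta_1, ... of the complex spanned by the input."""
    by_dim = closure(top_simplices)
    # rank[k] is the rank of the k-th boundary homomorphism
    rank = {k: rank_mod2(boundary({s}) for s in by_dim[k]) for k in by_dim}
    betti = []
    for k in sorted(by_dim):
        cycles = len(by_dim[k]) - rank[k]      # rank of the kernel
        boundaries = rank.get(k + 1, 0)        # rank of the image from above
        betti.append(cycles - boundaries)
    return betti

triangle = frozenset((0, 1, 2))
print(boundary(boundary({triangle})))          # boundary of a boundary
sphere = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
print(betti_numbers(sphere))
```

For the four triangles bounding a tetrahedron, which triangulate a sphere, the sketch reports one component, no non-bounding loop, and one shell.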
Betti numbers. The most useful aspects of homology groups are their ranks, which have intuitive interpretations in terms of the connectivity of the space. The concept of a rank applies equally well to chain, cycle, boundary and homology groups. With addition modulo 2, each of these groups satisfies $x + x = 0$ for every element $x$, so it is a vector space over $\mathbb{Z}_2$ in the terminology of linear algebra. The subgroup generated by a subset of such a group is known as the linear hull of the subset, consisting of all sums of elements of the subset. The subset is a basis if it is minimal and generates the entire group. Even though there is no unique basis, all bases have the same size, and because every element is its own inverse, that size is the binary logarithm of the number of group elements. By definition, the rank of a group $G$ is the size of a basis: $\operatorname{rank} G = \log_2 |G|$. If the group is the $k$-th homology group of a space, $H_k$, the rank is known as the $k$-th Betti number of that space: $\beta_k = \operatorname{rank} H_k$. Since $H_k = Z_k / B_k$ we have

$$\beta_k = \operatorname{rank} Z_k - \operatorname{rank} B_k.$$

Revisiting the example above, we see that the Betti numbers of the torus are $\beta_0 = \beta_2 = 1$ and $\beta_1 = 2$. The homology groups of dimensions $k \geq 3$ are all trivial and the corresponding Betti numbers are all zero. For the zeroth Betti number, recall that within a component the 0-boundaries are the sets with an even number of vertices, and that half the subsets of a finite set have even cardinality. If there are $c$ components and $n$ vertices then $\operatorname{rank} Z_0 = n$ and $\operatorname{rank} B_0 = n - c$. It follows that $\beta_0 = c$. Similar to $\beta_0$, the 1-st and 2-nd Betti numbers have intuitive interpretations as the number of independent non-bounding loops and the number of independent non-bounding shells.

Euler characteristic. Consider a simplicial complex $K$ and let $s_k$ be the number of its $k$-simplices. By definition, the Euler characteristic is the alternating sum of these numbers: $\chi = \sum_{k \geq 0} (-1)^k s_k$. We show that $\chi$ is also the alternating sum of Betti numbers. Note that if $f: U \to V$ is a homomorphism, then the rank of $U$ is equal to the sum of ranks of the kernel and the image. Since $\partial_k: C_k \to C_{k-1}$ is a homomorphism, $Z_k = \ker \partial_k$, and $B_{k-1} = \operatorname{im} \partial_k$, we have $\operatorname{rank} C_k = \operatorname{rank} Z_k + \operatorname{rank} B_{k-1}$. Earlier we derived $\beta_k = \operatorname{rank} Z_k - \operatorname{rank} B_k$. The $k$-simplices form a basis of $C_k$, so its rank is $s_k$. Using corresponding lowercase letters for ranks, we get

$$\chi = \sum_{k \geq 0} (-1)^k c_k = \sum_{k \geq 0} (-1)^k (z_k + b_{k-1}) = \sum_{k \geq 0} (-1)^k (z_k - b_k) = \sum_{k \geq 0} (-1)^k \beta_k,$$

where $b_{-1} = 0$ and the third equality shifts the index of the boundary ranks by one. We state this result because it is important and so we can use it for later reference.

EULER-POINCARÉ THEOREM. $\chi = \sum_{k \geq 0} (-1)^k \beta_k$.

This relation can often be used to quickly find the Euler characteristic of a space without constructing a triangulation and counting simplices. For example, the closed disk has one component, no non-bounding loop, no shell, and therefore $\chi = 1 - 0 + 0 = 1$.

Bibliographic notes. Homology groups were developed at the end of the nineteenth and the beginning of the twentieth centuries. The French mathematician Henri Poincaré is usually credited with the conception of the idea [4]. He named the ranks of the homology groups after the Italian mathematician Enrico Betti, who introduced a slightly different version of the numbers years earlier. The beginning of the twentieth century witnessed parallel developments of homology groups that differed in the elements they added (simplices, cubes, general cells, ...) and the coefficient groups they used. Eventually, all this work was unified by axiomatizing the assumptions under which homology groups exist [1]. Today, homology is a general method within algebraic topology. We refer
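IV.3 Incremental Algorithm

The Betti numbers of a simplicial complex can be computed incrementally, by adding one simplex at a time. In this section, we describe the details of this algorithm, which is particularly well-suited for filtrations.

Adding a simplex. We analyze what happens to the Betti numbers when we add a simplex $\sigma$ to a complex $K$, comparing the Betti numbers of the new complex with those of $K$. In the case analysis, we mention only the Betti numbers that change.

Case $\sigma$ is a vertex. Being a vertex, $\sigma$ cannot connect to anything and thus forms a component by itself. Therefore, $\beta_0$ increases by one.

Case $\sigma$ is an edge. There are two sub-cases, both illustrated in Figure IV.11. If $\sigma$ closes a loop then $\beta_1$ increases by one. Otherwise, $\sigma$ connects two components and $\beta_0$ decreases by one.

Figure IV.11: The edge closes a loop on the left and connects two components on the right.

Case $\sigma$ is a triangle. Again we have two sub-cases, both illustrated in Figure IV.12. If $\sigma$ completes a 2-cycle then $\beta_2$ increases by one. Otherwise, $\sigma$ closes a tunnel and $\beta_1$ decreases by one.

Case $\sigma$ is a tetrahedron. In three-dimensional space, $\sigma$ fills a void and $\beta_2$ decreases by one.

Observe that the four cases follow one and the same rule: if $\sigma$ belongs to a non-bounding cycle in the new complex then we increment the Betti number of the dimension of $\sigma$ and, otherwise, we decrement the Betti number of dimension one less than that of $\sigma$. This is justified by the equation $\beta_k = \operatorname{rank} Z_k - \operatorname{rank} B_k$ developed in Section IV.2: adding a $k$-simplex always increments the rank of the $k$-th chain group, and it does this by either incrementing the rank of the $k$-th cycle group or that of the $(k-1)$-st boundary group.

To compute the Betti numbers of a complex $K$, we form a filtration that ends with that complex:

$$\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = K.$$

All $K_i$ are complexes, and it is convenient to assume that consecutive complexes differ by a single simplex, $K_i = K_{i-1} \cup \{\sigma_i\}$.

  $\beta_0 = \beta_1 = \beta_2 = 0$;
  for $i = 1$ to $m$ do
    $k = \dim \sigma_i$;
    if $\sigma_i$ belongs to a $k$-cycle in $K_i$ then $\beta_k$++
    else $\beta_{k-1}$--
    endif
  endfor;
  return $(\beta_0, \beta_1, \beta_2)$.

The only difficult part of the algorithm is deciding whether or not $\sigma_i$ belongs to a $k$-cycle. We study this problem after illustrating the algorithm for a small example.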
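For vertices and edges, the test in the algorithm reduces to a connectivity question: an edge belongs to a 1-cycle iff its endpoints already lie in the same component. The following Python sketch is an illustration and not the author's software; the class and function names are hypothetical. The edge list is transcribed from Table IV.1 under the assumption that the thirteen vertices are labeled 1 to 9 and A to D. The sketch maintains the components with a simple disjoint-set structure and tracks $\beta_0$ and $\beta_1$ while first the vertices and then the edges are added.

```python
class DisjointSets:
    """Minimal union-find structure with path compression and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def add(self, u):
        """Add {u} as a new singleton set."""
        self.parent[u] = u
        self.size[u] = 1

    def find(self, u):
        """Return the representative of the set that contains u."""
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[u] != root:      # path compression
            nxt = self.parent[u]
            self.parent[u] = root
            u = nxt
        return root

    def union(self, a, b):
        """Merge the sets represented by the roots a and b."""
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]

def betti_of_graph(vertices, edges):
    """Return (beta_0, beta_1) after adding all vertices, then all edges."""
    sets = DisjointSets()
    beta0 = beta1 = 0
    for u in vertices:            # every vertex creates a component
        sets.add(u)
        beta0 += 1
    for u, v in edges:
        a, b = sets.find(u), sets.find(v)
        if a == b:                # the edge closes a loop: it creates
            beta1 += 1
        else:                     # the edge joins two components: it destroys
            sets.union(a, b)
            beta0 -= 1
    return beta0, beta1

VERTICES = "123456789ABCD"
EDGES = ("12 13 16 17 19 1A 1C 1D 23 25 28 29 2A 2B 2D 35 36 37 38 3B "
         "3C 45 46 47 48 49 4A 4B 4C 4D 56 5D 67 78 89 9A AB BC CD").split()

print(betti_of_graph(VERTICES, EDGES))
```

On this input the sketch ends with one component and 27 independent loops, which agrees with the last entries of Table IV.1. Triangles and tetrahedra need a different test and are not handled here.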
Figure IV.13: In the first step, we glue two sides of the triangle, thus forming a cone with a seam. In the second step, we glue the seam along the rim of the cone (not shown).

Figure IV.14: A triangulation of the dunce cap.

and finally all triangles. After adding the thirteen vertices, we have $\beta_0 = 13$ and $\beta_1 = \beta_2 = 0$. The evolution of Betti numbers while adding the edges in lexicographic order is shown in Table IV.1. There are 27 triangles in the triangulation, each closing a tunnel and thus decrementing $\beta_1$. Indeed, no collection of triangles has zero boundary, which can be proved by observing that three edges belong to three triangles each and all other edges belong to two triangles each. The final result is therefore $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. Indeed, the dunce cap is connected, all its closed curves bound, and the surface formed by the triangles does not enclose any volume in $\mathbb{R}^3$.

  edge       12  13  16  17  19  1A  1C  1D  23  25
  $\beta_0$  12  11  10   9   8   7   6   5   5   4
  $\beta_1$   0   0   0   0   0   0   0   0   1   1

  edge       28  29  2A  2B  2D  35  36  37  38  3B
  $\beta_0$   3   3   3   2   2   2   2   2   2   2
  $\beta_1$   1   2   3   3   4   5   6   7   8   9

  edge       3C  45  46  47  48  49  4A  4B  4C  4D
  $\beta_0$   2   1   1   1   1   1   1   1   1   1
  $\beta_1$  10  10  11  12  13  14  15  16  17  18

  edge       56  5D  67  78  89  9A  AB  BC  CD
  $\beta_0$   1   1   1   1   1   1   1   1   1
  $\beta_1$  19  20  21  22  23  24  25  26  27

Table IV.1: Evolution of $\beta_0$ and $\beta_1$ while adding the edges of the triangulation in Figure IV.14.

in the latter case it destroys. All vertices create, but edges can create or destroy. For example, the edge in Figure IV.11 creates on the left and destroys on the right. To distinguish between the two cases, we maintain the components of the complex throughout the filtration using a union-find data structure, which represents a system of pairwise disjoint sets: the elements are the vertices and the sets are components of the complex at any moment in time. The data structure supports three types of operations:

  FIND($u$): return the set that contains vertex $u$.
  UNION($A$, $B$): substitute $A \cup B$ for the sets $A$ and $B$ in the system.
  ADD($u$): add $\{u\}$ as a new singleton set to the system.

The algorithm scans the filtration from left to right and classifies each vertex and each edge as either creating or destroying:

  for $i = 1$ to $m$ do
    case $\sigma_i$ is a vertex $u$: $\sigma_i$ creates; ADD($u$);
    case $\sigma_i$ is an edge $uv$: $A$ = FIND($u$); $B$ = FIND($v$);
      if $A = B$ then $\sigma_i$ creates
      else $\sigma_i$ destroys; UNION($A$, $B$)
      endif
  endfor.

Standard implementations of the union-find data structure take barely more than constant time per operation. To be more precise, let $A(n)$ be the extremely fast growing Ackermann function. Its inverse $\alpha(n)$ is extremely slow growing. To get a faint idea of how slow the inverse grows, we note that $\alpha(n)$ cannot be bounded from above by any constant, but it does not exceed a very small value unless $n$ is larger than the estimated number of electrons in the universe. Any sequence of $n$ operations takes time at most proportional to $n \, \alpha(n)$. For all practical purposes, this means that each operation takes only constant time.

Classifying triangles and tetrahedra. In three-dimensional Euclidean space, every tetrahedron destroys but triangles can destroy or create. Deciding whether or not a triangle belongs to a cycle is not quite as straightforward
as it is for an edge. However, with an extra assumption on the filtration, we can use the dual graph of the complement to classify triangles and tetrahedra the same way as we classified edges and vertices. The most convenient version of this assumption is that the last complex in the filtration, $K_m$, is a triangulation of $\mathbb{S}^3$. Think of $\mathbb{S}^3$ as the one-point compactification of $\mathbb{R}^3$. Given a Delaunay triangulation in $\mathbb{R}^3$, we can construct such a triangulation by adding a dummy vertex and connecting it to all boundary simplices of the Delaunay triangulation.

In $\mathbb{R}^3$ and also in $\mathbb{S}^3$, every closed surface bounds a volume. In other words, a triangle completes a 2-cycle iff it decomposes a component of the complement into two. We keep track of the connectivity of the complement through its dual graph, whose nodes are the tetrahedra and whose arcs are the triangles. Figure IV.15 illustrates this construction in two dimensions. Adding a triangle to the complex removes an arc from the dual graph [...] We therefore scan the filtration backward, so that nodes and arcs are only added:

    for $i = m$ downto $1$ do
      case $\sigma_i$ is a tetrahedron:
        $\sigma_i$ destroys, unless $i = m$, in which case it creates;
        ADD($\sigma_i$);
      case $\sigma_i$ is a triangle:
        let $\sigma_j$ and $\sigma_k$ be the tetrahedra that share $\sigma_i$;
        $A$ = FIND($\sigma_j$); $B$ = FIND($\sigma_k$);
        if $A = B$ then $\sigma_i$ destroys
        else $\sigma_i$ creates; UNION($A$, $B$)
        endif
    endfor.

The algorithm requires that each triangle is shared by two tetrahedra, but this is exactly what compactification does for us when it adds tetrahedra outside the boundary triangles of the Delaunay triangulation. The running time for classifying all triangles and tetrahedra is again proportional to $m \alpha(m)$.

Summary. The entire algorithm consists of three passes over the filtration:

1. a forward pass to classify all vertices and edges,
2. a backward pass to classify all triangles and tetrahedra,
3. a forward pass to compute the Betti numbers.

Figure IV.16 illustrates the result of the algorithm. In the first two passes, we maintain a union-find data structure, which takes time proportional to $m \alpha(m)$. The third pass does only a constant amount of work per step, namely incrementing or decrementing a counter. The total running time is therefore at most proportional to $m \alpha(m)$.

Bibliographic notes. The incremental algorithm for computing Betti numbers described in this section is taken from [2]. It exploits the fact that the connectivity of the complex determines the connectivity of the complement. This relation is a manifestation of Alexander duality, which is studied in algebraic topology [3, Chapter 3]. This algorithm has been implemented as part of the Alpha Shape software, which computes the Betti numbers of [...]
IV.3 Incremental Algorithm 65
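The forward pass that classifies vertices and edges can be sketched in a few lines. This is an illustration under stated assumptions rather than the Alpha Shape implementation: simplices are encoded as strings (one character per vertex), the function name `classify_vertices_and_edges` is hypothetical, and the union-find structure uses path halving without union by rank. The input is the 1-skeleton of the dunce cap triangulation, with the edges in the order of Table IV.1.

```python
def classify_vertices_and_edges(simplices):
    """Label each vertex or edge as creating or destroying.

    Vertices are one-character strings, edges two-character strings.
    Returns the labels together with the final beta0 and beta1.
    """
    parent = {}

    def find(u):
        # FIND with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    labels = []
    beta0 = beta1 = 0
    for simplex in simplices:
        if len(simplex) == 1:
            parent[simplex] = simplex       # ADD: new singleton set
            labels.append("creates")
            beta0 += 1
        else:
            u, v = simplex
            a, b = find(u), find(v)
            if a == b:
                labels.append("creates")    # the edge closes a loop
                beta1 += 1
            else:
                labels.append("destroys")   # the edge merges two components
                parent[a] = b               # UNION
                beta0 -= 1
    return labels, beta0, beta1


vertices = list("123456789ABCD")
edges = ("12 13 16 17 19 1A 1C 1D 23 25 28 29 2A 2B 2D 35 36 37 38 3B "
         "3C 45 46 47 48 49 4A 4B 4C 4D 56 5D 67 78 89 9A AB BC CD").split()
labels, beta0, beta1 = classify_vertices_and_edges(vertices + edges)
print(beta0, beta1)  # 1 27
```

The final values agree with the last column of Table IV.1; the 27 triangles then bring $\beta_1$ back to zero.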
IV.4 Matrix Algorithm

[...] with $k$-simplices $\sigma_1, \sigma_2, \ldots, \sigma_{c_k}$ and $(k-1)$-simplices $\tau_1, \tau_2, \ldots, \tau_{c_{k-1}}$. The $k$-th incidence matrix is

$$\begin{pmatrix} a_1^1 & a_1^2 & \ldots & a_1^{c_{k-1}} \\ a_2^1 & a_2^2 & \ldots & a_2^{c_{k-1}} \\ \vdots & \vdots & \ddots & \vdots \\ a_{c_k}^1 & a_{c_k}^2 & \ldots & a_{c_k}^{c_{k-1}} \end{pmatrix},$$

where $a_i^j = 1$ iff $\tau_j$ is a face of $\sigma_i$. Using this notation, we can write the $k$-th boundary homomorphism in matrix form:

$$\partial_k \sigma_i = \sum_{j=1}^{c_{k-1}} a_i^j \tau_j .$$

Recall that the $\sigma_i$ form a basis of the $k$-th chain group, $C_k$, and similarly the $\tau_j$ form a basis of $C_{k-1}$. The above formula thus expresses the boundary of every basis element of $C_k$ as a sum of basis elements of $C_{k-1}$. To make this interpretation of the incidence matrix useful for computing Betti numbers, we need to consider more general bases. These can be generated by performing elementary row and column operations:

    exchange row $i$ with row $j$;
    add row $i$ to row $j$;
    exchange column $i$ with column $j$;
    add column $i$ to column $j$.

Figure IV.17: The effect of elementary row and column operations on the bases of $C_k$ and $C_{k-1}$.

Adding row $i$ to row $j$ amounts to replacing $\sigma_j$ by $\sigma_j + \sigma_i$, and adding column $i$ to column $j$ amounts to replacing $\tau_i$ by $\tau_i + \tau_j$. (Since we deal with idempotent groups, subtraction is the same as addition.) Note that the effect is not symmetric: the basis of $C_k$ changes at the modified row, while the basis of $C_{k-1}$ changes at the modifying column.

Normal form algorithm. After a few elementary row and column operations, the matrix is no longer the $k$-th incidence matrix, but it still describes a correspondence between bases of $C_k$ and $C_{k-1}$. The matrix is in normal form if its non-zero entries are lined up along an initial segment of the main diagonal, as illustrated in Figure IV.18. We can use Gaussian elimination to transform the incidence matrix into normal form.

    for $\ell = 1$ to $\min\{c_k, c_{k-1}\}$ do
      if NONZERO($\ell$) then
        forall rows $i > \ell$ do
          if $a_i^\ell \neq 0$ then row $i$ = row $i$ + row $\ell$ endif
        endfor;
        forall columns $j > \ell$ do
          if $a_\ell^j \neq 0$ then col $j$ = col $j$ + col $\ell$ endif
        endfor
      endif
    endfor.

The algorithm uses a boolean function NONZERO that makes sure that during the $\ell$-th iteration the $\ell$-th diagonal entry, $a_\ell^\ell$, is non-zero. It does this by exchanging rows and columns. The function fails to make $a_\ell^\ell$ non-zero iff all entries in the remaining sub-matrix are zero.

[The pseudocode of NONZERO is not recoverable from the source; only the keywords row, else find, endif, endwhile and return survive.]

We use the phrase “assume without loss of generality” as a short-form for expressing that there is another case, namely [...], that can be handled symmetrically. The algorithm consists of three nested loops. Letting $n$ be the number of simplices, the running time is therefore at most proportional to $n^3$.

Deriving the Betti numbers. Suppose we have transformed all incidence matrices of $K$ into normal form. As illustrated in Figure IV.18, the $k$-th matrix has $c_k$ rows and $c_{k-1}$ columns. The zero-rows correspond to $k$-cycles, of which we have $z_k$ many. It follows that the number of non-zero entries along the main diagonal is $b_{k-1} = c_k - z_k$. The $k$-th Betti number is the rank of the $k$-th cycle group minus the rank of the $k$-th boundary group: $\beta_k = z_k - b_k$. We can thus derive the Betti numbers from the sizes and numbers of non-zero entries in the normal form matrices.

Figure IV.18: The normal form of the $k$-th incidence matrix.

We note that the ranks of the incidence matrices suffice for computing the Betti numbers and it is not necessary to go all the way to normal form. Either way, the running time of the algorithm is cubic in the number of simplices in the complex.

Integer coefficients. The matrix algorithm can be extended to coefficients in $\mathbb{Z}$ instead of $\mathbb{Z}_2$. Before discussing the necessary modifications, we talk about what this means in terms of adding simplices and chains. We start at the beginning.

An ordered $k$-simplex is an ordering of the vertices of a $k$-simplex, and we write $\sigma = [u_0, u_1, \ldots, u_k]$. Two ordered simplices have the same orientation if their orderings differ by an even number of transpositions. Each simplex has two orientations, except if it is a vertex, in which case it has only one. To set the stage, we give each simplex in $K$ an arbitrary but fixed orientation, and for a given oriented simplex $\sigma$, we write $-\sigma$ for the other orientation of the otherwise same simplex. A $k$-chain is a function from the $k$-simplices to the integers. It is convenient to write this function as a formal polynomial:

$$c = \sum_i n_i \sigma_i ,$$

where $n_i$ is the function value of $\sigma_i$. We add two $k$-chains componentwise, by adding the coefficients of like simplices:

$$\sum_i n_i \sigma_i + \sum_i m_i \sigma_i = \sum_i (n_i + m_i) \sigma_i .$$

By definition, the boundary of $\sigma = [u_0, u_1, \ldots, u_k]$ is the alternating sum of ordered $(k-1)$-simplices obtained by dropping one vertex at a time:

$$\partial_k \sigma = \sum_{i=0}^{k} (-1)^i [u_0, \ldots, \hat{u}_i, \ldots, u_k],$$

where the hat marks the deleted vertex. We can check that the boundary is independent of the ordering, as long as it belongs to the same orientation, and that it is the negative boundary for an ordering of the opposite orientation: $\partial_k (-\sigma) = -\partial_k \sigma$. Similarly, we can check that the Fundamental Lemma of Homology still holds: $\partial_{k-1} \partial_k \sigma = 0$. As before, we define the group of $k$-chains, $C_k$, the group of $k$-cycles, $Z_k$, and the group of $k$-boundaries, $B_k$. The $k$-th homology group is again $H_k = Z_k / B_k$, and the $k$-th Betti number is the rank of that homology group: $\beta_k = \mathrm{rank}\, H_k$.

Figure IV.19: A triangulated rectangular piece of paper glued to form a Klein bottle.

Torsion. A curious new phenomenon that arises with the use of integer addition is algebraic torsion. It does not occur for spaces that can be embedded in $\mathbb{R}^3$, so it is not part of people’s immediate experience. Maybe the simplest topological space whose homology groups have torsion is the Klein bottle. It can be constructed from a rectangular piece of paper by gluing opposite sides as shown in Figure
IV.19. Since it has torsion, we know that the Klein bottle cannot be embedded in $\mathbb{R}^3$, and when we draw it, we have to allow for a self-intersection. The 1-cycle marked around the neck of the bottle does not bound, but twice that 1-cycle bounds. This is what causes torsion. To describe the phenomenon more generally, we need the fact that every finitely generated abelian group is isomorphic to a direct sum (Cartesian product) of copies of $\mathbb{Z}$ and of cyclic groups:

$$G \cong \mathbb{Z}^{\beta} \oplus \mathbb{Z}_{t_1} \oplus \mathbb{Z}_{t_2} \oplus \ldots \oplus \mathbb{Z}_{t_q} .$$

Furthermore, we may require that all $t_i$ are larger than one and that $t_i$ divides $t_{i+1}$ for each $i$. This extra condition fixes $\beta$ and the indices $t_i$. The abelian group is thus the direct sum of a free subgroup, namely $\mathbb{Z}^{\beta}$, and the rest, namely $\mathbb{Z}_{t_1} \oplus \ldots \oplus \mathbb{Z}_{t_q}$, which is referred to as its torsion subgroup. The $t_i$ are the torsion coefficients. The rank of the group is the number of copies of $\mathbb{Z}$, which is $\beta$. For the Klein bottle, we have $\beta_0 = 1$, $\beta_1 = 2$, and $\beta_2 = 1$ for addition modulo 2 and $\beta_0 = 1$, $\beta_1 = 1$, and $\beta_2 = 0$ for integer addition. We thus get different Betti numbers for addition modulo 2 and for integer addition, but their alternating sums are both equal to the Euler characteristic: $\chi = \beta_0 - \beta_1 + \beta_2 = 0$. Indeed, the Euler-Poincaré Theorem is true independent of the type of coefficients we choose to define homology groups and Betti numbers.

Algorithm revisited. The normal form of a basis transition matrix is the same as before, except that we now allow entries in the main diagonal that are neither zero nor one. Specifically, the initial sequence of ones is followed by integers $t_1, t_2, \ldots, t_q$, all larger than one, such that $t_i$ divides $t_{i+1}$, for each $i$. We modify the above algorithm to transform the incidence matrix into normal form. First we extend the elementary row and column operations by allowing the multiplication of entire rows or columns by non-zero integers. A more substantial modification is needed within the function NONZERO, which now attempts to turn the next diagonal entry, $a_\ell^\ell$, into the smallest positive entry achievable by row and column operations. Unless the entire remaining sub-matrix is zero, this attempt will be successful and $a_\ell^\ell$ will divide every entry in the sub-matrix. To see this property, assume there is an entry $a_i^j$, with $i, j \geq \ell$, that is not an integer multiple of $a_\ell^\ell$.

If $i = \ell$ we get a positive integer smaller than $a_\ell^\ell$ in a single column operation. Symmetrically, if $j = \ell$ we get such a positive integer in a single row operation. Otherwise, we may assume that $a_\ell^\ell$ divides both $a_i^\ell$ and $a_\ell^j$, and we can make $a_i^\ell$ zero with a row operation. By adding row $i$ to row $\ell$ we keep $a_\ell^\ell$ unchanged and we change $a_\ell^j$ to $a_\ell^j + a_i^j$, which is not an integer multiple of $a_\ell^\ell$. Now we get a positive integer smaller than $a_\ell^\ell$ in a single column operation, as before. Since $a_\ell^\ell$ divides every entry in the remaining sub-matrix, it will also divide the future non-zero diagonal entries. Hence, the algorithm generates the desired normal form.

The running time of the algorithm is no longer guaranteed to be at most cubic in the number of simplices. Indeed, the sequence of operations is sensitive to the size of the integers that arise, and it is not even clear whether or not it is polynomial in the input size. As for $\mathbb{Z}_2$ coefficients, we can determine the homology groups directly from the normal forms of all incidence matrices. We get the rank of the $k$-th homology group from the $k$-th and the $(k+1)$-st normal form matrices: $\beta_k = z_k - b_k$. We get the torsion coefficients from the $(k+1)$-st normal form matrix: they are the diagonal entries that exceed one.

Bibliographic notes. The matrix algorithm presented in this section is taken from [2, Chapter 1]. The normal form it uses is sometimes referred to as the Smith normal form [3], and similarly, the algorithm is sometimes called the Smith normal form algorithm. For integer coefficients, it is unclear whether or not its running time is polynomial in the input size. However, it is possible to modify the algorithm to guarantee polynomial running time [1, 4]. The Betti numbers obtained for $\mathbb{Z}_2$ and $\mathbb{Z}$ (or other coefficient groups) are not necessarily the same, but their differences are predictable and described by the Universal Coefficient Theorem of Homology [2, Chapter 7].

[1] R. Kannan and A. Bachem. Polynomial algorithms for computing the Smith and Hermite normal forms of an integer matrix. SIAM J. Comput. 8 (1979), 499–507.
[2] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[3] H. J. Smith. On systems of indeterminate equations and congruences. Philos. Trans. 151 (1861), 293–326.
[4] A. Storjohann. Near optimal algorithm for computing Smith normal forms of integer matrices. In “Proc. Internat. Sympos. Symbol. Algebraic Comput., 1997”, 267–274.
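For coefficients modulo 2, the remark that ranks suffice can be illustrated directly. The sketch below is an assumption-laden illustration, not the algorithm of [2]: it stores each row of an incidence matrix as a Python integer bitmask, computes ranks by elimination over $\mathbb{Z}_2$, and combines them as $\beta_k = (c_k - \mathrm{rank}\,\partial_k) - \mathrm{rank}\,\partial_{k+1}$. The function names and the input format (every simplex listed as a tuple of vertex labels) are hypothetical.

```python
from itertools import combinations


def rank_mod2(rows):
    """Rank over Z_2 of a matrix whose rows are integer bitmasks."""
    pivots = {}  # position of the leading bit -> reduced row
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row
                break
            row ^= pivots[lead]  # addition modulo 2
    return len(pivots)


def betti_numbers_mod2(simplices):
    """Betti numbers of a complex given by the list of all its simplices."""
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(frozenset(s))
    top = max(by_dim)
    ranks = {}  # k -> rank of the k-th incidence matrix, which is b_{k-1}
    for k in range(1, top + 1):
        column = {tau: j for j, tau in enumerate(by_dim.get(k - 1, []))}
        rows = []
        for sigma in by_dim.get(k, []):
            bits = 0
            for face in combinations(sorted(sigma), k):
                bits |= 1 << column[frozenset(face)]
            rows.append(bits)
        ranks[k] = rank_mod2(rows)
    betti = []
    for k in range(top + 1):
        c_k = len(by_dim.get(k, []))
        z_k = c_k - ranks.get(k, 0)   # number of zero rows in normal form
        b_k = ranks.get(k + 1, 0)     # non-zero diagonal entries one level up
        betti.append(z_k - b_k)
    return betti


# Boundary of a tetrahedron, a triangulated 2-sphere.
sphere = [c for r in (1, 2, 3) for c in combinations("abcd", r)]
print(betti_numbers_mod2(sphere))  # [1, 0, 1]
```

Torsion is invisible to this computation; detecting it requires the integer normal form discussed above.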
Exercises

[...] the same homotopy type.

3. Joins and simplices. A tetrahedron can be defined as the join of two skew line segments in space. The halfway plane is parallel to both line segments and lies exactly halfway between them. Since the line segments are skew, the halfway plane separates the two line segments.

(i) Show that the halfway plane intersects the tetrahedron in a parallelogram.
(ii) Decomposing the line segments into $m$ and $n$ pieces implies a decomposition of the tetrahedron into $mn$ joins, which are smaller tetrahedra. Draw the decomposition and highlight the intersection with the halfway plane.

4. Stars and links. Let $K$ be the dual complex of a finite collection of balls in $\mathbb{R}^3$. Define the star of a vertex $u$ as the collection of simplices that contain $u$, and the link as the collection of faces of simplices in the star that do not belong to the star:

$$\mathrm{St}\, u = \{\sigma \in K \mid u \in \sigma\}, \qquad \mathrm{Lk}\, u = \{\tau \leq \sigma \in \mathrm{St}\, u \mid \tau \notin \mathrm{St}\, u\}.$$

(i) Show that $\mathrm{Lk}\, u$ is a complex, that is, every face of a simplex in the link also belongs to the link.

[...] triangles or higher-dimensional simplices. Let $n$ be the number of vertices and $m$ the number of edges. Use the language of homology groups to re-confirm the following formulas, which are well-known for simple graphs:

(i) [...] if the graph is a tree.
(ii) [...] if the graph is connected.
(iii) [...] in general.

7. Protein structure. Download a protein structure from the pdb database and use the Alpha Shape software to compute the Betti numbers of its van der Waals and its solvent accessible diagrams.
Chapter V
Shape Features

The topological analysis of spaces, as discussed in Chapter IV, is an important first step, but by itself is insufficient to appropriately characterize the shape of protein structures. To decide what is appropriate, we need to have a purpose. The goal we have in mind is understanding how proteins interact with each other and with other molecules. There is overwhelming evidence that interesting events in such interactions happen preferably in cavities, which are partially protected regions in the protein or molecular assembly, and that local shape complementarity plays a significant role in making such events happen. It appears that organic life is based on computations performed by dynamically matching the (changing) pieces of a three-dimensional puzzle. A statement like this needs to be accompanied by a series of disclaimers: not every interaction is based on shape complementarity; interactions that are based on shape complementarity are not entirely so; and the relevant shape complementarity is local and imperfect. In other words, the situation is hopelessly complicated.

Our goal in this chapter is to introduce mathematical and computational methods that allow us to start talking about the real problem in more precise terms. We do this by introducing three essentially new concepts. In Section V.1, we make an attempt to give a precise meaning to cavities in proteins. The main idea here is to combine the topological concept of a hole with a minimum amount of geometric information, and this information is the evolution of the shape under growth. In Section V.2, we return to homology groups and introduce the concept of topological persistence. It is a measure of how important a topological feature is during the evolution. We see this as a tool to cope with imperfections as it permits us to distinguish topological features from topological noise. In Section V.3, we make an attempt to give a precise meaning to interfaces between interacting molecules. We define an interface as a two-dimensional sheet separating the molecules. While this idea seems simple enough, the details are tricky and require that we use what we learned about pockets and topological persistence. Finally, in Section V.4, we illustrate the concepts using the Alpha Shape software and extensions.

V.1 Pockets
V.2 Topological Persistence
V.3 Molecular Interfaces
V.4 Software for Shape Features
Exercises
[...] representation of a molecule. Since the collection of balls is finite, the balls cannot cover the entire space, which implies that the complement of their union consists of one or more connected components. Exactly one component is unbounded (infinitely large), and all other components are voids. See Figure V.1 for an illustration of the definition in two dimensions.

Figure V.1: The union of disks has a single (shaded) void. The corresponding void in the dual complex consists of five triangles.

Recall that in Chapter II, we described a deformation retraction from the space-filling diagram to the dual complex. The plain existence of that retraction implies that for each void in the space-filling diagram we have a void in the dual complex that contains it. Indeed, we can reverse the deformation retraction to show that the two voids have the same homotopy type. Since the dual complex is a subcomplex of the Delaunay triangulation, we may think of each void in the dual complex as a collection of Delaunay tetrahedra that do not belong to the dual complex. The boundary of this collection is a collection of triangles in the dual complex. It bounds in the Delaunay triangulation but not in the dual complex. It follows that the boundary represents a homology class in the second homology group of the dual complex. Indeed, the boundaries of the voids form a basis of that homology group. Hence, $\beta_2$ is the number of voids in the dual complex, which is the same as the number of voids in the space-filling diagram.

Definition of pockets. A pocket generalizes the concept of a void by relaxing the requirement it be disconnected [...] what we do and do not call a pocket. According to this model, the center of each ball remains fixed and its radius at time $t$ is equal to the square root of $r^2 + t$, where $r$ is the original radius of the ball. We may think of the growth as pushing the points on the boundary of the space-filling diagram outwards, in the direction normal to the surface. Figure V.2 illustrates this view in two dimensions.

Figure V.2: The growing disks push the points on the boundary outwards, in normal direction. Following the vectors, the points in the shaded region have paths that end at Voronoi vertices.

In the interior of the Voronoi cells, the vector field is defined by the sweeping spheres. We extend it to the rest of space by using the circles that sweep out the Voronoi polygons and the intervals that sweep out the Voronoi edges. Starting at a point outside the space-filling diagram, we follow vectors and thus form a path that may or may not go to infinity. We define a pocket as a connected component of the set of points whose paths do not go to infinity. The points that flow to infinity form a single component, which we refer to as the outside. Each pocket is open where it borders the space-filling diagram and closed where it borders the outside. The latter set of points may formally be defined as the intersection of the pocket with the closure of the outside. Its connected components are open two-dimensional sets, which we refer to as the mouths of the pocket. Note that voids are pockets without mouths.
[...] the Voronoi diagram. There are ten cases distinguished by the dimension of the dual Delaunay simplex, $\sigma$, and the relative position of its orthocenter, $z$. We recall that $z$ is the point at which the affine hull of $\sigma$ intersects the affine hull of its dual in the Voronoi diagram.

Case M$_0$: $\sigma$ is a vertex and the orthocenter lies in the interior of the corresponding Voronoi cell. This cell is encountered at time $t = -r_i^2$, which is the moment when the $i$-th ball changes from imaginary to real radius.

Case $\sigma$ is an edge and $z$ lies in the interior of the corresponding Voronoi polygon. There are two generic sub-cases, both illustrated in Figure V.3.

    Case M$_1$: $z \in \sigma$. The two balls approach the Voronoi polygon from both sides, eventually touching it at $z$.
    Case C$_1$: $z \notin \sigma$. The two balls approach the polygon from the same side. At the moment they touch, the smaller ball breaks through the outer sphere and starts sweeping out the Voronoi cell on the other side of the polygon.

Figure V.3: The vertical lines are side views of polygons in space. The solid dot marks the orthocenter of the Delaunay edge. On the left, this edge intersects its dual Voronoi polygon, while on the right, it lies on one side of the polygon.

Case $\sigma$ is a triangle and $z$ lies in the interior of the corresponding Voronoi edge. There are three generic sub-cases, all illustrated in Figure V.4.

    Case M$_2$: $z \in \sigma$. The three balls completely surround the Voronoi edge before they touch at $z$.
    Case C$_2$: $z \notin \sigma$. Here we have two sub-cases depending on whether $z$ sees one or two edges from the outside. In the first case, the three balls touch the Voronoi edge at the same moment they encounter the Voronoi polygon dual to the visible edge. In the second case, the balls touch the edge at the same moment they encounter the two polygons and one cell dual to the two visible edges and the vertex they share.

Figure V.4: The thin solid lines represent polygons that meet along a common edge in space. That edge appears as a solid dot, which marks the orthocenter of the triangle. From left to right, the orthocenter lies inside the triangle, lies outside and sees one edge, lies outside and sees two edges and their shared vertex.

Case $\sigma$ is a tetrahedron. Its orthocenter is necessarily the corresponding Voronoi vertex.

    Case M$_3$: $z \in \sigma$. The four balls completely surround the Voronoi vertex before they reach it.
    Case C$_3$: $z \notin \sigma$. Here we have three sub-cases depending on whether $z$ sees one, two or three triangles from the outside. The four balls touch the Voronoi vertex at the same moment they touch the Voronoi edges, polygons and cells that correspond to the triangles, edges and vertices visible from $z$.

In Case C$_1$ and in the last sub-case each of Cases C$_2$ and C$_3$, $z$ sees a vertex of $\sigma$ from the outside. Assuming $z$ lies outside the space-filling diagram, this is only possible if the ball centered at that vertex is contained inside the union of the balls centered at the other vertices of $\sigma$. This is unlikely to happen for molecular data and usually indicates a measurement or modeling mistake.

Metamorphoses and collapses. In four of the ten cases, only one simplex is added to the dual complex, namely in
Cases M$_0$, M$_1$, M$_2$ and M$_3$. Consistent with the discussion in Chapter III, we call these operations metamorphoses, since they change the homotopy type. We will see shortly that the remaining six cases do not affect the homotopy type. They can be understood as inverses of the six types of collapses illustrated in Figure V.5.

Figure V.5: From left to right, top to bottom: collapsing a tetrahedron from a triangle, an edge and a vertex (23-, 13- and 03-collapse), collapsing a triangle from an edge and a vertex (12- and 02-collapse), and collapsing an edge from a vertex (01-collapse). In each case, the collapse removes the tetrahedron, the transparent triangles, the dashed edges, and the dotted vertices, if any.

Recall that a principal simplex is not a face of any other simplex in the complex. A proper face $\tau$ of a principal simplex $\sigma$ is free if all simplices that contain $\tau$ are faces of $\sigma$. Such a pair $(\tau, \sigma)$ defines a collapse, which is the operation that removes all simplices between and including $\tau$ and $\sigma$. Formally, the complex obtained from $K$ by collapsing the pair $(\tau, \sigma)$ is $K - \{\upsilon \mid \tau \leq \upsilon \leq \sigma\}$. It is convenient to specify the type using the dimensions $k = \dim \tau$ and $\ell = \dim \sigma$ and to talk about $(k, \ell)$-collapses, for $0 \leq k < \ell \leq 3$. With this notation, the changes in the dual complex described in Case C$_\ell$ are caused by inverses of $(k, \ell)$-collapses, for $0 \leq k < \ell$.

Each collapse can be realized as a deformation retraction that pushes a portion of $\sigma$’s boundary through $\sigma$ toward the remaining portion of the boundary. In the process, the retraction removes $\tau$ and all faces of $\sigma$ that contain $\tau$. Being a deformation retraction, the operation does not affect the homotopy type of the complex, and neither does its inverse.

Partial order. Using the classification into ten different operations, we may introduce a partial order on the Delaunay simplices, which we think of as a discretization of the flow along normal vectors. We are only interested in tetrahedra. As noted in Case C$_3$, if the orthocenter $z$ of a Delaunay tetrahedron $\sigma$ lies outside $\sigma$ then it sees either one, two or three of the triangles. For each triangle visible from $z$, we define $\sigma \prec \tau$, where $\tau$ is the tetrahedron on the other side of the shared triangle. To cover the case in which the triangle lies on the boundary of the Delaunay triangulation, we introduce a dummy tetrahedron, $\tau_\infty$, that represents the space outside the triangulation. By definition, its orthocenter is at infinity, so $\tau_\infty$ can only be a successor but not a predecessor of other tetrahedra. This is what we call a sink of the relation. The other sinks are the tetrahedra that contain their orthocenters; they define metamorphoses in the evolution of the dual complex.

Note that $\sigma \prec \tau$ implies that the square radius of the orthosphere of $\sigma$ is less than that of the orthosphere of $\tau$. If $\tau = \tau_\infty$, this is true because the orthoradius of $\tau_\infty$ is infinity, by definition. If $\sigma$ and $\tau$ are both (finite) Delaunay tetrahedra, this is true because their orthocenters are Voronoi vertices that lie on the same side of the plane separating $\sigma$ and $\tau$. As illustrated in Figure V.6, the two orthospheres intersect in a circle that lies in the separating plane and the orthocenter of $\tau$ is further from that plane than the orthocenter of $\sigma$. This implies that the square radius increases along every chain of the relation. Hence, $\prec$ is acyclic and its transitive closure is a partial order.

Figure V.6: Think of the triangles as projections of tetrahedra and the circles as projections of spheres. The centers of both (dotted) orthospheres lie on the right of the separating plane.

Pockets of dual complex. We are now ready to define and compute the pockets of the dual complex using the partial order over the tetrahedra. The ancestor set of a tetrahedron $\tau$ contains $\tau$, its predecessors, the predecessors of the predecessors, and so on:

$$A(\tau) = \{\tau\} \cup \{\sigma \mid \sigma \prec \ldots \prec \tau\}.$$
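Because the relation over the tetrahedra is acyclic, an ancestor set can be collected by a plain graph search along reversed arcs. The following is a minimal sketch under stated assumptions: the relation is given as a dictionary mapping each tetrahedron label to the list of its successors, and the names `ancestor_set`, `successors` and the label `"inf"` for the dummy tetrahedron are hypothetical.

```python
from collections import deque


def ancestor_set(successors, sink):
    """Return `sink` together with every tetrahedron that precedes it."""
    # Invert the relation so that we can walk from successors to predecessors.
    predecessors = {}
    for sigma, taus in successors.items():
        for tau in taus:
            predecessors.setdefault(tau, []).append(sigma)
    seen = {sink}
    queue = deque([sink])
    while queue:
        tau = queue.popleft()
        for sigma in predecessors.get(tau, []):
            if sigma not in seen:
                seen.add(sigma)
                queue.append(sigma)
    return seen


# Toy relation: a < b < inf, c < d, and e has the two successors b and c.
relation = {"a": ["b"], "b": ["inf"], "c": ["d"], "e": ["c", "b"]}
print(sorted(ancestor_set(relation, "inf")))  # ['a', 'b', 'e', 'inf']
print(sorted(ancestor_set(relation, "d")))    # ['c', 'd', 'e']
```

In the toy relation, the tetrahedron `e` belongs to two ancestor sets because it has two successors.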
We have seen that a tetrahedron can have more than one successor. It is also possible that it belongs to more than one ancestor set, although this is not the common case. The pockets in the dual complex are defined by the tetrahedra that neither belong to the dual complex nor to the ancestor set of $\tau_\infty$. Note that this is more conservative than collecting all tetrahedra outside $K$ that belong to ancestor sets of finite sinks. We compute the pockets in two steps:

Step 1. Collect the tetrahedra in $D - K - A(\tau_\infty)$.
Step 2. Partition this collection into components.

To collect the tetrahedra, we assume the Delaunay simplices are given in a list ordered by birth-time. As illustrated in Figure V.7, the relation over the tetrahedra is acyclic and goes monotonically from left to right. We [...] complex. Based on this adjacency information, we can compute the connected components using standard graph algorithms, such as depth-first search or union-find.

Computing mouths is similar to computing pockets, only one dimension lower.

Step 1. Collect the boundary triangles not in $K$.
Step 2. Partition this collection into components.

We may do the computation for individual pockets or for all pockets at once. In Step 1, we collect the triangles in $D - K$ that belong to exactly one pocket tetrahedron. In Step 2, we call two triangles adjacent if they share an edge that does not belong to $K$. Finally, we use the same standard graph algorithms to compute components.
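The two steps for pockets can be sketched as follows. This is an illustration only: it assumes that the ancestor set of the dummy tetrahedron has already been computed, and that adjacency is supplied as a list of pairs of tetrahedra (by analogy with the rule for mouths, pairs that share a triangle outside the dual complex; this reading is an assumption, since the corresponding sentence is incomplete in the source). All names are hypothetical.

```python
def pockets(tetrahedra, in_dual_complex, outside_ancestors, adjacent):
    """Group the pocket tetrahedra into connected components.

    tetrahedra        -- labels of all Delaunay tetrahedra
    in_dual_complex   -- set of tetrahedra that belong to the dual complex
    outside_ancestors -- ancestor set of the dummy tetrahedron
    adjacent          -- pairs of tetrahedra considered adjacent (assumption)
    """
    # Step 1: collect the tetrahedra that are neither in the dual complex
    # nor in the ancestor set of the dummy tetrahedron.
    members = [t for t in tetrahedra
               if t not in in_dual_complex and t not in outside_ancestors]

    # Step 2: partition the collection into components with union-find.
    parent = {t: t for t in members}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for s, t in adjacent:
        if s in parent and t in parent:
            parent[find(s)] = find(t)

    groups = {}
    for t in members:
        groups.setdefault(find(t), []).append(t)
    return sorted(sorted(group) for group in groups.values())


result = pockets(
    tetrahedra=["t1", "t2", "t3", "t4", "t5", "t6"],
    in_dual_complex={"t1"},
    outside_ancestors={"inf", "t2"},
    adjacent=[("t3", "t4"), ("t2", "t3"), ("t5", "t1")],
)
print(result)  # [['t3', 't4'], ['t5'], ['t6']]
```

Mouths could be computed with the same routine by passing triangles instead of tetrahedra and edge-sharing pairs as the adjacency.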
[...] for the corresponding filtration. The $K_i$ are the complexes that arise during the evolution and, in the generic case, any two contiguous complexes differ either by a metamorphosis or an anti-collapse. Each anti-collapse may be viewed as a sequence of metamorphoses in which the later simplices destroy the topological features created by the earlier [...] triangle creating a void and a tetrahedron filling the same. The life-time of this void is zero because the triangle and the tetrahedron are added at the same moment. We will see that even if a triangle and a tetrahedron are added at different moments, it is possible to decide in an unambiguous manner whether or not the tetrahedron destroys what the triangle created. If it does, then we are talking about a void with positive life-time, and we may interpret that life-time as a measure of significance of the void. We may also interpret it as a shape measure of the corresponding pocket.

Figure V.9: The region grows from two vertices, the two components merge twice, and the second merge creates a void that eventually disappears.

[...] Section IV.3 and depends on the effect on the Betti numbers: a $k$-simplex creates if its addition increases $\beta_k$ and it destroys if its addition decreases $\beta_{k-1}$.

Incremental algorithm revisited. We will formalize the idea of pairing creations with destructions by revisiting the incremental algorithm for Betti numbers presented in Section IV.3. We study the algorithm in terms of matrices of boundary homomorphisms. Recall that a single step in the algorithm adds a simplex $\sigma = \sigma_i$ to the complex $K_{i-1}$ to form $K_i$. Let the dimension of $\sigma$ be $k$. The only matrices affected by adding $\sigma$ to the complex are the ones of $\partial_k$ and of $\partial_{k+1}$, which are displayed in Figure V.10.

[Figure V.10: the matrix of $\partial_k$, with rows for $C_k$ and columns for $C_{k-1}$, and the matrix of $\partial_{k+1}$, with rows for $C_{k+1}$, columns for $C_k$, and a zero column for $\sigma$.]

The new column of the matrix of $\partial_{k+1}$ is zero because $\sigma$ is not a face of any $(k+1)$-simplex in $K_i$. Hence $b_k$, the rank of the $k$-th boundary group, is the same for $K_i$ as it is for $K_{i-1}$. On the other hand, $z_k$ may remain the same or it may increase.

Case $\sigma$ creates. Then $\sigma$ belongs to a $k$-cycle, which implies that its row in the matrix of $\partial_k$ can be zeroed out. We can thus write the Betti numbers of $K_i$ in terms of the ranks of various groups defined for $K_{i-1}$ as follows:

$$\beta_k(K_i) = (z_k + 1) - b_k = \beta_k(K_{i-1}) + 1, \qquad \beta_{k-1}(K_i) = z_{k-1} - b_{k-1} = \beta_{k-1}(K_{i-1}).$$
In words, the $(k-1)$-st Betti number remains unchanged and the $k$-th Betti number increases by one.

Case $\sigma$ destroys. Then $\sigma$ does not belong to a $k$-cycle. Its row in the matrix of $\partial_k$ can therefore not be zeroed out and we get a new non-zero entry in the normal form of that matrix. Hence,

$$\beta'_{k-1} = z_{k-1} - (b_{k-1} + 1) = \beta_{k-1} - 1, \qquad \beta'_k = z_k - b_k = \beta_k,$$

where $z_k$ and $b_k$ are the ranks of the $k$-th cycle and boundary groups before adding $\sigma$ and primes mark the values after. In words, the $(k-1)$-st Betti number decreases by one and the $k$-th Betti number remains unchanged.

The case analysis confirms that the incremental algorithm as described in Section IV.3 computes the Betti numbers correctly.

[...] row operations, columns in the matrix of $\partial_k$ correspond to individual $(k-1)$-simplices and rows represent $(k-1)$-cycles. When we add $\sigma$, we attempt to zero out its row from right to left. To describe how this is done, we call the column of the rightmost non-zero entry in a row its last column, and we assume a function LASTCOL that returns the index of the last column; it returns zero if that last column does not exist. Clearly, each row has at most one last column. Conversely, we maintain inductively that each column is last for at most one row. For example, this property is satisfied by the matrix in Figure V.11 before the shaded last row is added. After that addition, we use row operations to reinstate the property before adding the next row. To explain the algorithm, we let $i$ be the index of the row that corresponds to the new simplex $\sigma$. Given a column $j$, we also assume a function ROW that returns the index of the row, among the first $i-1$ rows, for which $j$ is the last column. It returns zero if the row is not defined.

boolean DOESCREATE (int i)
  while (j = LASTCOL(i)) ≠ 0 do
    if (i' = ROW(j)) ≠ 0
      then row i = row i + row i'
      else return FALSE
    endif
  endwhile;
  return TRUE.

After running Function DOESCREATE for the $i$-th row, that row is either zero, in which case the corresponding simplex $\sigma$ creates, or it has a unique last column, in which case $\sigma$ destroys.

[...] when simplices are added to the complex in the filtration, but to simplify matters here, we re-define time equal to the index. In other words, we say $\sigma_j$ is added at time $j$.

Figure V.12: The cycle group and its decompositions into solid $p$-persistent homology classes and dotted 0-persistent homology classes.

Keeping this convention in mind, we now define the $p$-persistent $k$-th homology group of $K_j$ as the cycle group divided by the boundary group at $p$ positions later in the filtration:

$$H_k^{j,p} = Z_k^j / (B_k^{j+p} \cap Z_k^j).$$

Taking the intersection of the boundary group with the cycle group is necessary for technical reasons to define the quotient group. Figure V.12 illustrates the difference
between the $p$-persistent homology group and the usual or 0-persistent homology group. The $p$-persistent $k$-th Betti number is the rank of the $p$-persistent $k$-th homology group: $\beta_k^{j,p} = \operatorname{rank} H_k^{j,p}$.

Interval property of persistence. We develop an intuitive picture of persistence using the distinction between creating and destroying simplices. Note that the number of creating $k$-simplices until position $j$ in the filtration is the rank of the cycle group: $z_k^j = \operatorname{rank} Z_k^j$. Similarly, the number of destroying $(k+1)$-simplices is the rank of the boundary group: $b_k^j = \operatorname{rank} B_k^j$. The Betti number is the surplus of creating versus destroying simplices: $\beta_k^j = z_k^j - b_k^j$. Because Betti numbers are non-negative, the creating $k$-simplices and destroying $(k+1)$-simplices are arranged like opening and closing parentheses in an expression, except that some closing parentheses may be missing at the end. In particular, every prefix contains at least as many creating $k$-simplices as destroying $(k+1)$-simplices. We can therefore pair them up and form vertex disjoint intervals, each starting at the position of a creating $k$-simplex and ending at the position of a destroying $(k+1)$-simplex (or extending to infinity if there are no destroying simplices left). We use intervals that are closed to the left and open to the right. The Betti number at position $j$ is then the number of intervals that contain $j$. Any arbitrary pairing creating vertex disjoint intervals has this property for Betti numbers. (Can you prove that?) In contrast, there is exactly one pairing that has the following stronger property for persistent Betti numbers:

INTERVAL PROPERTY. The $p$-persistent $k$-th Betti number at position $j$ is the number of intervals that simultaneously contain $j$ and $j+p$.

[Figure V.13: half-open intervals with a right-angled isosceles triangle drawn below each of them.]

We illustrate this property by drawing a right-angled isosceles triangle below every interval, as shown in Figure V.13. Each triangle is closed along the top and left edges but open along the hypotenuse. The $p$-persistent $k$-th Betti number of $K_j$ is represented by the point $(j, p)$ in the index-persistence plane. According to the Interval Property, it is the number of right-angled isosceles triangles that contain this point.

Pairing. The pairing of simplices to obtain intervals satisfying the Interval Property is done using Function DOESCREATE explained above. Specifically, each destroying $(k+1)$-simplex corresponds to a non-zero row in the matrix of $\partial_{k+1}$ and is paired with the $k$-simplex that corresponds to the last column in that row. Note that this $k$-simplex indeed creates, as is witnessed by the cycle represented by the row. The persistence of a pair $(\sigma_i, \sigma_j)$ is the time-lag between the additions of the two simplices to the complex in the filtration. In the assumed simplified case in which $\sigma_j$ is added at time $j$, the persistence is the difference between indices: $j - i$. This is the convention we used to generate Figure IV.16, which shows the persistent first Betti numbers of the space-filling diagram modeling the gramicidin protein.

[...] index scale for gramicidin. The index in the filtration varies from left to right and the persistence from back to front. Observe the large triangular plateau, which corresponds to the dominant tunnel that passes through gramicidin.
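The row reduction behind Function DOESCREATE and the counting rule of the Interval Property can be tried out on a small filtration. The following Python sketch is only an illustration under stated assumptions: it stores all dimensions in one table instead of one matrix per dimension, it represents a row modulo 2 as a set of column indices, and the names `pair_simplices` and `persistent_betti` are hypothetical, not taken from the text.

```python
from math import inf


def boundary(simplex):
    """Codimension-one faces of a simplex given as a sorted tuple of vertices."""
    if len(simplex) == 1:
        return []
    return [simplex[:i] + simplex[i + 1:] for i in range(len(simplex))]


def pair_simplices(filtration):
    """Pair creating with destroying simplices (time equals index).

    Returns (pairs, creators): pairs maps the index of a creating simplex
    to the index of the simplex that destroys what it created, and
    creators is the set of all creating indices.
    """
    index = {s: i for i, s in enumerate(filtration)}
    rows = {}      # reduced non-zero rows, keyed by row index
    row_of = {}    # plays the role of ROW: last column -> owning row
    pairs, creators = {}, set()
    for i, simplex in enumerate(filtration):
        row = {index[f] for f in boundary(simplex)}
        while row:                        # loop of DOESCREATE
            j = max(row)                  # LASTCOL(i)
            if j in row_of:               # ROW(j) is defined
                row ^= rows[row_of[j]]    # row i = row i + row i' (mod 2)
            else:
                break
        if row:                           # non-zero row: the simplex destroys
            j = max(row)
            rows[i], row_of[j] = row, i
            pairs[j] = i
        else:                             # zero row: the simplex creates
            creators.add(i)
    return pairs, creators


def persistent_betti(filtration, pairs, creators, k, j, p=0):
    """Number of intervals [a, b) of k-simplices containing j and j + p."""
    count = 0
    for a in creators:
        if len(filtration[a]) - 1 != k:
            continue
        b = pairs.get(a, inf)             # unpaired creators never die
        if a <= j and j + p < b:
            count += 1
    return count


# A triangle built up vertex by vertex, edge by edge, then filled in.
filtration = [("a",), ("b",), ("c",),
              ("a", "b"), ("b", "c"), ("a", "c"),
              ("a", "b", "c")]
pairs, creators = pair_simplices(filtration)
```

With `p=0` the second function returns ordinary Betti numbers, which gives a quick sanity check against a direct count of components and loops.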
V.3 Molecular Interfaces

The interface between two or more interacting molecules is the location of that interaction. In this section, we present a proposal for a surface or complex of surfaces that geometrically represents that interface. One of its applications is to display functions defined over the interface.

Interfaces without boundary. Our definition of a molecular interface is a formalization of two intuitions, namely that the best separation of two or more molecules is part of the Voronoi diagram and that the interesting portion of that separation is protected by a relatively tight seal. We will come back to the second intuition later and formalize the first intuition now.

Consider an assembly of molecules, each repre- [...] bi-chromatic polygons and their edges and vertices. Figure V.15 illustrates the definition by showing the interface of two collections of disks in the plane.

Figure V.15: The solid bi-chromatic edges form the interface of the two collections of disks. The dotted mono-chromatic edges show the rest of the Voronoi diagram.

Local structure. In the generic case, every edge belongs to three and every vertex to four Voronoi cells. This implies that for two colors, the interface has a particularly simple local geometric structure. An interface edge belongs to two cells of one and to one cell of the other color, and exactly two of the three polygons sharing the edge are bi-chromatic and thus belong to the interface. There are two types of interface vertices: those that belong to three cells of one and one cell of the other color and those that belong to two cells of each color. As illustrated in Figure V.16, the local neighborhood of both types of vertices is a topological disk. We conclude that in the generic case the interface is a 2-manifold, that is, a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$. By construction, that 2-manifold is orientable, with the cells of one color on one side and the cells of the other color on the other side.

Figure V.16: The shaded polygons and their edges belong to the interface. On the left, we have three cells of one and one cell of the other color. On the right, we have two cells of each color.

For three or more colors, the local structure of the interface can be more complicated because we may have tri-chromatic edges and tri- and four-chromatic vertices. For any two colors, we get a 2-manifold, but now these 2-manifolds [...]
Retraction. As defined above, the interface may go to infinity, which is sometimes a disadvantage. Our goal here is to shrink the interface back to where the molecules are sufficiently close to interact. It seems natural to do this with a distance threshold, but this would most certainly lead to the deletion of interior portions and produce fractured surfaces. We therefore shrink from outside in and use relative rather than absolute distance measurements to decide where to stop the process. In the first step, we retract the interface back to the multi-chromatic dual of the dual complex and its pockets. In the second step, we use topological persistence to shrink the interface even further. We will return to the second step later.

To describe the shrinking process, we consider the Delaunay triangulation $D$ of the collection of balls $B$. We have mono-chromatic vertices and mono- as well as multi-chromatic edges, triangles and tetrahedra. The interface as defined above is dual to the subset of multi-chromatic simplices in $D$. Note that the first step of the shrinking process is equivalent to removing all tetrahedra outside the dual complex that belong to the ancestor set of the dummy tetrahedron, which represents the space outside the Delaunay triangulation. We use 23-collapses to remove these tetrahedra. We simplify the algorithm by ignoring principal triangles, edges and vertices; in other words, we delete principal triangles, edges and vertices as soon as they arise. Let $K$ denote the dual complex.

void COLLAPSE ($\tau$, $\sigma$):
  if $\sigma \notin K$ and $(\tau, \sigma)$ is collapsible then
    forall faces $\tau \leq \upsilon \leq \sigma$ do delete $\upsilon$ endfor
  endif.

[...] part of an anti-collapse in the construction of the filtration and the collapse of $\tau$ and $\sigma$ renders the other simplices in this anti-collapse principal. This is equivalent to saying that the effect of the 23-collapse is the inverse of that anti-collapse. We define a retraction as a maximal sequence of collapses. In other words, we collapse as long as we can. In the implementation of this operation, we maintain a stack of candidate pairs. Initially, this stack contains all boundary triangles of the Delaunay triangulation together with their incident tetrahedra. During the process, we take pairs from the stack and add new pairs whenever we create new boundary triangles by collapsing.

Complex RETRACT:
  while the stack is non-empty do
    $(\tau, \sigma)$ = POP; COLLAPSE ($\tau$, $\sigma$)
  endwhile.

We may think of a retraction as successively removing sinks from an acyclic directed graph. It follows that the result of the operation is independent of the sequence in which the collapses are performed.

Clipping. The result of the retraction is the collection of tetrahedra in the dual complex together with the tetrahedra in the pockets. We further remove all mono-chromatic tetrahedra and let $T$ denote the remaining collection of multi-chromatic tetrahedra. The interface is now obtained as the dual of $T$. More specifically, for each bi-chromatic edge of the tetrahedra in $T$, we add the dual polygon to the interface. There are, however, complications because such a bi-chromatic edge may either be completely or only partially surrounded by tetrahedra in $T$. In the latter case, we clip the polygon before adding it to the interface. Figure V.17 illustrates this idea in two dimensions, but we should keep in mind that the situation in three dimensions is more complicated. A partially surrounded bi-chromatic edge corresponds to a polygon with two types of vertices: those dual to tetrahedra in $T$ and the others. We clip the polygon by cutting each edge connecting vertices of different types with the plane of the corresponding boundary triangle. If that plane does not intersect the dual Voronoi edge, which happens in rare cases, we clip at the endpoint that is closer to the plane. Finally, we connect the cut points in contiguous pairs and retain the portions of the polygon with vertices of the first type.

Figure V.17: The triangles drawn with solid edges are the bi-chromatic triangles constructed by the contraction algorithm. The boldface interface is dual to and clipped at the boundary of this collection.
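The stack discipline of RETRACT can be illustrated in a planar analogue, where boundary edges and their incident triangles take the roles of boundary triangles and tetrahedra. The Python sketch below rests on several assumptions that are not in the text: the function name `retract`, the argument `protected` (a stand-in for the dual complex), and the predicate `collapsible` (a placeholder for the geometric collapsibility test, which defaults to always true here) are all hypothetical. Principal edges and vertices are ignored, in the spirit of the simplification described above.

```python
from itertools import combinations


def edges_of(triangle):
    """The three edges of a triangle, each as a frozenset of two vertices."""
    return [frozenset(e) for e in combinations(sorted(triangle), 2)]


def retract(triangles, protected, collapsible=lambda edge, tri: True):
    """Collapse (edge, triangle) pairs from the outside in.

    triangles   -- iterable of 3-tuples of vertex labels
    protected   -- triangles that must stay (stand-in for the dual complex)
    collapsible -- hypothetical predicate replacing the geometric test
    Returns the set of surviving triangles as frozensets.
    """
    alive = {frozenset(t) for t in triangles}
    keep = {frozenset(t) for t in protected}
    cofaces = {}
    for t in alive:
        for e in edges_of(t):
            cofaces.setdefault(e, set()).add(t)
    # Initially the stack holds every boundary edge with its one triangle.
    stack = [(e, next(iter(ts))) for e, ts in cofaces.items() if len(ts) == 1]
    while stack:
        edge, tri = stack.pop()
        if tri not in alive or tri in keep or not collapsible(edge, tri):
            continue
        alive.discard(tri)
        for e in edges_of(tri):
            cofaces[e].discard(tri)
            if len(cofaces[e]) == 1:      # collapsing created a boundary edge
                stack.append((e, next(iter(cofaces[e]))))
    return alive


strip = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
kept = retract(strip, protected=[(1, 2, 3)])
```

In this simplified setting the survivors do not depend on the order in which pairs leave the stack, which mirrors the independence claim made for the retraction.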
Further retraction. We now take the shrinking process beyond the retraction from the dummy tetrahedron. Recall that the topological persistence algorithm of Section V.2 generates simplex pairs $(\tau, \sigma)$ with the property that $\sigma$ destroys what $\tau$ created. The dimension of $\sigma$ is one larger than that of $\tau$, but we are only interested in the case in which $\tau$ is a triangle and $\sigma$ is a tetrahedron. We think of the operation that removes $\tau$ and $\sigma$ as a generalization of a 23-collapse, but it is more complicated because $\tau$ is generally not a face of $\sigma$, although it can be. We do the operation only if $\tau$ is a boundary triangle of $T$ and $\sigma$ does not belong to the dual complex. We first delete [...] and then retract from [...]. As before, we remove principal triangles, edges and vertices as soon as they get created.

void REMOVE [...]:
  if [...] then delete [...];
    forall triangles [...] do PUSH [...] endfor;
    RETRACT
  endif.

Here, [...] is the tetrahedron that shares [...] with [...]. If the retraction from [...] reaches far enough, [...] gets deleted just because it becomes principal. However, it can happen that the retraction does not reach all the way, in which case we recurse for other pairs of simplices before deleting [...]. This is done implicitly during the retraction. To decide whether or not to remove $\tau$ and $\sigma$ in the first place, we compare their persistence with a constant threshold and remove only if [...]. Here, [...], and [...] are the moments when $\tau$ and $\sigma$ are born. Note that for [...] we have

[...] (V.1)

This monotonicity property is important for the correctness of the algorithm because if the retraction from $\tau$ does not reach $\sigma$ then this can only be because there is a triangle between $\tau$ and $\sigma$ that split the void created by $\tau$ before it was destroyed by $\sigma$. But then the other part of the void must have been destroyed by a tetrahedron preceding $\sigma$ in the filtration. In other words, [...], where [...] and [...] are the moments when [...] and [...] are born. The monotonicity guarantees that the simplices between $\tau$ and $\sigma$ are removed by recursive deletions so that $\sigma$ can eventually be deleted. We now restate the algorithm and simplify its description by declaring a 23-collapse as a special case of a removal. Because of our policy to delete principal triangles, edges and vertices, all other collapses can be ignored. The algorithm maintains a stack of triangle-tetrahedron pairs formed by the topological persistence algorithm. Initially, the stack contains all pairs with [...] on the boundary of the current set $T$. We may start with the set of all Delaunay tetrahedra.

Complex RETRACTMORE:
  while the stack is non-empty do
    [...] = POP;
    if [...] then REMOVE [...] endif
  endwhile.

As before, we get the interface by duality from the computed collection of tetrahedra. The running time is dominated by the topological persistence algorithm, which takes cubic time to form the triangle-tetrahedron pairs. With some care, we can implement the rest of the algorithm so it takes only constant time per simplex in the Delaunay triangulation.

We note that it is possible to use other functions that satisfy the monotonicity property (V.1). For example, we may bias the shrinking process against large triangles and tetrahedra by using [...]. A second potential advantage of this function over the inverse of the persistence is that it is dimensionless and thus amenable to the use of universally meaningful constant thresholds.

Global structure. Note that we may get different interfaces for different values of the threshold. Since a smaller threshold permits as many or more removals than a larger threshold, the interface shrinks with decreasing threshold. Indeed, if we use [...], we get a filtration [...] that is parametrized in a way similar to the sequence of alpha shapes. For [...], the interface is the original surface or complex defined by the set of bi-chromatic Voronoi polygons. For [...], the interface is empty, unless the dual complex of $B$ contains bi-chromatic triangles, which would remain. In this case, we can further decrease the interface by making the threshold negative, but we have to modify the retraction to allow for collapses of simplices in the dual complex. Eventually, for [...], the interface is guaranteed to be empty.

For a fixed threshold, the interface is a two-dimensional complex. Its two-dimensional elements are sheets defined by bi-chromatic Voronoi polygons. There are two kinds of one-dimensional elements: the original tri-chromatic curves and the new bi-chromatic curves outlining the sheet boundary created by shrinking. Finally, there are two kinds of zero-dimensional elements, namely the original four-chromatic vertices and the new tri-chromatic vertices forming the curve boundary created by shrinking. We take all sheets and curves as open sets so the complex is a collection of pairwise disjoint open elements. Note, however, [...]
[...] holes is

$$\chi = v - e + t = 2 - 2g - h,$$

where $v$, $e$ and $t$ are the numbers of vertices, edges and triangles of any arbitrary triangulation of the 2-manifold. Given a sheet, it is easy to compute its Euler characteristic $\chi$ and to determine its number of holes $h$. We then get the genus as $g = (2 - \chi - h)/2$. We may think of this manifold as obtained by punching $h$ holes into a $g$-fold torus.
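The computation described above is short enough to spell out. The Python sketch below assumes a connected, orientable, triangulated sheet whose boundary cycles are pairwise disjoint, and it uses the relation $\chi = 2 - 2g - h$; the function name `sheet_invariants` is hypothetical. Holes are counted as connected components of the edges that belong to exactly one triangle.

```python
from itertools import combinations


def sheet_invariants(triangles):
    """Return (chi, holes, genus) for a connected orientable sheet.

    triangles is an iterable of 3-tuples of vertex labels. The genus
    formula assumes chi = 2 - 2 * genus - holes.
    """
    tris = {frozenset(t) for t in triangles}
    vertices = set().union(*tris)
    edge_count = {}
    for t in tris:
        for e in combinations(sorted(t), 2):
            edge_count[e] = edge_count.get(e, 0) + 1
    chi = len(vertices) - len(edge_count) + len(tris)

    # Boundary edges lie in exactly one triangle; under the stated
    # assumptions each hole is one closed cycle of such edges.
    boundary = [e for e, c in edge_count.items() if c == 1]
    parent = {v: v for e in boundary for v in e}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for a, b in boundary:
        parent[find(a)] = find(b)
    holes = len({find(v) for v in parent})
    genus = (2 - chi - holes) // 2
    return chi, holes, genus


disk = [(0, 1, 2)]
sphere = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
```

A single triangle is a disk, that is, a sphere with one hole punched into it, and the boundary of a tetrahedron is a sphere without holes.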
[...] can be computed in the Alpha Shape software by first selecting [...] and [...] and second pushing the ‘Difference’ button in the scene panel. The results are not encouraging because a typically large number of inessential simplices clutters the view of important cavities. In contrast, the dual set of a pocket usually gives a clear indication of the cavity, as in Figure V.21. The interface also supports the [...]

[...] presence or absence of boundary triangles by the choice of color. The mouth regions are therefore visually easily identifiable. However, the internal connectivity of the pockets is not immediately visible, which may lead to confusion. For example, two pockets may appear connected but are not because of missing shared triangles. It is possible to visually inspect the connectivity by turning on the display of simplices of all dimensions in the scene panel, as shown in Figure II.17, and using the explosion function to separate all simplices. We observe the same phenomenon for the mouths of a pocket. Two boundary triangles that share a common edge may or may not belong to the same mouth depending on which shared edges belong to the pocket.
[...] the reverse of that for the molecule, whose dual complex is considerably smaller than a corresponding space-filling diagram.

Remember that pockets in the dual complex are not [...]

[...] ancestor sets of tetrahedra whose indices are larger than or equal to [...]. In other words, all tetrahedra [...], with [...], are treated like [...] in the computation of pockets. This elimination of large pockets helps in the exploration of detailed structures, such as side pockets of larger pockets. An example is shown in Figure V.24, which shows the pockets filling the system of narrow tunnels visible in the second view in Figure V.18, but with [...] set such that the system of wider tunnels visible in the third view of Figure V.18 is still open. The pockets thus only fill the remains of the narrow tunnels, and as can be seen in the first view, these remains are not connected.

[...] fifty-one proteins and their cavity structure. The most interesting outcome of that study is perhaps that in about 80% of the cases, the pocket with the largest volume is also the biologically active site of the molecule. In many instances, the largest pocket is assisted in its function by smaller auxiliary pockets in the vicinity. In another application, Liang and Dill [2] provide numerical evidence that proteins are packed tighter in the core than near the outside. The interface software has been developed by Yih-En (Andrew) Ban but is not yet complete. It is built on top of the Alpha Shapes software but requires a variety of additional features to be useful to biologists. Some of these features can be seen in visualizations of interfaces presented in this section.
2. Ancestor sets in the plane. Consider the Delaunay triangulation of a finite point set in $\mathbb{R}^2$. Write [...] if the two Delaunay triangles share an edge and both orthocenters lie on [...]’s side of that edge.

  (i) Prove that [...] is a partial order.
  (ii) Prove that the ancestor sets of any two different sinks in the order are disjoint.

[...] sequence of collapses that reduces it to a single vertex. Clearly, if $K$ is collapsible then its underlying space is contractible.

  (i) Prove that if $K$ is embedded in [...] then $K$ is collapsible iff its underlying space is contractible.
  (ii) Give an example of a simplicial complex embedded in [...] that is not collapsible but whose underlying space is contractible.

4. Barycentric subdivision. Let $K$ be a simplicial complex and let [...] denote its barycentric subdivision.

  (i) Show that each [...]-simplex in [...] gives rise to [...] [...]-simplices in [...], for [...].
  (ii) Prove that the Euler characteristic of [...] and [...] are the same.

[...]

  (i) Given a pairing, let [...] be the sum of lengths of the pairs. Prove that [...].
  (ii) Prove that [...] depends on the given sequence but not on the pairing.

7. Sperner’s Lemma. Let [...] be a triangle and [...] a triangulation of [...]. The label of a vertex in that [...] number of triangles with three different labels is odd.

  (iii) What would be a natural generalization of these results from a triangle to a tetrahedron?

8. 2-manifolds. Recall that a 2-manifold is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$.

  (i) Show that a two-dimensional simplicial complex in which every edge belongs to exactly two triangles is not necessarily a 2-manifold.
  (ii) Show that a simplicial complex in which the closed star of every vertex is the triangulation of a disk is necessarily a 2-manifold.
Density Maps
[...] and smooth functions over these manifolds. The primary goal is to find out about the topological type of the manifolds through a differential analysis of the functions. The standard introductory example is the torus embedded in upright position in $\mathbb{R}^3$ and the height function this embedding defines. Formally, $h: \mathbb{M} \to \mathbb{R}$ is defined by mapping each point $x \in \mathbb{M}$ to its distance from the $x_1 x_2$-plane. For each $a \in \mathbb{R}$, we consider the set of points with height less than or equal to $a$,

$$\mathbb{M}_a = \{ x \in \mathbb{M} \mid h(x) \leq a \}.$$

[Figure VI.1: the upright torus with the points $p$, $q$, $r$, $s$ at heights $h(p)$, $h(q)$, $h(r)$, $h(s)$; passing them attaches a 0-cell, a 1-cell, a 1-cell and a 2-cell.]

It is instructive to look at the evolution of the homotopy type of $\mathbb{M}_a$. A $k$-cell, $e^k$, is a space homeomorphic to the $k$-dimensional ball, $\mathbb{B}^k$. Each time the homotopy type of $\mathbb{M}_a$ changes, we can interpret this event as attaching a cell of some dimension. The evolution of the torus during the sweep and the interpretation of attaching cells is illustrated in Figure VI.1. To define what attaching a cell exactly means, note that the boundary of $e^k$ is a $(k-1)$-sphere, $\mathbb{S}^{k-1}$. The attachment of $e^k$ to a space $X$ requires a continuous map $g: \mathbb{S}^{k-1} \to X$, which we refer to as the gluing map. Then $X$ with $e^k$ attached by $g$ is the space obtained by identifying every point $x \in \mathbb{S}^{k-1}$ with $g(x)$. All interior points [...]

[...] press these restrictions.

A map from an open set $U \subseteq \mathbb{R}^k$ to another open set $V \subseteq \mathbb{R}^\ell$ is smooth if the partial derivatives of all orders exist and are continuous. For general and not necessarily open sets $X$ and $Y$, the map $f: X \to Y$ is smooth if for every $x \in X$ there exists an open set $U$ containing $x$ and a smooth map $F: U \to \mathbb{R}^\ell$ that coincides with $f$ throughout $U \cap X$. Note that the com- [...] is called [...] morphism [...], and its inverse [...] a coordinate system on [...]. As an example we may consider the 2-sphere [...]. We can cover [...] with six open hemispheres defined by [...] for [...]. As shown in Figure VI.2, each hemisphere can be parametrized by orthogonal projection to one of the coordinate planes. For a point $x \in \mathbb{M}$, we can construct a $d$-dimensional hyperplane in $\mathbb{R}^n$ that best approximates $\mathbb{M}$ near $x$. The tangent space at $x$ is the $d$-dimensional hyperplane through the origin of $\mathbb{R}^n$ that is parallel to this best approximating hyperplane. The elements of the vector space $T\mathbb{M}_x$ are called tangent vectors to $\mathbb{M}$ at [...]

Figure VI.2: The upper open hemisphere is parametrized by projection to the $x_1 x_2$-plane.
VI.1 Smooth vs. Piecewise Linear 91
[...] thus an element of $T\mathbb{M}_x$.

Critical points. The homotopy type of the partial torus $\mathbb{M}_a$ changes when $a$ passes the height value of the points $p$, $q$, $r$, and $s$ marked in Figure VI.1. These are the points with horizontal tangent planes. Assuming a local coordinate system $(x_1, x_2, \ldots, x_d)$ in a neighborhood, a point $x$ is a critical point of $h$ if all derivatives vanish,

$$\frac{\partial h}{\partial x_1}(x) = \frac{\partial h}{\partial x_2}(x) = \ldots = \frac{\partial h}{\partial x_d}(x) = 0.$$

If $x$ is a critical point then $h(x)$ is a critical value. Non-critical points and non-critical values are also referred to as regular points and regular values.

Just like the first derivative can be used to compute the best linear approximation to $h$, the second derivative can be used to compute the best quadratic approximation. Specifically, the Hessian of $h$ at $x$ is the matrix of second derivatives,

$$H(x) = \left( \frac{\partial^2 h}{\partial x_i \, \partial x_j}(x) \right)_{1 \leq i, j \leq d}.$$

A critical point $x$ is non-degenerate if $H(x)$ is non-singular, that is, $\det H(x) \neq 0$. Non-degenerate critical points are isolated, which means there is an open neighborhood without other critical points. We call $h$ a Morse function if all critical points are non-degenerate.

A quadratic function in two variables has only three types of critical points, maxima, saddles, and minima. The origin is a critical point for every possible assignment of signs to $h(x_1, x_2) = \pm x_1^2 \pm x_2^2$, and it is a maximum for $-x_1^2 - x_2^2$, a saddle for $-x_1^2 + x_2^2$ or $x_1^2 - x_2^2$, and a minimum for $x_1^2 + x_2^2$. The saddle is the most interesting case of the three because a circle drawn around it has two peaks alternating with two pits. In contrast, a circle drawn around a regular point has only one peak and one pit. Critical points with small circles that oscillate more often than twice are necessarily degenerate.

[...] dimension of the manifold $\mathbb{M}$. Assuming the Hessian is non-singular, all eigenvalues are non-zero. The index of $h$ at a non-degenerate critical point is the number of negative eigenvalues and is denoted as [...]. Recall that the eigenvectors define an orthogonal coordinate system in the [...] vector directions along which $h$ decreases. For example, the indices of the critical points $p$, $q$, $r$, and $s$ in Figure VI.1 are 0, 1, 1, and 2. This fact is also expressed in the lemma of Morse.

MORSE LEMMA. Let $x$ be a non-degenerate critical point with index $\lambda$ of $h$. There is a neighborhood $U$ of $x$ and a local coordinate system $(y_1, y_2, \ldots, y_d)$ in $U$ with $y_i(x) = 0$ for all $i$ and

$$h = h(x) - y_1^2 - \ldots - y_\lambda^2 + y_{\lambda+1}^2 + \ldots + y_d^2$$

throughout $U$.

Note that the dimensions of the cells attached to the evolving torus in Figure VI.1 are equal to the indices of the corresponding critical points. This is generally the case because a critical point with index $\lambda$ connects to the past along $\lambda$ directions. These directions span a $\lambda$-dimensional cell needed to realize the connections.

Degenerate critical points. A 1-dimensional manifold is a closed curve. A connected open subset is an open interval, which is homeomorphic to $\mathbb{R}$. Consider the height function $h: \mathbb{R} \to \mathbb{R}$ defined by $h(x) = x^3$. The derivative vanishes at 0. The second derivative vanishes too, $h''(0) = 0$, which identifies 0 as a degenerate critical point. Geometrically, the degeneracy is manifested by the fact that an arbitrarily small perturbation can remove the critical point or turn it into two non-degenerate ones, a maximum and a minimum. Figure VI.3 illustrates the instability of the degenerate critical point.

Figure VI.3: [...] The middle function has a degenerate critical point at 0, which is unfolded in different ways by the other two functions.

A similar degenerate critical point exists for the monkey saddle shown in Figure VI.4. It may be specified as the graph
of $h(x_1, x_2) = x_1^3 - 3 x_1 x_2^2$, which is the real part of $(x_1 + \mathrm{i} x_2)^3$. As we go around a circle centered at the origin, the function has three peaks at $0$, $2\pi/3$, and $4\pi/3$, and three pits at $\pi/3$, $\pi$, and $5\pi/3$. The only critical point is $(0, 0)$. The matrix of second derivatives at that point is

$$\begin{pmatrix} 6 x_1 & -6 x_2 \\ -6 x_2 & -6 x_1 \end{pmatrix},$$

which is zero at 0.

[...] For example for the torus we get $\#\mathrm{minima} - \#\mathrm{saddles} + \#\mathrm{maxima} = 0$. In words, for every minimum and maximum we get exactly one (non-degenerate) saddle point, no matter what Morse function we use. For the sphere we get $\#\mathrm{minima} - \#\mathrm{saddles} + \#\mathrm{maxima} = 2$. This implies that every Morse function of the sphere has at least two (non-degenerate) critical points. A minimum example is the ordinary height function, which has a minimum at the south-pole and a maximum at the north-pole.
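The notions of index and degeneracy, and the alternating count of critical points for the torus and the sphere, can be checked numerically in the two-dimensional case. The Python sketch below is an illustration only: the names `classify`, `euler_characteristic` and `monkey_hessian` are hypothetical, the closed-form eigenvalues apply to symmetric 2-by-2 matrices only, and the alternating sum of $(-1)^{\mathrm{index}}$ is assumed to be the relation the passage alludes to.

```python
import math


def classify(hessian, tol=1e-12):
    """Index of a critical point from its symmetric 2x2 Hessian.

    Returns the number of negative eigenvalues, or None if the Hessian
    is singular, that is, if the critical point is degenerate.
    """
    (a, b), (_, c) = hessian
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    eigenvalues = (mean - radius, mean + radius)
    if any(abs(ev) <= tol for ev in eigenvalues):
        return None
    return sum(1 for ev in eigenvalues if ev < 0)


def euler_characteristic(indices):
    """Alternating count of non-degenerate critical points by index."""
    return sum((-1) ** i for i in indices)


def monkey_hessian(x1, x2):
    """Second derivatives of x1^3 - 3 x1 x2^2 at the point (x1, x2)."""
    return ((6 * x1, -6 * x2), (-6 * x2, -6 * x1))


minimum = classify(((2, 0), (0, 2)))      # x1^2 + x2^2
saddle = classify(((2, 0), (0, -2)))      # x1^2 - x2^2
maximum = classify(((-2, 0), (0, -2)))    # -x1^2 - x2^2
monkey = classify(monkey_hessian(0, 0))   # degenerate at the origin
torus_chi = euler_characteristic([0, 1, 1, 2])
sphere_chi = euler_characteristic([0, 2])
```

The index lists passed to `euler_characteristic` are those of the height function on the upright torus and on the sphere.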
[...] but there are others that are not. For example, for $h(x_1, x_2) = x_1^2$ the entire $x_2$-axis is critical, but none of its points are isolated. Similarly, if we lay down the torus on its side, the height function has a circle of minima and another circle of maxima.

Bibliographic notes. The original development of Morse theory from its variational background is described by Morse [3] and by Seifert and Threlfall [4]. Milnor’s later book [2] emphasizes the topological analysis of manifolds and has since become a standard reference in Morse theory. Good introductory texts to the related subject of differential topology are the books by Guillemin and Pollack [1] and by Wallace [6]. A good introduction to linear algebra including an intuitive discussion of eigenvalues and eigenvectors is the book by Strang [5].

[5] G. STRANG. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.

[6] A. WALLACE. Differential Topology. First Steps. Benjamin, New York, 1968.
VI.2 Morse-Smale Complexes

In this section, we introduce the gradient of a Morse function and use it to construct the $k$-cells whose inductive attachment reproduces the evolution of the homotopy type of $\mathbb{M}_a$, for continuously increasing real threshold $a$.

Gradient flow. The gradient of a linear map $f: \mathbb{R}^d \to \mathbb{R}$ is the vector $\nabla f = (\partial f / \partial x_1, \partial f / \partial x_2, \ldots, \partial f / \partial x_d)$. It is the projection of a normal vector of the graph of $f$ and points in the direction of the steepest ascent. The same concept can also be defined for a Morse function $h: \mathbb{M} \to \mathbb{R}$. Assuming an orthonormal local coordinate system at $x$, the gradient of $h$ is $\nabla h(x) = (\partial h / \partial x_1 (x), \partial h / \partial x_2 (x), \ldots, \partial h / \partial x_d (x))$, same as for linear maps. We can define it also without reference to a coordinate system. A vector field, $X$, maps every point $x \in \mathbb{M}$ to a tangent vector $X(x) \in T\mathbb{M}_x$. The gradient is the particular vector field that satisfies $\langle X, \nabla h \rangle = X[h]$, for every vector field $X$, where $X[h]$ is the directional derivative of $h$ along $X$. For example, if we have a smooth curve $\gamma: \mathbb{R} \to \mathbb{M}$ with velocity vector $\dot{\gamma}$ then the derivative of $h \circ \gamma$ can be computed using the gradient as

$$\frac{\mathrm{d} (h \circ \gamma)}{\mathrm{d} s} = \langle \dot{\gamma}, \nabla h \rangle.$$

The gradient vanishes precisely at all critical points of $h$. If we start at a regular point and follow the gradient we trace out a path, which is a solution to the ordinary differential equation $\dot{\gamma}(s) = \nabla h(\gamma(s))$. This path is called an integral line. It depends smoothly on the initial condition, which is its regular starting point. Two integral lines can therefore not cross. Neither can an integral line fork, and because we can reverse the gradient vector field by considering $-h$, two integral lines can also not merge. The patterns of integral lines in the neighborhoods of a regular and several critical points on a smooth 2-manifold are shown in Figure VI.5.

Figure VI.5: From left to right, the flow in the neighborhoods of a regular point, a minimum, a saddle, and a maximum.

Stable manifolds. Every regular point belongs to an integral line, and two maximal integral lines are either disjoint or the same. Every maximal integral line is open at both ends and thus a map of an open interval or, equivalently, of the real line, $\gamma: \mathbb{R} \to \mathbb{M}$. It approaches two critical points, which we refer to as its origin and destination, $\mathrm{org}\, \gamma = \lim_{s \to -\infty} \gamma(s)$ and $\mathrm{dest}\, \gamma = \lim_{s \to \infty} \gamma(s)$. It is convenient to consider each critical point as an integral line by itself so that the collection of integral lines partitions $\mathbb{M}$. The stable manifold of a critical point is the union of integral lines with that point as destination and, symmetrically, the unstable manifold is the union of integral lines with that point as origin.

The stable manifold of a minimum is the minimum itself. The stable manifold of a saddle is an open curve, which is the union of two integral lines and the saddle itself. In a 2-manifold $\mathbb{M}$, the stable manifold of a maximum is an open disk, which is the union of a circle of integral lines and the maximum itself. All three cases are illustrated in Figure VI.6. Note that the dimension of each stable manifold is the index of the critical point that defines it.

Figure VI.6: From left to right, the stable manifold of a minimum, a saddle, and a maximum of a two-dimensional Morse function.

Each stable manifold is the injective image of an open ball. However, as indicated by the examples in Figure VI.6, the closure of a stable manifold is not necessarily homeomorphic to a closed ball. Nevertheless, the closure of each stable manifold is the union of (open) stable manifolds. The collection of stable manifolds thus satisfies the two conditions of an open complex: its cells partition $\mathbb{M}$ and the boundary of every cell is a union of other cells. By symmetry, everything we said about stable manifolds is also true for unstable manifolds. The dimension of the unstable manifold of a critical point is the co-dimension of the stable manifold.
Morse-Smale functions. We may refine the complexes Shape of Morse-Smale cells. Note that all 2-cells in
of stable and unstable manifolds by forming unions of Figure VI.7 have four sides, provided we count an arc
integral lines that agree on both limiting critical points. twice if it bounds the cell on both sides. In other words,
This amounts to overlaying the two complexes. In do- all two-dimensional Morse-Smale cells are quadrangles.
ing so, it is convenient to assume that the stable and un-
Q UADRANGLE L EMMA . Every 2-cell of a two-dimen-
*
stable manifolds intersect in a generic manner. To ex-
sional Morse-Smale complex is a quadrangle.
. The intersection is transversal
plain what this means, we consider a point common to
/ /
*
and
span the tangent P ROOF. The vertices of a 2-cell alternate between saddles
/
at if the tangent spaces and
space . Equivalently, the dimension of the intersec-
and other critical points, and the non-saddles alternate be-
/ /
tween minima and maxima. Any such cyclic sequence has
tion of the two tangent spaces is
.
length , for . We take two copies of a -gon and
whose stable and unstable manifolds intersect only remain maxima. The result is a topological 2-sphere with
transversally. For example, the height function of the up- minima and maxima. The Euler characteristic of the
right torus in Figure VI.1 is Morse but not Morse-Smale 2-sphere is , which implies .
because the stable 1-manifold of the upper saddle, , meets
The 3-cells of a Morse-Smale complex may have the
the unstable 1-manifold of the lower saddle, , along en-
structure of a cube, but they can also assume more gen-
tire one-dimensional
integral
lines,
. Morse-Smale
eral shapes with arbitrarily many saddles alternating be-
functions are again dense in the set of maps from to
tween index-1 and index-2 separating the minimum from
the maximum. The common features of all 3-cells are that
. In the case of the upright torus, it suffices to tilt it ever
they have one minimum and one maximum, and all 2-cells
so slightly sideways in order to get transversality. Assum-
in the boundary are quadrangles. A few examples of 3-
ing a Morse-Smale function, we define the Morse-Smale
cells are shown in Figure VI.8.
complex as the collection of connected components of in-
tersections of stable and unstable manifolds. We can see in
Figure VI.7 that it is indeed necessary to take components.
4
Smale function. The two bold 2-cells share the same origin and
*
We need some definitions to explain the linear interpo-
destination.
* 4
lation. Each point of a triangle is a convex com-
bination of the three vertices, with
the lower stars of a regular point, minimum, saddle, maxi-
Figure VI.9: Portion of a triangulated surface of a molecule.
mum, and -fold saddle are , , , , and . It follows
immediately that is the number of minima and maxima
minus the number of saddles counted with multiplicity.
and . The three parame-
4
all , we sort the vertices in the order of increas-
*
ing height. Indexing the vertices accordingly, we define
4 are , which implies that the is a simplicial complex. The sequence of complexes
and
linearly interpolated agrees with the value specified at 4 .
let by Schey [3]. The transversality condition for stable
and unstable manifolds has its origin in dynamical system
and is named after Steve Smale [4]. The Morse-Smale
complex has been introduced recently in [2] along with
algorithms for piecewise linear height functions over 2-
It is convenient to assume pairwise different height values
manifolds. The idea of writing a triangulated manifold as
at all vertices so that each simplex belongs to exactly one
lower star. With this assumption, the lower stars partition
the complex . Figure VI.10 illustrates the definitions by
the disjoint union of lower stars goes back to Banchoff [1].
showing the lower stars of vertices that behave like regular [1] T. F. BANCHOFF . Critical points and curvature for embed-
ded polyhedra. J. Differential Geometry 1 (1967), 245–256.
points, minima, saddles, and maxima. More complicated
lower stars are possible, and we cannot remove them just [2] H. E DELSBRUNNER , J. H ARER AND A. Z OMORODIAN .
by perturbing the height values. Instead, we may consider Hierarchy of Morse-Smale complexes for piecewise linear
a vertex whose circle of neighbors alternates
Bibliographic notes.
Exercises
The credit assignment reflects a subjective assessment of
difficulty. Every question can be answered using the ma-
terial presented in this chapter.
1. Section of triangulation. (2 credits). Let be a
triangulation of a set of points in the plane. Let
@
be a line that avoids all points. Prove that intersects
at most edges of and that this upper bound
is tight for every
.
Chapter VII
As a general theme in biology, questions are almost always about populations and rarely about individuals. This is particularly true on the molecular level. The molecules that participate in the mechanism of life tend to be large and composed of small molecules. Minor variations in the type or arrangement of the components are frequently inessential and do not alter the role of a molecule within the larger organization. But then again, there are seemingly small variations that do have significant consequences. The underlying question is one of definition: when do we call two molecules the same or of the same type, and how do we quantify and assess that notion of sameness? There are various approaches to the question applied to proteins, including the comparison of amino acid sequences, space curves modeling backbones, and shapes formed by space-filling diagrams. Instead of asking how similar two shapes are, we may also ask the related question of how well two shapes fit side by side. The complementarity question is a similarity question between one shape and (a portion of) the complement of another shape. It really makes sense only for space-filling diagrams and does not seem to apply to information expressed in terms of sequences and space curves. The similarity question is at the core of human understanding, which crucially relies on classification to simplify and create order. The complementarity question, on the other hand, is at the root of natural and other reproduction processes and it takes part in protein interaction, which forms the basis of functioning life.

As always in this book, we focus on mathematical and algorithmic methods that shed light on the broader biological issues. In Section VII.1, we explore rigid motions in three-dimensional Euclidean space and introduce quaternions as a tool to specify and compute with rotations. In Section VII.2, we study the problem of finding the best rigid motion for matching one point set with another. The measure of choice is the root mean square distance between the two sets. In Section VII.3, we look at the related problems of sampling a rigid motion and of covering the space of such motions with small neighborhoods. In Section VII.4, we apply the methods to questions of similarity and complementarity. In particular, we look at the problem of identifying matching subsequences with minimum root mean square distance and at score functions that assess the shape complementarity of two space-filling diagrams.

VII.1 Rigid Motions
VII.2 Optimum Motion
VII.3 Sampling and Covering
VII.4 Alignment
Exercises
VII.1 Rigid Motions can be obtained by a sequence of three rotations about co-
ordinate axes. In general, the composition of any two ro-
A motion in three-dimensional Euclidean space can be de- tations is another rotation. Indeed, the rotations form the
so-called special orthogonal group of 3-by-3 matrices, ab-
composed into a rotation and a translation. In this section
we consider different ways to mathematically represent ro- breviated as SO . Note, however, that this group is not
tations, and we focus on quaternions, which provide a par- abelian because the multiplication of matrices and there-
ticularly elegant mathematical framework. fore the composition of rotations is not commutative. It is
important to specify the Euler angles in a fixed sequence
as other sequences of the same angles usually specify dif-
ferent rotations. It is mostly true that two different triplets
Rotation and translation. A rigid motion in is an
*
orientation-preserving isometry of three-dimensional Eu- of angles specify different rotations, but there are excep-
tions. Consider for example a rotation by about the -
*
clidean space. More formally, it is a map
FH*
F
and
F * F * axis, followed by a rotation by about the -axis,
*
such that
*
for every pair . As illustrated in * followed by a rotation by about the -axis and note
Figure VII.1, a rotation is a rigid motion that preserves the that we get the same composite rotation if we switch
origin, and a translation is a rigid motion that preserves and . In other words, the map
difference vectors. Every rigid motion can be written as
SO
x1 x2
Quaternions. As an alternative to orthonormal 3-by-3
matrices, we may use quaternions to represent rotations.
Quaternions can be viewed as a generalization of complex
Figure VII.1: The translation of the boldface original coordinate numbers:
system preserves the directions of the axes while the rotation pre-
serves their anchor point.
I J K
*
I , J and K
the composition of a rotation and a translation:
*
.
are three different imaginary units. In preparation of an
Using matrix notation, we can write , where
operation that multiplies two quaternions, we specify how
is an orthonormal 3-by-3 matrix with unit determinant
to multiply the imaginary units:
and is a 3-vector:
* ,
I J K
*
,
I K J
* ,
J K I
K J I
The rotation matrix moves the unit coordinate vectors to
and that make up the columns of
Note that reversing two different
imaginary units changes
the sign of the result. If K is another
the vectors ,
I J
quaternion then the product of and is
. A rotation about a coordinate axis has a comparatively
*
simple rotation matrix. For example, rotating about the
-axis gives
I
2
J
K
The angle of rotation about a coordinate axis is referred to The product has a similar form but six of the terms have
as an Euler angle. Leonhard Euler proved that any rotation their signs changed. Sometimes it is more convenient to
think of a quaternion as a vector in . We can express cannot use simple multiplications to represent rotations
the product of two quaternions in terms of an orthogonal because the product of a unit quaternion and a purely
4-by-4 matrix and a vector. This can be done by expanding
either the first or the second quaternion to a matrix:
imaginary quaternion is not in general purely imaginary.
Instead, we use the composite product . Ob-
serve that
where
and
are the 4-by-4 matrices that correspond
to . We expand the product of the two matrices in Ta-
ble VII.1 and see that is purely imaginary. Furthermore,
since F F
, both and are orthonormal. It follows
that the lower right 3-by-3 submatrix of is also or-
thonormal. This 3-by-3 matrix is the familiar rotation ma-
to
for computing
with
can be rewritten as
F F
As usual, we can use the scalar product to define the length
of a vector:
. Similar to complex num-
unit length, we have , and . uct with the unit quaternion to , which shows that the
composite product preserves cross-products, as required.
Representing rotations. We use
quaternions to represent vectors in
purely imaginary
and compound mul- Axis and angle. The expansion of given in Table
F F
tiplication with unit quaternions to represent rotations. We
start with a few properties, always assuming
VII.1 provides an explicit method for computing the or-
because
. This implies in particular that multi-
I
J
K
F F
Table VII.1: Product of matrices in the representation of a rotation by composite multiplication with unit quaternions.
the direction of the rotation axis, and the real part determines the angle of the rotation. Note that $-q$ represents the same rotation as $q$ and that non-antipodal pairs of unit quaternions represent different rotations. In other words, the unit sphere $\mathbb{S}^3$ in $\mathbb{R}^4$ is a double cover of the space of rotations in $\mathbb{R}^3$. Figure VII.3 illustrates the correspondence with a picture in one lower dimension. The space obtained by identifying antipodal points of $\mathbb{S}^3$ is usually referred to as the real projective three-dimensional space, or $\mathbb{RP}^3$ for short. It is a good model of the set of rotations in $\mathbb{R}^3$, although we usually prefer $\mathbb{S}^3$ because it is easier to imagine.

Figure VII.2: The rotation of the vector $r$ by an angle of $\theta$ about the line spanned by $u$. The three dotted vectors correspond to the terms in the formula of Rodrigues.

Figure VII.3: The north- and south-poles correspond to the identity, and points on the equator correspond to rotations by $\pi$. The dashed great-circle through the two poles represents the set of rotations about a fixed axis.

To prove the claimed correspondence, we write the vector $r'$ obtained by rotating $r$ by $\theta$ about the axis defined by the unit vector $u$ using the formula of Rodrigues,

$$r' = \cos\theta \, r + (1 - \cos\theta) \langle r, u \rangle \, u + \sin\theta \, (u \times r),$$

which can be seen from Figure VII.2. We show that this can also be written in the form $r' = q r q^{-1}$, where $r$ is read as a purely imaginary quaternion, and $q$ is the unit quaternion as given above. Writing $q_0$ for the real part of $q$ and $\mathbf{q}$ for its imaginary part, read as a vector in $\mathbb{R}^3$, tedious but straightforward calculations show

$$q r q^{-1} = (q_0^2 - \|\mathbf{q}\|^2) \, r + 2 \langle \mathbf{q}, r \rangle \, \mathbf{q} + 2 q_0 \, (\mathbf{q} \times r).$$

If we substitute $q_0 = \cos\frac{\theta}{2}$ and $\mathbf{q} = \sin\frac{\theta}{2} \, u$ and use the identities

$$\cos\theta = \cos^2\tfrac{\theta}{2} - \sin^2\tfrac{\theta}{2} = 1 - 2\sin^2\tfrac{\theta}{2} \quad\text{and}\quad \sin\theta = 2 \sin\tfrac{\theta}{2} \cos\tfrac{\theta}{2}$$

then we obtain the formula of Rodrigues.

Composing rotations. The above relationships provide a convenient conversion between unit quaternions and axis-angle pairs. We have $\theta = 2 \arccos q_0$ and $u = \mathbf{q} / \|\mathbf{q}\|$. The composition of two rotations represented by the unit quaternions $q$ and $q'$ is

$$q' (q r q^{-1}) q'^{-1} = (q' q) \, r \, (q' q)^{-1}.$$

Thus, composition of rotations corresponds to multiplication of quaternions, and from the product it is easy to again get the axis and the angle. A more direct geometric description of the composition of two rotations uses the fact that every rotation can be written as the composition of two reflections, as illustrated in Figure VII.4. The two planes defining the reflections are not unique; they just need to pass through the axis of rotation and enclose half the angle of rotation. To compose two rotations, we write each as the composition of two reflections, making sure that the second plane of the first rotation is also the first plane of the second rotation, as in Figure VII.4. The middle two reflections cancel and we are left with two reflections. The axis of the corresponding rotation is the line common to the two planes, and the angle of rotation is twice the angle enclosed by the planes.
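The correspondence between unit quaternions, axis-angle pairs and the formula of Rodrigues can be checked numerically. The following Python sketch is not part of the original text. It assumes that a quaternion a + bI + cJ + dK is stored as the tuple (a, b, c, d), and the function names qmul, from_axis_angle, rotate, rodrigues and to_axis_angle are chosen here for illustration only.

```python
import math

def qmul(p, q):
    """Product of two quaternions, each given as (a, b, c, d) = a + bI + cJ + dK."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def from_axis_angle(u, theta):
    """Unit quaternion for the rotation by theta about the (non-zero) axis u."""
    n = math.sqrt(sum(x * x for x in u))
    s = math.sin(theta / 2) / n
    return (math.cos(theta / 2), s * u[0], s * u[1], s * u[2])

def rotate(q, r):
    """Composite product q r q^-1, with r read as a purely imaginary quaternion."""
    a, b, c, d = q
    conj = (a, -b, -c, -d)          # equals the inverse because q has unit length
    _, x, y, z = qmul(qmul(q, (0.0, r[0], r[1], r[2])), conj)
    return (x, y, z)

def rodrigues(r, u, theta):
    """Formula of Rodrigues for rotating r by theta about the axis u."""
    n = math.sqrt(sum(x * x for x in u))
    u = tuple(x / n for x in u)
    dot = sum(r[k] * u[k] for k in range(3))
    cross = (u[1]*r[2] - u[2]*r[1], u[2]*r[0] - u[0]*r[2], u[0]*r[1] - u[1]*r[0])
    c, s = math.cos(theta), math.sin(theta)
    return tuple(c * r[k] + (1 - c) * dot * u[k] + s * cross[k] for k in range(3))

def to_axis_angle(q):
    """Axis and angle of the rotation represented by the unit quaternion q."""
    theta = 2 * math.acos(max(-1.0, min(1.0, q[0])))
    n = math.sqrt(q[1]**2 + q[2]**2 + q[3]**2)
    if n < 1e-12:                   # identity: the axis is arbitrary
        return (1.0, 0.0, 0.0), 0.0
    return (q[1] / n, q[2] / n, q[3] / n), theta
```

Composing two rotations then amounts to a single call, for example rotate(qmul(q2, q1), r), which applies the rotation of q1 first and that of q2 second.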
VII.2 Optimum Motion

In this section, we study an optimization problem that arises when one attempts to match two molecular structures or to fit two structures snug next to each other. After formulating the optimization problem, we solve it using quaternions representing rotations in three-dimensional space.

Problem specification. Suppose we are given two finite collections of points in $\mathbb{R}^3$ and a bijection between them. While entertaining the possibility that the two collections are structurally the same or at least similar, we are interested in moving one collection so it best matches the other. We write $A = \{a_1, \ldots, a_n\}$ and $B = \{b_1, \ldots, b_n\}$ for the two collections and assume that $a_i$ corresponds to $b_i$, for each $i$. We use the root mean square or RMS distance to assess how similar the two collections are. This measure is the square root of the average square distance:

$$\mathrm{RMSD}(A, B) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \|a_i - b_i\|^2}.$$

Note that minimizing the root mean square distance is equivalent to minimizing the sum of square distances. Recall also that every rigid motion can be decomposed into a rotation followed by a translation. The space of rigid motions is therefore six-dimensional, namely $SO(3) \times \mathbb{R}^3$, and it might seem that computing the particular rigid motion $\mu$ that minimizes $\mathrm{RMSD}(\mu(A), B)$ would be hopeless or at least difficult. Quite the opposite is true, and the main reason for this is the convenience provided by quadratic functions. We consider rotations and translations separately.

Optimum translation. Recall that the centroid of a collection of points is the average of the points. More formally, the centroids of $A$ and of $B$ are $\bar{a} = \frac{1}{n} \sum_i a_i$ and $\bar{b} = \frac{1}{n} \sum_i b_i$. We begin by showing that the best translation is the one that moves $\bar{a}$ to $\bar{b}$. In other words, the translation that minimizes the root mean square distance between $A + t$ and $B$ is defined by $t = \bar{b} - \bar{a}$. A crucial insight used in proving this fact is that the centroid is the only point for which the sum of the vectors to the points in the collection vanishes:

$$\sum_i (a_i - \bar{a}) = 0.$$

This implies that $\bar{a}$ minimizes the sum of square distances from the $a_i$. Indeed, $f(x) = \sum_i \|a_i - x\|^2$ is a quadratic function with a unique minimum. That minimum is characterized by a vanishing gradient:

$$\nabla f(x) = -2 \sum_i (a_i - x) = 0.$$

As mentioned earlier, the latter sum vanishes iff $x = \bar{a}$. For each $i$, we now translate $b_i$ to the origin of $\mathbb{R}^3$ and move the translated copy $a_i + t$ of $a_i$ with it to $a_i + t - b_i$. This operation is illustrated in Figure VII.5. Then the sum of square distances between the corresponding points, $\sum_i \|a_i + t - b_i\|^2$, is also the sum of square distances of the points $a_i + t - b_i$ from the origin. The translation minimizes the sum iff the origin is the centroid of the points $a_i + t - b_i$:

$$\frac{1}{n} \sum_i (a_i + t - b_i) = \bar{a} + t - \bar{b} = 0.$$

This implies that the best translation moves $\bar{a}$ to $\bar{b}$, as claimed.

Optimum rotation. Note that rotating and taking the centroid commute. In other words, the centroid of $\rho(A)$ is $\rho(\bar{a})$. Since every rigid motion can be written as a rotation followed by a translation, $\mu = \tau \circ \rho$, the motion can be optimal only if $\tau$ translates the centroid of $\rho(A)$ to the centroid of $B$. We may therefore simplify our problem by translating $A$ and independently translating $B$ such that
both centroids lie at the origin. Equivalently, we may assume $\bar{a} = \bar{b} = 0$. Using quaternions, we can express the rotation of a point $a_i$ as $q a_i q^{-1}$, where $q$ is a unit quaternion and $a_i$ is the pure imaginary quaternion that corresponds to $a_i \in \mathbb{R}^3$, as explained in Section VII.1. The sum of the square distances after the rotation is

$$\sum_i \|q a_i q^{-1} - b_i\|^2 = \sum_i \|q a_i q^{-1}\|^2 - 2 \sum_i \langle q a_i q^{-1}, b_i \rangle + \sum_i \|b_i\|^2.$$

The sums of the $\|q a_i q^{-1}\|^2 = \|a_i\|^2$ and the $\|b_i\|^2$ are not affected by the rotation, so minimizing the sum of square distances is equivalent to maximizing the sum of the $\langle q a_i q^{-1}, b_i \rangle$. Since multiplication with a unit quaternion preserves scalar products, we have $\langle q a_i q^{-1}, b_i \rangle = \langle q a_i, b_i q \rangle$. Recall from the previous section that the product of two quaternions can be written as a 4-by-4 matrix times a vector. Writing $a_i = (a_{i1}, a_{i2}, a_{i3})$, $b_i = (b_{i1}, b_{i2}, b_{i3})$ and $q = (q_0, q_1, q_2, q_3)^T$, we have $q a_i = A_i q$ and $b_i q = B_i q$, with

$$A_i = \begin{pmatrix} 0 & -a_{i1} & -a_{i2} & -a_{i3} \\ a_{i1} & 0 & a_{i3} & -a_{i2} \\ a_{i2} & -a_{i3} & 0 & a_{i1} \\ a_{i3} & a_{i2} & -a_{i1} & 0 \end{pmatrix}, \qquad B_i = \begin{pmatrix} 0 & -b_{i1} & -b_{i2} & -b_{i3} \\ b_{i1} & 0 & -b_{i3} & b_{i2} \\ b_{i2} & b_{i3} & 0 & -b_{i1} \\ b_{i3} & -b_{i2} & b_{i1} & 0 \end{pmatrix}.$$

The two matrices are skew symmetric and, up to scaling by $\|a_i\|$ and $\|b_i\|$, orthogonal. The sum that we have to maximize can now be rewritten as

$$\sum_i \langle A_i q, B_i q \rangle = q^T \Big( \sum_i A_i^T B_i \Big) q = q^T N q,$$

where $N = \sum_i A_i^T B_i$. Take a moment to verify that each matrix in this sum is symmetric. Since the sum of symmetric matrices is again symmetric, we have $N = N^T$.

Eigenvalues and -vectors. We can interpret $q^T N q$ geometrically as a quadratic function over four-dimensional Euclidean space. Short of being able to draw the graph of this function in $\mathbb{R}^5$, we illustrate the idea in Figure VII.6, which drops two of the dimensions. Our goal is to find a point $q \in \mathbb{S}^3$ for which the quadratic function gives a maximum. We can compute such a $q$ with a modest amount of linear algebra.

Figure VII.6: The plane represents $\mathbb{R}^4$, the partially dotted circle represents $\mathbb{S}^3$, the surface represents the graph of the quadratic function over $\mathbb{R}^4$, the dashed lines represent the zero-set and the boldface curve represents the graph of the restriction of that function to $\mathbb{S}^3$.

Recall that the eigenvalues of a square matrix $N$ are the values $\lambda$ for which the determinant of $N - \lambda I$ vanishes. The corresponding eigenvectors are the unit vectors $e$ such that $N e = \lambda e$. Letting $N$ be the 4-by-4 matrix constructed above, we have four eigenvalues, and because $N$ is symmetric, the eigenvalues are all real. It is convenient to order them as $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \lambda_4$. The corresponding eigenvectors $e_1, e_2, e_3, e_4$ are pairwise orthogonal and therefore span $\mathbb{R}^4$. We can thus write any quaternion as a linear combination of the eigenvectors, $q = \sum_j \alpha_j e_j$, and because we are only interested in unit quaternions, we have $\sum_j \alpha_j^2 = 1$. Hence

$$q^T N q = \sum_j \alpha_j^2 \lambda_j.$$

By the assumed ordering of the eigenvalues, we have $q^T N q \leq \lambda_1$, and this maximum is attained for $\alpha_1 = 1$ and $\alpha_2 = \alpha_3 = \alpha_4 = 0$. The corresponding quaternion is $q = e_1$. In other words, the optimum rotation is defined by the unit eigenvector that corresponds to the largest eigenvalue.

Without bijection. If there is no bijection specified between the two sets then the problem of finding the best rigid motion seems significantly more difficult. Assuming $A$ and $B$ contain $n$ points each, we could of course try all $n!$ bijections, but that would take a long time. A more effective algorithm alternates between improving the root mean square distance by changing the bijection and by changing the motion. Note that independent of the bijection, the best translation always moves the centroid of $A$ to the centroid of $B$. So we may again assume that both centroids are at the origin and restrict ourselves to rotations. We use three subroutines to describe the iterative algorithm. For a given rotation $\rho$, MATCH returns the permutation $\pi$ that minimizes the root mean square distance between $\rho(A)$ and $B$. Given a permutation, ROTATE returns the rotation that minimizes the mean square distance under this permutation. Finally, given a permutation and a rotation, RMSD returns the root mean square distance.

    $d = \infty$; $\rho$ = identity;
    loop $\pi$ = MATCH($\rho$); $\rho$ = ROTATE($\pi$);
      if RMSD($\pi$, $\rho$) $< d$ then $d$ = RMSD($\pi$, $\rho$)
        else exit
      endif
    forever.

A popular version of the above algorithm uses injections from $A$ to $B$ instead of bijections. Sometimes this change is motivated by the purpose of the computation, at other times by the fact that finding the best bijection is not entirely straightforward. Given a rotation $\rho$, we may use a subroutine ASSOCIATE, which determines for each $a \in A$ the point $b \in B$ closest to $\rho(a)$. In the algorithm, we replace MATCH by ASSOCIATE and do the remaining operations as before, except that $B$ is replaced by the multi-set of points in $B$ that are closest to some point in $\rho(A)$.

In computer vision, the version that works with injections rather than bijections is known as the iterated closest point or ICP algorithm [1].

[1] P. J. BESL AND N. D. MCKAY. A method for registration of 3-D shapes. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-14 (1992), 239–256.
[2] O. D. FAUGERAS AND M. HEBERT. The representation, recognition, and locating of 3-D objects. Int. J. Robotics Res. 5 (1986), 27–52.
[3] B. K. P. HORN. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[4] W. KABSCH. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallogr. Sect. A 34 (1978), 827–828.
[5] G. STRANG. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.
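The optimum rotation and the alternating algorithm of this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation behind the text. Points are 3-tuples, a quaternion is a tuple (q0, q1, q2, q3), the symmetric 4-by-4 matrix is filled in using the layout given by Horn [3], and a shifted power iteration stands in for a library eigen-solver (it may converge slowly when the two largest eigenvalues are close). All function names are hypothetical, and the loop uses the closest-point association rather than a true bijection, without re-centering the associated multi-set.

```python
import math

def center(P):
    """Translate the points so that their centroid lies at the origin."""
    n = len(P)
    c = [sum(p[k] for p in P) / n for k in range(3)]
    return [tuple(p[k] - c[k] for k in range(3)) for p in P]

def rotate(q, r):
    """Rotate the 3-vector r by the unit quaternion q."""
    q0, v = q[0], q[1:]
    dot = sum(v[k] * r[k] for k in range(3))
    cross = (v[1]*r[2] - v[2]*r[1], v[2]*r[0] - v[0]*r[2], v[0]*r[1] - v[1]*r[0])
    vv = sum(x * x for x in v)
    return tuple((q0*q0 - vv) * r[k] + 2*dot*v[k] + 2*q0*cross[k] for k in range(3))

def rmsd(A, B):
    """Root mean square distance between corresponding points."""
    total = sum(sum((a[k] - b[k])**2 for k in range(3)) for a, b in zip(A, B))
    return math.sqrt(total / len(A))

def quadratic_form_matrix(A, B):
    """Symmetric matrix N with q^T N q = sum over i of <q a_i q^-1, b_i>."""
    S = [[sum(a[j] * b[k] for a, b in zip(A, B)) for k in range(3)] for j in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    return [[sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
            [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
            [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
            [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz]]

def top_eigenvector(N, iters=500):
    """Unit eigenvector for the largest eigenvalue, by shifted power iteration."""
    shift = max(sum(abs(x) for x in row) for row in N) or 1.0  # makes N + shift*I PSD
    q = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iters):
        w = [sum(N[j][k] * q[k] for k in range(4)) + shift * q[j] for j in range(4)]
        n = math.sqrt(sum(x * x for x in w))
        q = [x / n for x in w]
    return tuple(q)

def optimal_rotation(A, B):
    """ROTATE: best rotation of A onto B for the given correspondence."""
    return top_eigenvector(quadratic_form_matrix(A, B))

def associate(A, B):
    """ASSOCIATE: for each point of A, the closest point of B."""
    return [min(B, key=lambda b: sum((a[k] - b[k])**2 for k in range(3))) for a in A]

def icp(A, B, max_rounds=50):
    """Alternate between association and rotation until the RMSD stops improving."""
    A, B = center(A), center(B)
    q, best = (1.0, 0.0, 0.0, 0.0), float("inf")
    for _ in range(max_rounds):
        targets = associate([rotate(q, a) for a in A], B)
        q_new = optimal_rotation(A, targets)
        d = rmsd([rotate(q_new, a) for a in A], targets)
        if d >= best:
            break
        q, best = q_new, d
    return q, best
```

With a known bijection, a single call optimal_rotation(center(A), center(B)) suffices. The loop in icp is only needed when the correspondence is unknown, and like the algorithm in the text it may stop at a local minimum.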
2
time on the more complicated case of rotations.
2
The size of a sphere. We prepare the discussion of sam-
pling rotations by measuring the unit 2-sphere and the unit
3-sphere. For embedded in , we sweep a plane nor-
*
mal to the -axis and compute the area by integrating in-
*
finitesimal slices. The perimeter of the circle in which the
plane cuts the sphere is
*
equal to
. Hence, *
, with the square radius which we get by substituting
. The total vol- *
*
# ume of the 3-sphere is therefore
. Note also that
* Archimedes’ theorem does not extend to the 3-sphere, at
* %
Area least not in the straightforward manner from sections be-
*
tween parallel plane to sections between parallel hyper-
planes.
The total area of the 2-sphere is therefore . But note Uniform sampling. Archimedes’ theorem can be used
that the derivation shows more, namely that the area of the to pick a point uniformly at random on . The method
slice between two parallel planes at a constant distance is may be viewed as picking a point on the enclosing cylinder
the same for all such planes, as long as both intersect the and projecting it back to the 2-sphere:
sphere. This fact has been known already to Archimedes
and is often expressed by saying that the axial projection Step 1. Pick uniformly at random in .
from the sphere to an enclosing cylinder preserves area. 4
Step 2. Pick uniformly at random in .
J4
2 2
This projection is illustrated in Figure VII.7.
.
*
Define
Return
.
*
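The two steps can be sketched in Python as follows. The sketch assumes the usual reading of Archimedes' theorem: the height along the axis is uniform in [-1, 1] and the angle around the axis is uniform in [0, 2π). The function name random_point_on_sphere and the optional rng parameter are introduced here for illustration and do not appear in the text.

```python
import math
import random

def random_point_on_sphere(rng=random):
    """Pick a point uniformly at random on the unit 2-sphere.

    A point is picked on the enclosing cylinder (height z, angle phi)
    and projected back to the sphere along the axial direction.
    """
    z = rng.uniform(-1.0, 1.0)              # Step 1: height along the axis
    phi = rng.uniform(0.0, 2.0 * math.pi)   # Step 2: angle around the axis
    r = math.sqrt(1.0 - z * z)              # radius of the circle at height z
    return (r * math.cos(phi), r * math.sin(phi), z)
```

Because slices of equal height have equal area, the fraction of samples with z > 0.5 should approach one quarter, which gives a simple empirical check of uniformity.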
We now extend this method to and thus to an algorithm
for picking a rotation uniformly at random. Think of as
the axis of rotation, so we just need to pick the angle of
rotation about this axis. It would not be correct to pick an
*
angle uniformly at random since this would favor small
dislocations of . Indeed, in the quaternions near the
identity would be more likely than those far away from the
identity. To pick the angle correctly, we return to what we
learned from the above volume computation. The angle of
rotation about the axis is twice the angular distance from
the identity on . In other words, . We
need to pick the angle from , not uniformly but
Figure VII.7: Illustration of Archimedes’ theorem implying that
from a density that favors angles near the middle of the
the sphere and the enclosing truncated cylinder have the same
interval. Specifically, the density is , normalized
area.
to have unit total integral. The corresponding distribution
/
function is
method to compute the volume of
We use the same
embedded in
*
. Sweeping
a three-dimensional hyper-
plane normal to the -axis, we get the volume by integrat-
*
2
*
which monotonically increases and reaches
*
at
. To pick
an angle, we pick a number uniformly
at random in , and we compute its preimage under
the distribution function:
. To get a point
uniformly at random on , we append Steps 1 and 2
above with
Figure VII.8: From left to right: the cube, the FCC and the BCC
lattices. The points with maximum distance to the lattice points
Step 3. Pick uniformly at random in .
are the cube centers, the edge centers and the midpoints between
Let .
2 the face and the edge centers.
2 .
Return
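The three steps together give a way to pick a rotation uniformly at random, sketched below in Python. The sketch makes explicit assumptions that are not confirmed by the surviving text: the angular distance φ from the identity on the 3-sphere ranges over [0, π], its density is proportional to sin²φ, the distribution function is therefore F(φ) = (φ − sin φ cos φ)/π, and the preimage is found by bisection. The rotation angle about the axis is 2φ. All function names are hypothetical.

```python
import math
import random

def distribution(phi):
    """F(phi) = (phi - sin(phi) cos(phi)) / pi, the integral of (2/pi) sin^2 on [0, phi]."""
    return (phi - math.sin(phi) * math.cos(phi)) / math.pi

def inverse_distribution(x, steps=60):
    """Preimage of x in [0, 1] under the monotone distribution function, by bisection."""
    lo, hi = 0.0, math.pi
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if distribution(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def random_unit_quaternion(rng=random):
    """Pick a point uniformly at random on the unit 3-sphere."""
    # Steps 1 and 2: axis of rotation, uniform on the 2-sphere.
    z = rng.uniform(-1.0, 1.0)
    psi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    u = (r * math.cos(psi), r * math.sin(psi), z)
    # Step 3: angular distance from the identity, density proportional to sin^2.
    phi = inverse_distribution(rng.random())
    s = math.sin(phi)
    return (math.cos(phi), s * u[0], s * u[1], s * u[2])
```

For a uniform distribution on the 3-sphere each squared coordinate has mean 1/4, so the average of the squared real parts over many samples offers a quick sanity check.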
we note that the FCC lattice has four times and the BCC the centroids of the two collections are both at the origin:
lattice has twice as many points as the cube lattice. The . This implies that the vectors add up to
*
*
packing radius is the largest radius we can assign to the 0 implying that the sum of scalar products with any vector
*
F * F obvious that the total distance increases the fastest when
each point moves in a direction straight away from .
This is possible in the limit and characterized by the veloc-
F
F FH*KF
ity vector of being parallel to , which includes
the possibility that . As for translations, the length
which implies *
F H*KF . We have * FH*KF if and
of the gradient is maximized if
for all . In this
only if
case, we have and the root mean square
distance between and the rotated copy of is
F F
square distance changes with varying translation vector,
we compute the the gradient:
A *
* * *
*
*
except at *
are the eigenvalues of the
everywhere
its length is F0A * F
where
if
. The length is 1 if and only matrix defined in the previous section. For the purpose
gradient and its length, we consider a
of computing the
by comparing the graphs obtained for equal and for non- function over :
equal corresponding points. Since the length of the gra-
A * * * *
F2A F
*
A * * *
mean square distances for two translations is bounded
from above by the norm of the difference vector:
*
FH*
F F0A F
In words, the root mean square as a function over the Since the length of the gradient never exceeds
, the
three-dimensional space of translations satisfies a Lips- difference between the root mean square distance for two
chitz condition with constant 1. rotations is no more than that multiple of the norm of the
difference vector:
Sensitivity to small rotations. We repeat the analysis
FH 'F
for rotations. Call the root mean square distance from the
F F
FF now depends on the collection of points, in particular to
their radii of gyration.
and
Let
* :* * :*
be a unit quaternion. The ef- Bibliographic notes. The problem of sampling motions
fect of the rotation represented by is best viewed in the has been studied in various fields, including statistics,
VII.4 Alignment creasing the length. We turn the recurrence relation into
an algorithm:
In this section, we briefly discuss the two problems of
LCS
match and fit for protein structures. We begin by studying integer
:
;
how to match proteins and develop an algorithm that mea-
to do
;
;
for
sures the similarity between two chains of atoms. There-
for to do
after, we consider the related problem of docking a protein
:
if then
with its substrate.
else
endif
Longest common subsequence. Consider first the com- endfor
binatorial (as opposed to geometric) version of the se- endfor; return .
quence alignment problem. We model a protein as a
string over the alphabet of twenty amino acids: This algorithm is a typical example of the dynamic pro-
and . An alignment maps gramming paradigm, which constructs an optimal solution
the to the in sequence, but it permits spaces on both from pre-computed optimal solutions to sub-problems. To
sides. As illustrated in Table VII.3, we represent an align-
ment by a matrix consisting of two rows and store the solutions, the algorithm uses an array of en-
tries. Each entry takes constant time, which implies that
columns, where is the total number of spaces. A match the total running time is proportional to . Using a sec-
ond array of the same size, we may keep track of the deci-
Q R A A C C sions made by the algorithm, and with this extra informa-
A Q A C R C R tion, we can reconstruct the longest common subsequence
itself, and not just compute its length.
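The dynamic programming algorithm for the longest common subsequence can be written out as a short Python function. This is a sketch rather than a transcription of the pseudo-code above. The name lcs is chosen here, and instead of a second array of decisions the subsequence is recovered by walking back through the table of lengths, which is an equivalent but slightly different design.

```python
def lcs(x, y):
    """Length and one instance of the longest common subsequence of x and y."""
    m, n = len(x), len(y)
    # L[i][j] is the length of the longest common subsequence
    # of the prefixes x[:i] and y[:j]; row 0 and column 0 stay 0.
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1             # match
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])   # deletion or insertion
    # Walk back from the lower right corner to reconstruct the subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return L[m][n], "".join(reversed(out))
```

The table has (m+1)(n+1) entries and each takes constant time, so the running time is proportional to the product of the two string lengths. The number of insertions and deletions needed to transform x into y is then len(x) + len(y) minus twice the returned length.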
Table VII.3: The alignment uses spaces to achieve
matches.
Sequence alignment. The general alignment problem
is a column of two equal non-space characters, and a mis- permits mismatches and assesses the score by rewarding
match is a column with two different non-space charac-
ters. Columns with two spaces are disallowed. An inser-
*
each match and penalizing each mismatch, insertion and
deletion. Assuming gives the score for having *
tion is a column with a space at the top and a deletion is and in a single column, we get
a column with a space at the bottom. The common sub-
sequence between two strings consists of all matches, and
its length is the number of matches. For the moment, we
be the length of the longest common
restrict ourselves to alignments without mismatches. Let-
ting
subsequence,
We can think of every alignment as a directed path in the
is the minimum number
so-called edit graph of the two strings, which we illustrate
of insertions and deletions needed to transform to . in Figure VII.10. The path starts at the source in the upper
We compute by dynamic programming. Let be the
length of the longest common subsequence of
A Q A C R C R
and , and define for all and . Then
Q
:
if R
if
A deletion:
A insertion:
To verify the recurrence relation note that every alignment
ends with an insertion, a deletion or a match. In each case, C match:
removing the last column leaves an optimal alignment of C mismatch:
shorter strings. In the third case, we need to show that the
length of the common subsequence cannot increase if we
Figure VII.10: The edit graph for the strings in the above exam-
do not use the match between and . Indeed, without ple and the path that corresponds to the given alignment.
using that match we end with an insertion or a deletion,
and we may move the last match to the end without de- left corner, takes vertical, horizontal and diagonal edges
. This construction is
the alignment is a sequence of contiguous insertions or illustrated in Figure VII.11. Let be the motion that
and is then
of contiguous deletions. It is common to penalize a gap maximizes . The score of the best alignment between
separately for its existence and an additional amount that , and the best alignment is
for which
4
depends on its length. This may be done by penalizing an .
an amount
4
insertion or deletion an amount when it starts a gap and
when it continues a gap. This gives Γ
rise to the following recurrence relations:
:
4
:
4
an insertion and
is the score of the best alignment that
Figure VII.11: The horizontal axis represents the six-dimension-
ends with a deletion. Using three arrays, we can again al space of rigid motions. The upper envelope of the graphs is
compute the best alignment with dynamic programming the motion-wise maximum of the score functions.
in time proportional to .
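The three-array computation can be sketched as follows. This is a minimal illustration and not the exact recurrence of the text: the names `gap_open` and `gap_extend` stand for the two penalty amounts, and the simple match and mismatch scores are assumptions made only for the example. The arrays `S`, `I` and `D` hold the best scores of alignments ending with a match or mismatch, with an insertion, and with a deletion.

```python
NEG_INF = float("-inf")

def affine_gap_score(x, y, match=1.0, mismatch=-1.0, gap_open=2.0, gap_extend=0.5):
    """Best global alignment score of x and y with affine gap penalties.

    Assumed scoring (hypothetical, for illustration): a gap of length k
    costs gap_open + (k - 1) * gap_extend.
    """
    m, n = len(x), len(y)
    # S: last column is a match or mismatch; I: insertion; D: deletion.
    S = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    I = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    D = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    S[0][0] = 0.0
    for j in range(1, n + 1):          # leading gap made of insertions
        I[0][j] = -gap_open - (j - 1) * gap_extend
    for i in range(1, m + 1):          # leading gap made of deletions
        D[i][0] = -gap_open - (i - 1) * gap_extend
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1], I[i - 1][j - 1], D[i - 1][j - 1]) + s
            # An insertion either starts a new gap or continues one.
            I[i][j] = max(S[i][j - 1] - gap_open,
                          D[i][j - 1] - gap_open,
                          I[i][j - 1] - gap_extend)
            # A deletion either starts a new gap or continues one.
            D[i][j] = max(S[i - 1][j] - gap_open,
                          I[i - 1][j] - gap_open,
                          D[i - 1][j] - gap_extend)
    return max(S[m][n], I[m][n], D[m][n])
```

Each of the three arrays has one entry per pair of prefixes and every entry is computed in constant time, so the running time is proportional to the product of the two string lengths.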
The idea of the algorithm is to sample the space of mo-
Chains of atoms. We can use the same algorithmic tions dense enough to guarantee an alignment with a score
at least , for some . We thus aim at computing
ideas to compute alignments between two sequences of
atoms. Let the and the be the centers of the -carbon an approximately best alignment, but we may decrease
and thus get arbitrarily close to the optimum. This strategy
atoms along the backbones of proteins. For now, we
two
assume a fixed embedding in and consider the align- makes sense in practice since in any case the locations of
ment problem without applying any rigid motion. Using atoms are only known up to some precision.
the root mean square distance between two sub-chains is
problematic for two reasons. First, it does not lend it- Running time. Improving the approximation by de-
self to the dynamic programming algorithm and, second, creasing comes with a cost, namely higher running time
it prefers shorter over longer sub-chains. Instead, we need
a score function that balances the contributions of length
because we evaluate for more rigid motions. We quan-
tify the dependence by analyzing the running time depend-
and distance. One such function is obtained by combining
ing on . The other parameters entering the analysis are
square distances with gap penalties as follows. Letting
and , the radii of the small-
the lengths of the chains,
and be positive constants, we reward a match between
and by adding
est spheres enclosing and and the radii of gyration of
.
score between
by a constant , and hence F2A * F
and . Consider the function is bounded
defined as the motion-wise maximum
VII.4 Alignment 115
We cover the space of translations with balls of radius . It follows that having a translation that is not quite the optimum contributes at most to the error. The sensitivity of to small rotations depends on the radii of gyration, and we get

.

By covering the space of rotations with balls of radius , we get again a contribution of at most to the error.

By assumption on the shape of the protein, the volume of translations we need to cover is proportional to , and the volume of the rotations is . In each case, we need a constant times balls. We cover the space of rigid motions by cross-products of these balls and thus get a constant times rigid motions. Multiplying this with the running time of the dynamic programming algorithm gives a total running time of . This is of course not practical and we need faster alternatives, some of which will be mentioned at the end of this section.

tion is how well a protein and its substrate fit each other. The substrate could be another protein or a small ligand. We interpret this question as asking how similar the substrate is to a portion of the complement of the protein. This question makes sense if we use space-filling representations of the protein and the ligand, but not if we represent them combinatorially or as chains of points in space. This idea is illustrated in Figure VII.12.

Figure VII.12: The shaded local complement of the left shape is similar to the shaded portion of the right shape.

For protein-protein interactions, the region of local complementarity is frequently fairly large. The geometric fit between the two proteins thus becomes a significant factor in making the interaction possible or, more accurately, in not making that interaction impossible. Instead of protein docking, we consider the simpler re-docking problem. Here we are given the complexed form of a protein and its substrate and we attempt to reconstruct that form while suppressing any knowledge of the solution. We need some notation to lay out the rules for this problem. Let and represent the protein and the substrate in complexed form, and let be the protein after applying a random rigid motion. The input to the reconstruction algorithm consists of and and not knowing the solution means we cannot use any information on and on . The goal is to find a rigid motion such that and fit well. After is computed, we can test how well we did by comparing with , which can be done directly or by computing the root mean square distances between and and between and .

We cannot use the root mean square distance to guide our reconstruction of the complexed form and thus need a score function that assesses how well a motion does in generating a good fit. There are many possibilities, and one is the approximation of the van der Waals potential by counting the pairs of spheres at small distance from each other. We think of the and as the centers and write for the van der Waals radii of the spheres in and

where is a small positive constant. As mentioned in Section I.4, the van der Waals force is weakly attractive within small distances of maybe up to four Angstrom, and it is strongly repulsive for colliding van der Waals spheres. We thus define

if
if

Given a rigid motion , we compute by comparing all pairs of spheres in time proportional to . Improvements of the running time are possible. Experiments show that this score function is a good indicator of good fit, but one weakness is its sensitivity to collisions. Actual proteins are flexible and can avoid minor collisions by small deformations. We may account for this fact by allowing a few collisions in the definition of , but to get a good approximation of reality, we will need to build knowledge about flexibility into the score function.

Analysis. The general algorithm for re-docking is similar to the one for geometric alignment: we explore the space of rigid motions and evaluate the score function at the centers of the balls used to cover the space. By choosing the balls in the cover small enough, we can guarantee that the root mean square distances between and and between and are less than some
threshold . Note that this does not necessarily imply that is large. Indeed, it could be zero because motions with high score value tend to be right next to motions that generate collisions. In other words, whether or not the algorithm recognizes as close to depends on the shape of in this neighborhood. We can design cases in which has arbitrarily narrow high spikes and our algorithm has little chance to ever recover the complexed form. There is, however, experimental evidence that such configurations either do not exist or are rare for actual proteins.

Let us return to the question of how to cover the space of motions to guarantee a root mean square distance of at most . As before, we simplify the analysis by setting and assuming that the radii of the smallest enclosing spheres and the radii of gyration are all roughly equal to . According to the sensitivity analysis in the previous section, we may cover the space of translations with balls of radius and the space of rotations with balls of radius , where is the radius of gyration of either or . For the translations, we need to cover a volume of about requiring about balls. For the rotations, we need to cover a constant volume also requiring about balls. The total number of rigid motions to be explored is thus proportional to , and multiplying this with quadratic running time for evaluating the score function , we get a total running time proportional to . An improvement by a factor is possible if we compute for all translations composed with a single rotation in one sweep. For constant , this improves the running time to roughly . Since is typically in the thousands, even this is not practical and we need faster alternatives.

gested in [9]. It should be mentioned that the presented algorithm is significantly slower than the currently most commonly used DALI software [5], but it is the only algorithm that guarantees a good approximation of the optimal alignment in polynomial time.

The goal of protein docking is the prediction of whether, where and how proteins interact with each other and with other molecules. In many cases, the surface area of the interface during the interaction is substantial, and in these cases the geometric fit is an important factor. However, there are cases with smaller interaction area in which forces unrelated to geometric shape outweigh the importance of shape [2]. We refer to [4] for a recent survey of the extensive literature on computational approaches to protein docking. The material in this section is based on the work described in [1].

[1] S. Bespamyatnikh, V. Choi, H. Edelsbrunner and J. Rudolph. Protein docking by exhaustive search. Manuscript, Duke Univ., Durham, North Carolina, 2003.
[2] A. H. Elcock, D. Sept and J. A. McCammon. Computer simulation of protein-protein interactions. J. Phys. Chem. B 105 (2001), 1504–1518.
[3] D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge Univ. Press, England, 1997.
[4] I. Halperin, B. Mao, H. Wolfson and R. Nussinov. Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins 47 (2002), 409–443.
[5] L. Holm and C. Sander. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233 (1993), 123–138.
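The score function of the re-docking discussion, which counts pairs of spheres at small distance and reacts strongly to collisions, can be illustrated with a short sketch. The exact definition is not reproduced here, so the details are assumptions: `lam` plays the role of the small positive constant, a pair collides if the two spheres overlap by more than `tolerance`, and any collision sets the score to zero.

```python
import math

def docking_score(protein, substrate, lam=1.0, tolerance=0.0):
    """Count close sphere pairs; return 0 if any pair collides.

    protein, substrate: lists of (x, y, z, radius) tuples (hypothetical format).
    lam: assumed width of the shell in which a pair counts as close.
    tolerance: assumed amount of overlap that is still accepted.
    """
    near_pairs = 0
    for ax, ay, az, ar in protein:
        for bx, by, bz, br in substrate:
            dist = math.dist((ax, ay, az), (bx, by, bz))
            gap = dist - (ar + br)      # negative if the spheres overlap
            if gap < -tolerance:
                return 0                # collision: strongly repulsive
            if gap <= lam:
                near_pairs += 1         # weakly attractive contact
    return near_pairs
```

The double loop compares all pairs of spheres, which matches the quadratic evaluation time mentioned in the analysis. Raising `tolerance` is one simple way to allow minor collisions, although it does not model flexibility.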
*
1. Reflections. The reflection through a plane maps (i) Show that the minimum of two numbers picked
*
every point to the point such that
* *
by Function U NIFORM is distributed according
*
crosses the line segment orthogonally at its mid- to the triangle density function .
J*
point. The central reflection maps every point to its
(ii) How are the minimum, the median and the
antipodal point .
maximum of three numbers picked by Function
(i) Show that every rigid motion is the composition U NIFORM distributed?
of two plane reflections.
(ii) How many plane reflections do you need to rep-
6. Sampling the 3-sphere. Prove that the following
method picks a point uniformly at random on :
resent the central reflection?
2. Sizes of spheres. The -dimensional unit
(i) Pick numbers
*
and
uniformly at ran-
*
dom in .
sphere consists of all points at unit distance from the
(ii) If or
*
then repeat Step
origin of the -dimensional Euclidean space:
* FH*KF
1, else let
*
return
.
and
from a point ,
FH*
F
3. Square distance from planes. The square distance
* * that the uniform density of quater-
2-sphere. Prove
*
nions over implies the uniform density of points
, is also the sum of square distances from over the 2-sphere.
the three planes parallel to the coordinate planes that
pass through . 8. Number of alignments. Recall that an alignment be-
tween two chains of and -carbon atoms that
uses spaces can be represented by a matrix with
(i) Show that the above claim holds for any three
columns. Assuming
planes that pass through and pairwise enclose
a right angle.
two rows and
, we define
and note that we
need insertions just to make up for the difference
FH* F
(ii) Are there triplets of planes enclosing non-right
is equal to the sum in length. The remaining spaces are distributed over
*
angles for which
of square distances from to the three planes? equally many insertions and deletions, so we define
.
4. Sum of square distances. Consider a collection of
points in and let be its centroid. (i) Show that is a necessary and suffi-
(i) Prove that for every point in space, the root * cient condition for the number of spaces in any
alignment of the two chains.
mean square distance to the is the root of the (ii) What is the number of different alignments with
square distance to the centroid plus a constant:
F* $ F
a fixed number of spaces?
*
(iii) What is the total number of different align-
ments?
What exactly is the constant?
(ii) Extend the construction to a collection of
planes in . In other words, prove that there
are three planes for which a similar formula
gives the sum of square distances to the
planes.
(iii) Further extend the construction to a collection
of lines in .
Chapter VIII
Deformation
120 VIII D EFORMATION
[Dynamic Delaunay triangulations [3]. Linear motion in
instead of .]
[Predict collisions of spheres.]
Bibliographic notes.
VIII.3 Rigidity
[Discuss the pebble algorithm that analyzes the rigidity of
a graph in three dimensions.]
Bibliographic notes.
VIII.4 Shape Space 123
VIII.4 Shape Space
used to mix
Figure VIII.1: Ten snapshots of a deformation with skin and dual complex displayed. The skin in the fifth snapshot is the same as in the
figures above.
Figure VIII.2: From left to right and top to bottom: the shapes at times . The sequence is defined by a set of seven
spheres forming a question mark at time and a set of eight spheres forming a human-like figure at time .
Exercises
The credit assignment reflects a subjective assessment of
difficulty. Every question can be answered using the ma-
terial presented in this chapter.
1. Section of triangulation. (2 credits). Let be a
triangulation of a set of points in the plane. Let
be a line that avoids all points. Prove that intersects
at most edges of and that this upper bound
is tight for every
.
Chapter IX
Measures
Our general approach to measuring the size begins with
indicator functions for convex polyhedra in . From
these we will derive short inclusion-exclusion formulas for
size measurements.
128 IX M EASURES
IX.1 Indicator Functions

The Euler relation for convex polyhedra is a special case of the Euler-Poincaré theorem for complexes. There are elementary proofs for this special case, and this section presents one that is inductive.

Convex polyhedra. A convex polyhedron is the intersection of finitely many closed half-spaces. It is either bounded or unbounded, and both cases are illustrated in Figure IX.1. In the first case, the polyhedron is the convex hull of finitely many points, and in the second, it extends to infinity. We study polyhedra in $d$-dimensional space, keeping in mind that $d = 4$ is the most important dimension since polyhedra in $\mathbb{R}^4$ relate to molecules in $\mathbb{R}^3$, as we will see later.

Figure IX.1: A bounded convex polyhedron in to the left and an unbounded one to the right.

faces of various dimensions, which are usually prefixed for clarity. For example, is a -face of itself and the facets are the -faces. Let be the number of -faces. The Euler characteristic of is the alternating sum of faces,

provided
if is bounded
if is unbounded

Below we will construct indicator functions of from Euler characteristics of subcomplexes of the boundary complex. The Euler relation will follow from elementary proofs of properties of these indicator functions.

Inclusion-exclusion. Let be the finite collection of half-spaces such that . For a subset and a point we define

if
otherwise.

Note that is outside iff for at least one non-zero subset . Namely if then it sees a facet from the outside and we have for the singleton set containing the half-space whose bounding hyperplane contains that facet.

We form an alternating sum of the that leads to an indicator function for the convex polyhedron. The straightforward way of doing this is called the principle of inclusion-exclusion. Particularly, we define

. (IX.1)

of subsets of . This sum is
For we get
, which we consider an im-
ones crossing the hyperplane shared by and , and the
proper face but still a face of . It is convenient to assume
, where
ones contained in . The corresponding systems form the
- -
general position, which in this context means that there partition
are no two subsets of
that define the same face.
Let
be the system
of subsets that define non-empty faces. For sets
there is an intuitive interpretation of . Consider
and
Note that
. The faces of are defined by sets in
. Notice that according to this definition,
the faces on
the silhouette are not visible. Then iff is , , , and the faces of are defined by sets in ,
visible from . The restriction of the inclusion-exclusion
, , where
formula (IX.1) to the system is
.
(IX.2)
We claim that even though
is much shorter than
, it The introduced systems partition , , and . We can
therefore write their values as sums of values of the
is still an indicator function of . This claim is sufficiently
subsystems,
important to warrant a complete proof.
P IE T HEOREM A.
if
if
and hence . We
argue that all three terms on the right side of the equation
P ROOF. We use induction over the cardinality of the set
for vanish. Both
and
have one less half-space
, which is again defined as the collection of half-spaces
not containing than
does. The induction hypothesis
that do not contain . The basis of the induc-
tion is covered by
, in which case and
thus applies. By assumption,
iff
, which implies that
and therefore
.
, as required. Assume , let
The second term vanishes because all sets in con-
, and define as the closed complement of , which
iff
tain . The third term vanishes because
. We have
is a half-space that contains . Define sets of half-spaces
and
. The correspond-
and
. The
for all
ing systems are
convex polyhedron
is obtained by remov-
. Therefore
cancel pairwise.
because the values
ing the constraint , and therefore
, as shown in Figure IX.2. We distinguish
, where
Unbounded convex polyhedra. The Pie Theorem A
implies the Euler relation for unbounded polyhedra. To
_ see this, we fix a point outside all half-spaces in
, as in
g
Figure IX.3, and rewrite the formula in the Pie Theorem
P ’’ g
P
y
Figure IX.2: The half-spaces and share the hyperplane and are complementary to each other. The union of and is .
Figure IX.3: The point lies in the intersection of the complements of the half-spaces.
D
position, is the number of sets with
cardinality Bounded convex polyhedra. We return to the compu-
. By the choice of , we have
and therefore
for all
tation of the Euler characteristic, this time for a bounded
convex polyhedron . We choose a line not parallel to any
polyhedra,
. This implies the Euler relation for unbounded convex
.
face of and points and sufficiently far in opposite di-
rections on the line. As illustrated in Figure IX.5, this par-
fine
let be the corresponding
and
sum of values. We show
that for points , is an indicator function for .
Figure IX.5: The boundary of is dotted, that of is solid, and the silhouette is indicated by the two hollow vertices.
if
the silhouette as seen in a view parallel to the chosen line.
Let be the number of -faces of that have non-
empty intersection with the interior of , and define
P ROOF. We construct
that contains
and
a convex polyhedron
approximates
in the sense that symmetrically. Let be the number of -faces in the sil-
, as in Figure IX.4. Define
houette. The projection of the silhouette onto a hyperplane
normal to the line is a bounded convex polyhedron
of dimension . We can now argue inductively that the
Euler characteristic of is
.
P
For , is a closed interval with
establishes the induction basis. For
we have
, which
A
PA
Figure IX.4: Three edges and one vertex of intersect the interior of , and the same edges and vertex intersect the interior of .
Observe that this sum counts the -face the same number of times on both sides. On the right side it is counted times, same as on the left side. We get
and use the Pie Theorem A to get
if
if
by the Pie Theorem B, using the respective other convex
By choice of
, every point
polyhedron as the restricting convex body . Further-
is contained in all
half-spaces of
. Hence
if
.
more,
The system contains exactly all sets
. Hence for
for which
all points
and therefore also for all points .
the , , and implies
by induction hypothesis. Adding
the alternating
sums of
, as re-
quired.
set. We can therefore compute its volume by integration.
Consider for example a bounded
and a convex polyhedron
convex body
. Let
Figure IX.6: A pyramid cut out of a ball by three half-spaces.
=
where is the closed complement of the half-space . Assuming general position, the sets contain $d$ or fewer half-spaces each. For measuring molecules, we are mostly interested in the case $d = 4$, in which the volume is a sum of terms each involving four or fewer half-spaces.

Stereographic projection. We now turn to the problem of measuring the union of a finite set of balls in $\mathbb{R}^3$. We transform the question into one about half-spaces in $\mathbb{R}^4$. Let be the unit 3-sphere with center at the origin and identify $\mathbb{R}^3$ with the hyperplane . Call the north-pole of . The stereographic projection maps a point to the point collinear with and . The map is bijective and therefore has an inverse. If applied to all points of a ball in $\mathbb{R}^3$, we get a cap of , which is the intersection of the 3-sphere with a half-space . This is illustrated in Figure IX.7. The half-space lies on the side of its
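The stereographic projection and its inverse can be written down explicitly once coordinates are fixed. The following sketch assumes the unit 3-sphere centered at the origin, the north-pole at (0, 0, 0, 1), and three-dimensional space identified with the hyperplane of points whose fourth coordinate is zero. These coordinates and the function names are choices made for the illustration.

```python
def to_sphere(x):
    """Inverse stereographic projection: a point x of 3-space goes to the
    point of the unit 3-sphere on the line through x and the north-pole."""
    s = sum(c * c for c in x)                     # squared norm of x
    return tuple(2.0 * c / (s + 1.0) for c in x) + ((s - 1.0) / (s + 1.0),)

def to_space(p):
    """Stereographic projection: a point p of the 3-sphere, other than the
    north-pole, goes to the collinear point with fourth coordinate zero."""
    *head, w = p
    if w == 1.0:
        raise ValueError("the north-pole has no image")
    return tuple(c / (1.0 - w) for c in head)
```

For example, `to_sphere((0.0, 0.0, 0.0))` returns the south-pole `(0.0, 0.0, 0.0, -1.0)`, and applying `to_space` after `to_sphere` returns the original point up to rounding.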
. Let be the 4-ball bounded by
and
the system of subsets of
that appears in the
complex of and do inclusion-exclusion with a term for
every simplex in the dual complex. This is illustrated in
Pie Theorem B. The volume of the portion of outside Figure IX.8.
the polyhedron is
.
.
.
. =
Figure IX.8: The area of the union is the sum of eight disk areas minus the sum of nine pairwise intersection areas, plus the sum of two triple-wise intersection areas.

We could now get a formula for by scaling the volume by the distortion factor of . A more straightforward derivation of a formula for the ball union translates the inclusion-exclusion formula from to . Instead of the system of half-spaces we now use a system of balls obtained by substituting for . For convenience, we use the same notation, namely for the system of balls and for a generic set in .

Area and length. Similar to volume, we get a Pie Area Formula for the surface area of ,
.
. =
finite set of balls is tribution to the area. To prove this formula, we add the
contributions of individual spheres. For a single sphere,
we use the Pie Volume Formula on the set of caps defined
by intersecting balls. Since the caps are two-dimensional,
the volume formula becomes an area formula. Letting
Dual complex, revisited. We observe that the index sys- be the sphere and the set of caps, the area of
.
tem in the Pie Volume Formula is an abstraction of the
dual complex
of . Instead of proving this alge-
is the area of minus the alternating sum of the areas of
cap intersections,
, where
braically, we explain the connection in geometric pictures.
is the abstraction of the dual complex of . For each
set of caps in the system , we have the corresponding
E
Start with and embedded in as suggested in set of balls together with the ball of in the system of .
Figure IX.7. For each ball we get a half-space , By summing over all balls, we get the Pie Area Formula
hedron
and the intersection of the half-spaces is a convex poly-
, which contains the north-pole in its
given above.
interior. Use to project the boundary complex of to Similarly, we can get a Pie Length Formula that mea-
. This is the weighted Voronoi
belongs
diagram
of . A subset sures the total length of the circular arcs in the boundary
of the union of balls,
.
to iff its correspond-
ing
face of has non-empty intersection with the ball
bounded by . But this is also the condition for the
projection of to have non-empty intersection with the
in
interior of . Hence, a non-empty set of half-spaces is
iff the corresponding set of balls defines a simplex
The sets
cause
with one or no half-space are redundant be-
in these cases. The proof of the for-
in the dual complex. We have arrived at a simple inter- mula is similar to the one for area, except that the sum-
pretation of the Pie Volume Formula: construct the dual mation is done over all circles that are intersections of two
spheres forming a pair in . For each such circle, we ap-
ply the (one-dimensional) Pie Volume Formula and thus
get an expression whose terms correspond to the simplices
in the star of the pair.
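The one-dimensional case is easy to check numerically. As a simplified illustration, the sketch below replaces arcs on a circle by intervals on a line, which is an assumption made for the example. It also sums over all non-empty subsets, that is, it uses the full principle of inclusion-exclusion rather than the short formula indexed by the dual complex. The result is compared with a direct sweep over the sorted intervals.

```python
from itertools import combinations

def union_length_pie(intervals):
    """Length of a union of intervals by inclusion-exclusion over all
    non-empty subsets (exponential time, for illustration only)."""
    total = 0.0
    for k in range(1, len(intervals) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for subset in combinations(intervals, k):
            lo = max(a for a, _ in subset)
            hi = min(b for _, b in subset)
            total += sign * max(0.0, hi - lo)   # length of the common intersection
    return total

def union_length_sweep(intervals):
    """Length of the same union computed by merging sorted intervals."""
    total, cur_lo, cur_hi = 0.0, None, None
    for a, b in sorted(intervals):
        if cur_hi is None or a > cur_hi:
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = a, b
        else:
            cur_hi = max(cur_hi, b)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total
```

For the intervals (0, 2), (1, 3), (2.5, 4) and (6, 7), both functions return 5.0. Only two of the pairwise terms and none of the higher terms are non-zero, which hints at why a much shorter index system suffices.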
We might even go one step further and consider the number of vertices of . The inclusion-exclusion formula suggests that this number is the alternating sum of vertex numbers of common intersections of balls. For each triple in we have a three-sided spindle with two vertices, and for each quadruple we have a rounded tetrahedron with four vertices. For two or fewer balls we have no vertices. It follows that in the generic case, the number of vertices of is twice the number of triangles minus four times the number of tetrahedra in the dual complex.
exclusion formulas that express the volume, surface area,
and arc length of a union of balls in . The new collec-
three circles decomposing into eight regions in the way
shown in Figure IX.10. Let , , and be the angles at the
tion leads to formulas for voids, which are bounded com-
ponents of the space outside the union.
c
used in any dimension . For example, the 0-sphere is a
4 4 I4 4 4 4
pair of points with possible subsets the empty set, a single
vertices , , and . The left drawing suggests that the area
point, or both points. The only zero-dimensional angles
4 4 4
of the triangle is ,
are therefore 0, , and 1, and we will see shortly that this
4
where we write for the area of the disk with center ,
convention makes perfect sense when we compute volume
for the area of the intersection of the disks with centers
I4 4 4 4
using angles.
and , and so on. If we change the meaning from area to
Consider for example a tetrahedron . For each face perimeter we get .
, we define the angle as the fraction of directions Both formulas hold whenever the three disks are indepen-
around along which we enter . Equivalently, is the dent, but the right drawing in Figure IX.10 indicates that
volume fraction of a sufficiently small ball centered at an there are cases where the formulas are not as obvious as to
interior point of that lies inside the tetrahedron. Figure the left.
IX.9 illustrates the definition. In we refer to the two-
We generalize the formulas for independent triangles to
independent tetrahedra. To simplify the notation, we drop
the distinction between abstract and geometric simplices.
Specifically, we let denote an independent set of four
balls and, at the same time, the tetrahedron spanned by the
four ball centers. We use similar conventions for triangles,
edges, and vertices.
I NDEPENDENT VOLUME F ORMULA . The volume of an
independent tetrahedron is
. =
Figure IX.9: The solid angle at a vertex, the dihedral angle at an
edge, and the zero-dimensional angle of a triangle.
The proof of the formula is somewhat technical and
dimensional angle at a vertex as a solid angle, and the omitted. Similar to the two-dimensional case, we get
one-dimensional angle at an edge as a dihedral angle. sums that evaluate to zero if we replace volume by area
The zero-dimensional angle of a triangle is always . For or length,
convenience, we also define the angles of the improper
faces of as and .
.
.
Independent triangles and tetrahedra. Recall that a
collection of three disks in is independent if for ev-
Angle weights. We derive a new volume formula for a the same formulas for area and length, except that the first
union of balls by combining the Pie Volume and the In- sum vanishes:
dependent Volume Formulas. We first make the Pie Vol-
.
ume Formula more complicated and then simplify by can-
.
celling terms. It is convenient to cover the portion of
outside the Delaunay triangulation with tetrahedra. This
can be done by adding four points viewed as degenerate
balls to the set . We start with the Pie Volume Formula,
=
.
=
Voids. As defined earlier, a void of a union of balls is a
bounded component of the complement space, .
Figure IX.11 illustrates the fact that every void of is
and decompose into the parts defined by the tetrahedra contained in a void of . From a point inside the void,
that contain as a face,
=
We need some notation to continue. Let denote the set
of tetrahedra in a simplicial complex . Furthermore, for
a subcomplex , let denote the collection of
pairs with and . With this notation
we can rewrite the Pie Volume Formula as
=
. =
where is the Delaunay triangulation of . For example
=
for a tetrahedron , the only coface in is , Figure IX.11: Each void in the union of disks is contained in a
the angle is , and the contributed term is , corresponding void of the dual complex.
as before. For triangles, edges, and vertices , the contri-
bution is split up into as many pieces as there are angles
around . Whenever is a tetrahedron in , we use the the union of balls looks a lot like from a point outside all
balls and voids. It is therefore not surprising that we can
Independent Volume Formula to make a substitution. This
rewrite the Angle-weighted Pie Volume Formula to get an
results in the new volume formula. We write for . expression for the volume of a void of . The cor-
responding void in is triangulated by a subset of the
A NGLE - WEIGHTED P IE VOLUME F ORMULA . The
vol- Delaunay triangulation. Strictly speaking, is not a tri-
ume of the union of a finite set of balls in is
angulation because it is not even a complex, missing the
simplices that bound the void in . The most straightfor-
.
ward translation of the angle-weighted formula suggests
we compute the volume of by first computing the vol-
=
ume of the corresponding void in and then subtracting
VOID VOLUME F ORMULA . The volume of a void of
The new formula suggests we compute volume in two
with dual set is
steps. First we compute the volume of the underlying
=
space of itself, and second we add the volume of the
fringe, . Observe that not all pieces con-
.
sidered in the second sum are subsets of the fringe; some =
might reach into the interior of . Nevertheless, the
second sum is exactly the volume of the fringe. We get
IX.3 Void Formulas 137
and the total arc
Similarly, we get formulas for the area
to radius
. The first complex in the sequence is
length of by substituting for in the corresponding
and the last is , hence
as required by (ii).
Define
formulas of : and note that the underlying space of
.
is the void in that corresponds to the void in .
are contained in and
By choice of , the balls in
.
thus cannot contribute to the union of balls in any other
way than covering , as required by (iii).
Bibliographic notes. The material of this section is
Proof of void volume formula. The main idea in the taken from [1], which also contains a proof of the -
proof is to cover the void with small balls and measure the dimensional version of the Independent Volume Formula.
The implementation of the formulas is part of the Alpha
difference between the new and the old union. Let be
the set of balls we add, and consider
, , Shapes software and their use in structural biology has
been described in [2]. The Angle-weighted Pie Volume
and . We require that
Formula is related to Gram’s angle sum formula, which
states that the alternating sum of angles in a bounded con-
(i)
be finite,
vex polyhedron always vanishes,
(ii)
(iii)
be a subcomplex of
.
,
.
faces
= =
Assuming these three conditions, we have . The Angle-weighted Pie Volume Formulas for the two unions are

In $\mathbb{R}^2$, this implies that the sum of angles at the vertices of a convex $n$-gon is $n/2$, for the edges, minus 1, for the $n$-gon. Expressed in radians, this is $(n-2)\pi$. In $\mathbb{R}^3$, the sum of angles at the vertices is no longer determined by the combinatorial structure of the polyhedron, but the sum of solid angles minus the sum of dihedral angles is. A treatment of Gram's angle sum formulas can be found in Grünbaum [3, chapter 14].
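The planar case of Gram's formula is easy to verify numerically. The sketch below measures angles as fractions of the full circle, so that every edge has angle 1/2 and the polygon itself has angle 1. The function names and the vertex-list input format are assumptions made for the illustration.

```python
import math

def vertex_angle_fractions(polygon):
    """Interior angles of a convex polygon as fractions of the full circle.

    polygon: list of (x, y) vertices in cyclic order (hypothetical format).
    """
    n = len(polygon)
    fractions = []
    for i in range(n):
        px, py = polygon[i - 1]
        qx, qy = polygon[i]
        rx, ry = polygon[(i + 1) % n]
        ux, uy = px - qx, py - qy            # direction to the previous vertex
        vx, vy = rx - qx, ry - qy            # direction to the next vertex
        cos_angle = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
        fractions.append(angle / (2.0 * math.pi))
    return fractions

def gram_alternating_sum(polygon):
    """Vertex angles minus edge angles (1/2 each) plus the polygon angle (1)."""
    n = len(polygon)
    return sum(vertex_angle_fractions(polygon)) - 0.5 * n + 1.0
```

For the unit square the four vertex fractions are 1/4 each, and the alternating sum 1 - 2 + 1 vanishes, in agreement with the $n/2 - 1$ count above.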
. =
[1] H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[2] H. Edelsbrunner, M. A. Facello, P. Fu and J. Liang. Measuring proteins and voids in proteins. In “Proc. 28th Ann. Hawaii Internat. Conf. System Sciences, 1995”, vol. V: Biotechnology Computing, 256–264.
[3] B. Grünbaum. Convex Polytopes. Wiley, Interscience, London, England, 1967.

The difference gives the Void Volume Formula.

Finally, we construct so that (i), (ii), and (iii) are satisfied. Assuming general position, there exists a positive with , where is obtained from by reducing every ball with radius to radius .
and have the same Voronoi diagrams and Delaunay
triangulations by the way we changed the radii, and they
have the same dual complexes by the choice of . Let
be a finite set of balls of radii with centers in the void
that covers . Let be the set of centers
and note that
the dual complex of is just together with
finitely many isolated vertices. Hence,
where the second containment follows because is
obtained from by growing every ball of radius
;
.
for to do
;
endfor.
The implementation of the Area and Length Formulas is similarly straightforward. The Angle-weighted Pie and Void Volume Formulas use the masterlist and in addition require a representation of the voids. We use a partition of the Delaunay tetrahedra into the dual complex and the various voids, , where is the set of tetrahedra in the unbounded component of the complement of the dual complex. We have voids, each represented by a linear list of tetrahedra. We compute the lists by maintaining a union-find data structure while scanning the masterlist from back to front.

    case . ADD ;
    case . let be the first and the second Delaunay tetrahedron that has as a face;
      UNION(FIND( ), FIND( ))
  endfor.

Figure IX.12: There are eight voids in the $\alpha$-complex of cdk2, for Å. Some of the voids have (open) dual sets that seem connected in the image but are not because of missing triangles.

we get as from alvis, we pick the middle of

                          vol    area   lgth   crns
  space-filling diagram   Vsf    Asf    Lsf    Csf
  voids                   Vtv    Atv    Ltv    Ctv
  outside fringe          Vof    Aof    Lof    Cof
  envelope                Ve     Ae     Le     Ce
  dual complex            Vsh

Table IX.1: Cumulative measurements made by the Volbl software.
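The back-to-front scan with a union-find structure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Alpha Shapes code. The masterlist is modeled as a list of dictionaries with the hypothetical keys `dim`, `id`, and `cofaces`. The hypothetical parameter `prefix_len` is the number of leading masterlist entries that belong to the dual complex. `OUTSIDE` is a made-up identifier for a dummy tetrahedron that stands in for the missing neighbor of a triangle on the boundary of the Delaunay triangulation.

```python
OUTSIDE = -1  # assumed id of a dummy tetrahedron representing the outside


class UnionFind:
    """Union-find over hashable items, with path compression."""

    def __init__(self):
        self.parent = {}

    def add(self, x):
        # Create a singleton set for x unless x is already known.
        self.parent.setdefault(x, x)

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every item on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def void_lists(masterlist, prefix_len):
    """Group the tetrahedra outside the dual complex into components.

    Returns (outside, voids): the tetrahedra connected to the dummy
    OUTSIDE tetrahedron, and one sorted list of tetrahedra per void.
    """
    uf = UnionFind()
    uf.add(OUTSIDE)
    # Scan from back to front, stopping where the dual complex begins.
    for simplex in reversed(masterlist[prefix_len:]):
        if simplex["dim"] == 3:
            uf.add(simplex["id"])
        elif simplex["dim"] == 2:
            # A missing triangle connects its two incident tetrahedra.
            # Both appear later in the masterlist, so both were added.
            first, second = simplex["cofaces"]
            uf.union(first, second)
    groups = {}
    for t in list(uf.parent):
        groups.setdefault(uf.find(t), []).append(t)
    outside = sorted(t for t in groups.pop(uf.find(OUTSIDE)) if t != OUTSIDE)
    voids = sorted(sorted(g) for g in groups.values())
    return outside, voids
```

The sketch relies on the masterlist listing every triangle before the tetrahedra that contain it, so that a backward scan meets each tetrahedron before any of its faces. Components that end up merged with `OUTSIDE` form the unbounded part, and the remaining components are the voids.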
one list. We fix this problem by adding a dummy tetrahedron to the system and setting whenever is a triangle on the boundary of the Delaunay triangulation. The following pseudo-code is a direct implementation of the Void Volume Formula of Section IX.3.

  ;
  forall tetrahedra do
    ;
    forall faces do
      if ; then
        .
      endif
    endfor
  endfor.

The implementation of the Void Area and Length Formulas is similarly straightforward.

Figure IX.13: The dual complex of the van der Waals diagram of cdk2.

Options. The software computes the volume, area, length, and also the number of vertices in the boundary, which we refer to as corners. It does this for the space-filling diagram, its voids, the outside fringe (defined as the portion of the unbounded component of the complement of the dual complex that is covered by the balls), and the envelope (defined as the space-filling diagram union all voids). Table IX.1 lists the main measurements made. As an example consider the van der Waals diagram of cdk2, whose dual complex is shown in Figure IX.13. In the checking option, the software computes all terms in Table IX.1 and prints a summary of the results. In the considered example, it reports that there are no voids and it prints the sizes of the space-filling diagram and the outside fringe as

  Vsf = 3.034036e+04    Vof = 2.962563e+04
  Asf = 3.100959e+04    Aof = 3.100959e+04
  Lsf = 1.915391e+04    Lof = 1.915391e+04
  Csf = 6388            Cof = 6388

Note that the volume of the space-filling diagram is only slightly higher than that of the outside fringe. The difference is the volume of the dual complex, which is apparently rather small. The surface area, total arc length, and number of corners are of course the same for both.

The software also checks a few linear combinations that should vanish provided the computations are correct. For example, the sum of volumes of the space-filling diagram and its voids should be equal to the volume of the envelope, which in turn should be equal to the sum of volumes of the dual complex, the voids in the dual complex, and the outside fringe. The specific relations checked by the software are

  Vsf + Vtv - Vtiv - Vsh - Vof = 0.0
  Asf - Atv - Aof = 0.0
  Lsf - Ltv - Lof = 0.0
  Csf - Ctv - Cof = 0
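The four relations can be evaluated mechanically. The following Python sketch is illustrative only and is not part of the Volbl software. The function names and the tolerance are assumptions. The cdk2 numbers are the ones printed above, and the void terms are set to zero because the software reports no voids. The value of `Vsh` does not appear in the printed summary, so the sketch derives it from the stated fact that the difference between `Vsf` and `Vof` is the volume of the dual complex, and it assumes `Vtiv` is zero as well.

```python
def relation_residuals(m):
    """Return the four linear combinations that should vanish."""
    return {
        "volume": m["Vsf"] + m["Vtv"] - m["Vtiv"] - m["Vsh"] - m["Vof"],
        "area": m["Asf"] - m["Atv"] - m["Aof"],
        "length": m["Lsf"] - m["Ltv"] - m["Lof"],
        "corners": m["Csf"] - m["Ctv"] - m["Cof"],
    }


def relations_hold(m, rel_tol=1e-6):
    """Check every residual against a tolerance scaled by the largest term."""
    scale = max(abs(v) for v in m.values()) or 1.0
    return all(abs(r) <= rel_tol * scale
               for r in relation_residuals(m).values())


# Measurements for cdk2 as printed in the text.
cdk2 = {
    "Vsf": 3.034036e+04, "Vof": 2.962563e+04,
    "Asf": 3.100959e+04, "Aof": 3.100959e+04,
    "Lsf": 1.915391e+04, "Lof": 1.915391e+04,
    "Csf": 6388, "Cof": 6388,
    # No voids are reported, so the void terms are zero.
    "Vtv": 0.0, "Atv": 0.0, "Ltv": 0.0, "Ctv": 0,
    # Assumption: no voids in the dual complex either.
    "Vtiv": 0.0,
}
# Assumption: Vsh is not printed, so derive it from Vsf - Vof.
cdk2["Vsh"] = cdk2["Vsf"] - cdk2["Vof"]

print(relation_residuals(cdk2))
print(relations_hold(cdk2))
```

Scaling the tolerance by the largest measurement keeps the test meaningful for floating-point sums while still catching an off-by-one error in the integer corner counts.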
contribution to the space-filling diagram with the sum of contributions to the voids and the outside fringe. It also checks whether the sum of contributions really adds up to the total area, and it does this for the space-filling diagram, the voids, and the outside fringe.

Area formula. All analytic formulas needed to measure the common intersection of up to four balls are straightforward, except possibly the area of the intersection of up to three caps. A formula for the area follows from the Gauss-Bonnet theorem in differential geometry, but we prefer to derive it with elementary means. The cap on a sphere consists of the portion inside the sphere . Equivalently, the distance between the two planes that cut from , , as illustrated in Figure IX.14. The area of the cap is then times the area of the sphere , which is .

are placed slightly outside the circles so that the areas of the -gons are exactly the areas of the caps. Let and be the angles in the two -gons. Assuming that and are rational, we can find infinitely many integers so that the two -gons share two vertices near the vertices of the bigon. We then have . The angles at the two shared vertices approach as goes to infinity. Furthermore, the -gon has vertices with angle and vertices with angle . To compute we recall that the area of the cap is . By construction, the area of the approximating -gon is the same, namely . Hence , and symmetrically . We plug the values

* * *

where

area of the intersection of two or three caps since the approximating spherical -gon is only a tool in the proof and not used in the formula.
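The statement that the area of a cap is a fixed fraction of the area of its sphere can be checked numerically. The sketch below is an illustration under stated assumptions. The symbols are not taken from the text: `r` is the radius of the sphere, `h` is the height of the cap (the distance between the two parallel planes that cut the cap from the sphere), and `d` is the distance between the centers of two intersecting spheres. It uses the classical fact that a cap of height `h` covers the fraction `h / (2 r)` of the sphere, which gives the area `2 * pi * r * h`.

```python
import math


def cap_height(r1, r2, d):
    """Height of the cap of sphere 1 that lies inside sphere 2.

    r1, r2 are the radii and d is the distance between the centers.
    The spheres must intersect in a circle.
    """
    if not abs(r1 - r2) < d < r1 + r2:
        raise ValueError("the spheres do not intersect in a circle")
    # Signed distance from the center of sphere 1 to the plane that
    # contains the circle of intersection.
    a = (d * d + r1 * r1 - r2 * r2) / (2.0 * d)
    return r1 - a


def cap_area(r, h):
    """Area of a cap of height h on a sphere of radius r."""
    sphere_area = 4.0 * math.pi * r * r
    return (h / (2.0 * r)) * sphere_area
```

For two unit spheres whose centers are one unit apart, `cap_height(1.0, 1.0, 1.0)` gives a height of one half, so each sphere contributes a cap that covers a quarter of its surface.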
Exercises
The credit assignment reflects a subjective assessment of
difficulty. Every question can be answered using the ma-
terial presented in this chapter.
1. Section of triangulation. (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most $2n-4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.
Chapter X
Derivatives
Exercises
The credit assignment reflects a subjective assessment of
difficulty. Every question can be answered using the ma-
terial presented in this chapter.
1. Section of triangulation. (2 credits). Let $K$ be a triangulation of a set of $n$ points in the plane. Let $\ell$ be a line that avoids all points. Prove that $\ell$ intersects at most $2n-4$ edges of $K$ and that this upper bound is tight for every $n \geq 3$.
Subject Index
Sasisekharan, V., 8
Schütte, K., 113
Scheraga, H. A., 109
Schey, H. M., 66
Schikore, D. R., 77
Schläfli, L., 99
Schneider, R., 117
Schulten, K., 38
Seidel, R., 22
Seifert, H., 46, 62
Shah, N. R., 38
Sharir, M., 113
Sherwood, E. R., 4
Shindyalov, I. N., 26
Smale, S., 66
Steenrod, N., 54
Stern, C., 4
Storjohann, A., 58
Strang, G., 62
Stryer, L., 8
Sturmfels, B., 19
Sullivan, J., 34, 38
Taylor, R., 11
Theodorou, D. N., 109
Threlfall, W., 46, 62
Thurston, W. P., 102
Tirado-Rives, J., 11
Tsai, J., 11