
INTRODUCTION
TO BIO-GEOMETRY
Herbert Edelsbrunner
Departments of Computer Science and Mathematics
Duke University
Table of Contents

Prologue
I Bio-molecules
II Geometric Models
III Surface Meshing
IV Connectivity
V Shape Features
VI Density Maps
VII Match and Fit
VIII Deformation
IX Measures
X Derivatives
Subject Index
Author Index

Preface

[Mention the pioneers who early on recognized the importance of geometry in structural molecular biology: Fred Richards, Michael Levitt, Michael Connolly]

[Mention that my book on the “Geometry and Topology for Mesh Generation” is complementary/a prerequisite to this book. In particular, it covers the construction of Delaunay triangulations in detail, and it describes the simulation of simplicity as a general idea to deal with non-generic situations.]

[This book is really about alpha shapes in a broad sense. It might be useful to describe the history of that research in short.

1981. Vancouver. Conception of idea with Kirkpatrick and Seidel.
1985-89. Graz and Urbana. SoS, Delaunay software, Alpha Shape software with Ernst Mücke, Harald Rosenberger, and Patrick Moran.
1990-93. Urbana and Berlin. Surface triangulations, Betti numbers, inclusion-exclusion, CAVE with Ping Fu, Ernst Mücke, Cecil Delfinado, Nataraj Akkiraju, and Jiang Qian.
1994-95. Hong Kong. Morphing, molecular skin, with Ping Fu, Siu-Wing Cheng, Ka-Po Lam, and Ho-Lun Cheng.
1995-98. Urbana. Flow and pockets, skin surfaces with Ho-Lun Cheng, Tamal Dey, Michael Facello, Jie Liang, Shankar Subramaniam, Claire Woodworth.
1999-2001. Duke. Skin triangulation, hierarchy, Morse complexes with Ho-Lun Cheng, Alper Üngör, Afra Zomorodian, David Letscher, John Harer, Vijay Natarajan.
2002-2003. Duke and Livermore. Docking, Reeb graphs, Jacobian manifolds with Johannes Rudolph, Sergei Bespamyatnikh, Vicky Choi, John Harer, Valerio Pascucci, Vijay Natarajan, Ajith Mascarenhas.
2000-2005. ITR Project. Derivatives, interfaces, software with Robert Bryant, Patrice Koehl, Michael Levitt, Andrew Ban, Johannes Rudolph, Lutz Kettner, Rachel Brady, and Daniel Filip.
]

[This book is based on notes developed during teaching the courses on “Sphere Geometry” in the Spring of 2000, and on “Bio-geometric Modeling” in the Spring of 2001 and the Fall of 2002, all at Duke University. These courses were either taken for credit or audited at least occasionally by Luis von Ahn, Tammy Bailey, Yih-En (Andrew) Ban, Robert Bryant, Ho-Lun Cheng, Vicky Choi, Anne Collins, Abhijit Guria, Tingting Jiang, Looren Looger, Ajith Mascarenhas, Gopi Meenakshisundaram, Nabil Mustafa, Vijay Natarajan, Xiuwen Ouyang, Anindya Patthak, Ken Roberts, Apratim Roy, Scott Schmidler, Xiaobai Sun, Yusu Wang, Shumin Wu, Alper Üngör, Peng Yin and Afra Zomorodian.]

Herbert Edelsbrunner
Durham, North Carolina, 2002

To do or think about (March 15, 2004).

General
    Fix the software for creating the index and glossary.
    Should the Exercise sections be labeled so the page heading is more uniform?
Chapter III
    Section III.3: mention new results on scheduling.
    Exercises: add a few more questions.
Chapter V
    Should Section V.2 on Topological Persistence be reorganized by first presenting the algebra and second the algorithm?
    In Section V.3: replace 23- by 03-, 13- and 23-collapses.
    Add the interface software description to Section V.4.
Chapter VI
    Write Section VI.3 on Construction and Simplification.
    Write Section VI.4 on Simultaneous Critical Points.
    Exercises: come up with questions.
Chapter VII
    In Section VII.2: find out about finding the best bi-chromatic matching in .
Chapter VIII
    Write the introduction to Deformation.
    Write Section VIII.1 on Molecular Dynamics.
    Write Section VIII.2 on Spheres in Motion.
    Write Section VIII.3 on Rigidity.
    Write Section VIII.4 on Shape Space.
Chapter IX
    Exercises: come up with questions.
Chapter X
    Write a new chapter on area and volume derivatives and related topics.
    Write a section on the Weighted Area Derivative.
    Write a section on the Weighted Volume Derivative.

Chapter I
Bio-molecules

This chapter discusses the three main classes of organic macromolecules involved in the hereditary and life maintenance mechanisms of living beings: DNA, RNA, and proteins. According to the central dogma of biology, proteins are created in two steps from DNA, which carries the genetic information:

    DNA --(transcription)--> RNA --(translation)--> Protein
    DNA --(replication)--> DNA

We talk briefly about the processes indicated by the three arrows and focus on the structure of the players involved. DNA is the stuff that genetic material is made of. RNA is mostly but not entirely an intermediate product copying portions of the DNA (transcription) and turning this information into working proteins (translation). Proteins act like machines that define the cell cycle as an ongoing process. Each cell is like a society whose members have specialized tasks, which they accomplish in a complicated net of interactions. All mentioned molecules are between large and huge. They are relatively simple locally but exceedingly complicated in their totality. Because of the complexity and the large variety, it should not be surprising that there are exceptions to almost everything meaningful that can be said about them. Perhaps it is more surprising that anything of broad validity can be said at all.

We begin by describing the chemical structure of DNA and RNA in Section I.1. We then explain the translation from RNA to proteins in Section I.2 and talk about the structural organization of proteins in Section I.3. Finally, we present some of the fundamental premises and results of molecular mechanics in Section I.4.

I.1 DNA and RNA
I.2 Proteins and Amino Acids
I.3 Structural Organization
I.4 Molecular Mechanics
Exercises

I.1 DNA and RNA

DNA (or deoxyribonucleic acid) is the material that forms the genome, which is a complete set of the genetic material of a living organism. As discovered by Watson and Crick in 1953, DNA consists of two strands of nucleotides twisted into the shape of a double helix, as depicted in Figure I.1. We begin at the small scale and work our way up the multi-scale structure of DNA. Compared to standard genomics texts, the treatment of DNA in this section is coarse and lacks many important details.

Figure I.1: A short piece of the DNA double-helix, with atoms shown as tightly packed and partially overlapping spheres.

Chemical structure of DNA. DNA has three chemical components: phosphate, deoxyribose sugar, and four nitrogenous bases, namely adenine, guanine, cytosine, and thymine. The first two bases are double-ring and the last two are single-ring structures. The chemical components are arranged in groups called nucleotides, each composed of a phosphate group, a deoxyribose sugar, and one of the four bases. A nucleotide is conveniently referred to by the first letter of its base. Figure I.2 sketches the chemical structure of the nucleotide A and shows the chemical structures of the remaining three bases. We obtain the nucleotides G, C and T by substituting the corresponding base for adenine in Figure I.2. We use boldface edges to connect atoms that are joined by two covalent bonds. The covalent bonding in the ring structures of the nitrogenous bases is more interesting. All atoms in the ring share electrons as a group and we draw some double bonds just to indicate the total number of extra shared electrons. For example, the hexagonal ring of cytosine has a total of eight covalent bonds, which we may think of as four thirds of a covalent bond between every contiguous pair.

Figure I.2: The chemical structure of the DNA nucleotide with adenine as the nitrogenous base above, and the chemical structures of the other three nitrogenous bases (guanine, cytosine, thymine) below.

Double helix. The two strands of DNA are held together by weak hydrogen bonds between complementary bases, forming the structure of a spiraling staircase. The backbone of each strand is a repeating phosphate-deoxyribose sugar polymer. The phosphate and the sugar groups in the backbone are connected by phosphodiester bonds. The attachment of these bonds to the sugar groups is illustrated in Figure I.3. The carbons of the sugar group are numbered from 1’ to 5’. One part of the phosphodiester bond is between the phosphate and the 5’-carbon, and the other is between the phosphate and the 3’-carbon. We think of the backbone as oriented in the direction of the path that starts at the 5’-carbon, passes through the 4’-carbon, and ends at the 3’-carbon. In the double stranded DNA molecule, the two backbones are in opposite, or anti-parallel, orientation.

The bases are attached to the 1’-carbons. Interactions between base pairs hold the two strands together. Adenine interacts with thymine and guanine with cytosine. The two bases of a pair are said to be complementary. This implies that the sequence of bases along one strand determines the
sequence of bases along the other: reverse the reading direction and replace each base by its complement,

    5’ AATCGCGTACGCG 3’
    3’ TTAGCGCATGCGC 5’

Replication is based on this simple rule of complementarity and makes essential use of the relatively weak bonds between the two strands. A protein machine builds new DNA strands by separating the two old strands and complementing each by a new anti-parallel strand.

Figure I.3: Chemical structure of a very short segment of DNA. The numbers 1’ to 5’ order the carbon atoms of each sugar group, giving each strand an orientation. The dotted connections between the nitrogenous bases indicate hydrogen bonds.

Chromosomes. Each cell of an organism contains a copy of the entire genome. In the case of a human cell, this amounts to about two meters of DNA partitioned into twenty-three pairs of chromosomes per cell. The body has about 10^13 cells, totaling about 2 × 10^13 meters of DNA, which is more than a hundred times the distance between the earth and the sun. Since humans are small relative to that distance, this implies that the DNA must be thin and efficiently packed. Indeed, each chromosome is a long thread (a double-strand) that is densely folded around protein scaffolds.

How is a long thread of DNA converted into the relatively thick and worm-like structure visible through the electron microscope? On the lowest level, the DNA is wrapped twice around a configuration of eight histones (a special protein). The beads of wrapped histones assume a coiled structure (a solenoid) stabilized by another type of histone that runs along its central axis. It takes one more level of packaging to convert the solenoid into the three-dimensional structure we call a chromosome. This higher level uses a core scaffold made of another enzyme, topoisomerase II. This enzyme has the ability to pass a strand of DNA through another, which is a much needed operation during packing and unpacking the DNA. The best evidence suggests that the solenoid arranges in loops emanating from the scaffold, which itself assumes the form of a spiral.

Chemical structure of RNA. A gene is a subsequence of the DNA capable of being transcribed to produce a functional RNA molecule. Note that this definition depends on the rather complicated process of transcription, which can fail for a variety of reasons. We begin by looking at the chemical features of RNA. There are three main differences from DNA.

1. RNA is a single-stranded nucleotide chain and can therefore assume a much greater variety of geometric shapes than DNA.
2. RNA has ribose sugar in its nucleotides, which differs from deoxyribose sugar by one additional oxygen atom.
3. RNA nucleotides carry the bases adenine, guanine, and cytosine, but substitute uracil for thymine found in DNA. Uracil forms hydrogen bonds with adenine just as thymine does.

Figure I.4 illustrates the chemical difference between RNA and DNA by showing a ribonucleotide containing uracil.

Figure I.4: Chemical structure of the RNA nucleotide with uracil as the nitrogenous base.

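The complementarity rule stated earlier in this section (reverse the reading direction and replace each base by its complement) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: strands are plain strings over A, C, G, T written in the 5’ to 3’ direction, and the names COMPLEMENT and reverse_complement are chosen here and do not come from the text.

```python
# Watson-Crick pairing for DNA as described in the text: A-T and G-C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, read in its own 5' to 3' direction.

    `strand` is a string over A, C, G, T written from its 5' end to its 3' end.
    """
    # Reverse the reading direction, then replace each base by its complement.
    return "".join(COMPLEMENT[base] for base in reversed(strand))


if __name__ == "__main__":
    top = "AATCGCGTACGCG"             # 5' -> 3', the example strand from the text
    bottom = reverse_complement(top)  # the other strand, read 5' -> 3'
    print("5'", top, "3'")
    print("3'", bottom[::-1], "5'")   # printed 3' -> 5' so the pairs line up
```

Reading the second strand backwards, as in the last print statement, reproduces the aligned two-line display used in the text. Applying the function twice returns the original strand, which reflects that each strand determines the other.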
RNA is classified into different types depending on its function. The vast majority is messenger RNA (or mRNA), which acts as an intermediary structure in the synthesis of proteins. There is also functional RNA produced by a small number of genes, which is not translated into protein. Examples are transfer RNA (or tRNA), which brings amino acids to the mRNA during the translation process, and ribosomal RNA (or rRNA), which helps coordinate the assembly of amino acids to proteins.

Transcription. The transcription process, which makes RNA, is similar to the replication process of DNA. During the transcription of a gene, the two strands of DNA are separated locally, and one strand acts as a template for RNA synthesis. Free ribonucleotides align along the DNA template. The process is catalyzed by another protein machine, the RNA polymerase complex, which moves along the DNA adding ribonucleotides to the growing RNA, as sketched in Figure I.5. The resulting RNA sequence is the same as the non-template sequence of the gene, except that U replaces T. Electron microscope pictures show that the transcription of DNA to RNA is a highly parallel process in which a row of RNA polymerase complexes follow each other along the gene and produce RNA concurrently. Each individual transcription works in three steps.

Figure I.5: The RNA grows in the 5’ to 3’ direction, in this case by adding a nucleotide carrying uracil to the chain.

Initiation. RNA polymerase binds to a promoter segment of DNA located in front of the gene. It then unwinds the DNA and begins the synthesis of an RNA molecule.

Elongation. RNA polymerase moves along the DNA, maintaining a transcription bubble to expose the template strand. It compares free ribonucleotides with the next exposed DNA base and adds a complementary match.

Termination. Specific sequences in the DNA signal the chain termination by triggering the release of the RNA strand and the polymerase.

A gene is thus not only marked but indeed defined by the promoter segment preceding and the terminating sequence succeeding it.

Bibliographic notes. The idea that traits are hereditary is old, but the detailed mechanism by which it comes about started to unfold only recently. The groundwork for our current understanding was laid in the nineteenth century by Gregor Mendel, when he discovered the basic rules of the hereditary mechanism [2]. An English translation of this work can be found in [3]. It was long known that DNA is critically involved in that mechanism, but it took until the work of Watson and Crick in 1953 to discover the chemical structure of DNA [5, 6]. The book by Watson [4] is an enjoyable personal account of the years preceding the discovery of that structure. Today there are many books on the subject, and most of the material in this section is taken from [1, Chapters 2 and 3].

[1] A. J. F. GRIFFITHS, W. M. GELBART, J. H. MILLER AND R. C. LEWONTIN. Modern Genetic Analysis. Freeman, New York, 1999.
[2] G. MENDEL. Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines, Abhandlungen, Brünn 4 (1866), 3–47.
[3] C. STERN AND E. R. SHERWOOD. The Origin of Genetics: A Mendel Source Book. Freeman, 1966.
[4] J. D. WATSON. The Double Helix. Atheneum, New York, 1981.
[5] J. D. WATSON AND F. H. C. CRICK. Molecular structure of nucleic acid. A structure for deoxyribose nucleic acid. Nature 171 (1953), 737–738.
[6] J. D. WATSON AND F. H. C. CRICK. Genetic implications of the structure of deoxyribonucleic acid. Nature 171 (1953), 964–967.

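As a small illustration of the transcription rule described in this section, the following Python sketch builds the RNA that pairs with a given DNA template strand and checks that it equals the non-template strand with U in place of T. It models only the base pairing of the elongation step; promoters, termination signals, and the polymerase itself are ignored, and the names PAIR, transcribe, and non_template are hypothetical helpers introduced here.

```python
# Ribonucleotide that pairs with each exposed DNA template base.
PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
# DNA-DNA pairing, used only to recover the non-template strand.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template: str) -> str:
    """RNA written 5' to 3' for a DNA template strand written 5' to 3'."""
    # The RNA is anti-parallel to the template: it grows 5' -> 3' while the
    # template is traversed from its 3' end towards its 5' end.
    return "".join(PAIR[base] for base in reversed(template))


def non_template(template: str) -> str:
    """The other DNA strand of the gene, written 5' to 3'."""
    return "".join(DNA_PAIR[base] for base in reversed(template))


if __name__ == "__main__":
    template = "ATCG"  # a made-up four-base template, 5' -> 3'
    rna = transcribe(template)
    print(rna)
    # Same as the non-template strand, except that U replaces T.
    print(rna == non_template(template).replace("T", "U"))
```

The choice to write every strand 5’ to 3’ keeps the orientation explicit; the reversal inside transcribe is where the anti-parallel pairing enters.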
I.2 Proteins and Amino Acids

Proteins are polypeptide chains obtained by translation from strands of messenger RNA. In this section, we sketch the translation process and discuss the chemical structure of proteins.

Chemical structure. A protein is a linear sequence of amino acids connected to each other by peptide bonds. Each amino acid consists of a central carbon atom, the α-carbon, linked to an amino group, a carboxyl group, one hydrogen atom, and a side-chain. Amino acids that are linked into a polypeptide chain are referred to as residues. Different residues are distinguished by their side-chains. As shown in Figure I.6, two amino acids are linked by a peptide bond whose creation releases water. The resulting repeating sequence of nitrogen, α-carbon and carbon atoms is the backbone of the protein.

Figure I.6: Two amino acid residues joined by a peptide bond.

The four neighbors of an α-carbon, Cα, are at the vertex positions of a tetrahedron around Cα. This tetrahedron has two orientations, one being the mirror image of the other, as illustrated in Figure I.7. The two oriented forms are referred to as isomers and distinguished by letters L and D. Only L-amino acids occur in nature as building blocks of proteins.

Figure I.7: The two isomers of an amino acid.

Amino acids. Among a much larger variety of amino acids, nature uses only twenty to build proteins. We list their names together with their three-letter codes and single-letter abbreviations in Table I.1.

    Alanine        Ala  A      Methionine  Met  M
    Cysteine       Cys  C      Asparagine  Asn  N
    Aspartate      Asp  D      Proline     Pro  P
    Glutamate      Glu  E      Glutamine   Gln  Q
    Phenylalanine  Phe  F      Arginine    Arg  R
    Glycine        Gly  G      Serine      Ser  S
    Histidine      His  H      Threonine   Thr  T
    Isoleucine     Ile  I      Valine      Val  V
    Lysine         Lys  K      Tryptophan  Trp  W
    Leucine        Leu  L      Tyrosine    Tyr  Y

Table I.1: Names, codes and abbreviations of the twenty amino acids that occur as building blocks of natural proteins.

As can be seen in Figures I.8 and I.9, residues differ widely in size and structure. The fifteen amino acids sketched in Figure I.8 may be viewed as trees rooted at the α-carbon, which is part of the backbone. Most of the internal nodes are carbon atoms, with rare occurrences of oxygen, nitrogen and sulfur atoms. As before, we mark double and partially double bonds by boldface edges.

Figure I.8: The fifteen amino acids without cycle in their chemical structure (valine, isoleucine, leucine, asparagine, glycine, alanine, threonine, aspartate, serine, cysteine, arginine, lysine, methionine, glutamate, glutamine). The shaded circle is the α-carbon on the backbone. All unlabeled nodes are either carbon or hydrogen atoms.

Four of the five amino acids
sketched in Figure I.9 have pentagonal and hexagonal ring structures. The fifth amino acid is proline, which forms a cycle by having its chain connect back to the nitrogen next to the α-carbon along the backbone. This unique feature locally restricts the flexibility of the backbone, as will be discussed in Section I.3.

Figure I.9: The five amino acids with cyclic chemical structure (proline, tryptophan, tyrosine, phenylalanine, histidine).

Genetic code. The translation process is more involved than transcription because it converts information between two languages that use different alphabets. The sequence of nucleotides is read consecutively in groups of three, called codons. Since there are four different types of nucleotides, we have 4^3 = 64 codons. There are only twenty amino acid residues, which implies that the map is not injective but uses redundancy to reduce the number of outcomes. The complete map is shown in Table I.2. The codon XYZ is mapped to one of the residues in the row of X and the column of Y. The four positions inside that slot correspond to A, G in the first row and C, U in the second row.

            A          G          C          U
    A    Lys Lys    Arg Arg    Thr Thr    Ile Met
         Asn Asn    Ser Ser    Thr Thr    Ile Ile
    G    Glu Glu    Gly Gly    Ala Ala    Val Val
         Asp Asp    Gly Gly    Ala Ala    Val Val
    C    Gln Gln    Arg Arg    Pro Pro    Leu Leu
         His His    Arg Arg    Pro Pro    Leu Leu
    U                   Trp    Ser Ser    Leu Leu
         Tyr Tyr    Cys Cys    Ser Ser    Phe Phe

Table I.2: The genetic code. The start codon is AUG and maps to methionine. Empty entries correspond to the stop codons, which are UAA, UAG, and UGA.

The translation is accomplished by transfer RNA molecules that recognize codons through the same binding mechanism used for replication and transcription. Some residues correspond to more codons than others. The redundancy is in part due to multiple tRNA molecules carrying the same residue and in part because there is flexibility in how the tRNA reads the codons. In many cases, an accurate match at the first two positions suffices and a mismatch at the third position can be tolerated. This explains the relative uniformity among the four residues in any one slot of Table I.2.

Since codons are triplets of nucleotides, there are apparently three possible reading frames, each producing an entirely different residue sequence. The correct reading frame is identified by starting the translation always at a start codon, AUG. The initiator tRNA is a specific transfer RNA that recognizes this sequence and binds to methionine. Incidentally, it differs from the tRNA that binds to the AUG codon in the middle of the sequence, although that one also binds to methionine.

Figure I.10: Transfer RNA with anti-codon at the bottom, covalently attached amino acid at the top, and complementary substrings shown.

Translation. As mentioned above, the tRNA molecules are instrumental in translating codons into residues. Each tRNA is a short sequence of about 80 nucleotides. Complementary subsequences form double-helix substructures that further fold up to characteristic ‘clover leaf’ formations, one of which is sketched in Figure I.10. A tRNA
molecule matches the exposed codon of the mRNA with its anti-codon and contributes its residue to the polypeptide chain that grows at the other end. The codon and anti-codon are matched in anti-parallel orientation, as always.

The translation process is facilitated by the ribosome, which is a large complex made from more than 50 different proteins and several RNA molecules. It consists of a small subunit and a large subunit, which come together around an mRNA strand with the help of the initiator tRNA that contributes the first residue. The ribosome scans through the strand like a tape reader. For each codon, it finds a tRNA with matching anti-codon and appends its amino acid as a residue to the carboxyl end of the growing polypeptide chain. The orientation of the mRNA strand from the 5’- to the 3’-end is thus preserved by the orientation of the polypeptide chain from the amino group of the first to the carboxyl group of the last residue. The translation process ends when a stop codon is read. The protein chain and the mRNA are released and the ribosome dissociates into its two subunits.

Similar to transcription, the translation of an mRNA strand into a protein happens in parallel, with several ribosomes working concurrently and in sequence along the strand. In some cases, the translation even starts during transcription, before the mRNA strand is complete.

Bibliographic notes. Most of the twenty amino acids that occur in proteins were identified in the nineteenth century. After the determination of the DNA structure in 1953, it took only a few years for the community to agree on the central dogma, and a few more years to decipher the genetic code on which the dogma is based. The geometric structure of the ribosome has recently been resolved by x-ray crystallography [2]. The material of this section is taken from [1, 3, 6], all three of which are comprehensive texts in their respective fields. Considerably shorter and more focused descriptions of proteins and protein structures can be found in [4, 5].

[1] B. ALBERTS, D. BRAY, A. JOHNSON, J. LEWIS, M. RAFF, K. ROBERTS AND P. WALTER. Essential Cell Biology. An Introduction to the Molecular Biology of the Cell. Garland, New York, 1998.
[2] N. BAN, P. NISSEN, J. HANSEN, P. B. MOORE AND T. A. STEITZ. The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science 289 (2000), 905–920.
[3] T. E. CREIGHTON. Proteins: Structures and Molecular Properties. Second edition, Freeman, New York, 1993.
[4] N. J. DARBY AND T. E. CREIGHTON. Protein Structure. Oxford Univ. Press, England, 1993.
[5] P. C. E. MOODY AND A. J. WILKINSON. Protein Engineering. Oxford Univ. Press, England, 1990.
[6] L. STRYER. Biochemistry. Third edition, Freeman, New York, 1988.

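The codon map of Table I.2 and the reading-frame rule of this section can be sketched as a short Python program. It is a simplified model rather than a description of the ribosome: the mRNA is a string over A, C, G, U written 5’ to 3’, translation is assumed to start at the first AUG found, and the names BASES, AMINO, CODON_TABLE, and translate are introduced here for illustration. The string AMINO lists the standard genetic code in one-letter abbreviations (see Table I.1), with "*" standing for a stop codon.

```python
BASES = "UCAG"
# Standard genetic code, one letter per codon. Codons are enumerated in the
# order UUU, UUC, UUA, UUG, UCU, ... with the first base varying slowest.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    x + y + z: AMINO[16 * i + 4 * j + k]
    for i, x in enumerate(BASES)
    for j, y in enumerate(BASES)
    for k, z in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5' -> 3') into one-letter residue codes."""
    start = mrna.find("AUG")  # the start codon fixes the reading frame
    if start < 0:
        return ""             # no start codon, nothing is translated
    residues = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":    # stop codon: the chain is released
            break
        residues.append(residue)
    return "".join(residues)


if __name__ == "__main__":
    print(len(CODON_TABLE))               # 4^3 codons
    print(translate("GGAUGAAAUGGUAAUU"))  # AUG AAA UGG UAA
```

The first residue is always methionine because the frame is anchored at AUG, and the residues are appended in the 5’ to 3’ order of the codons, mirroring the orientation of the growing chain described above.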
I.3 Structural Organization

We cannot hope to understand proteins without a good grasp of their multi-level structural organization. Most surprisingly, identical proteins fold up into the same shape, and this is really the reason why geometry plays an important role in their study.

Bond rotation. Consider the three bonds from one α-carbon to the next along a protein backbone, and refer to them as a peptide unit. Figure I.6 shows its chemical and Figure I.11 its geometric structure. Because of partial double-bond character, there is no freedom to rotate around the peptide bond, which is the link between the carbon and the nitrogen atoms. There are however two possible planar configurations: the trans form, in which Cα-C-N-Cα is relatively stretched (zig-zag), and the cis form, in which it curves in one direction (zig-zig). The two forms are distinguished by the rotation angle along the C-N bond, ω, which by convention is 180° for the trans and 0° for the cis form. In contrast, the links between the α-carbon and the carbon and nitrogen atoms are single bonds with one-dimensional rotational degrees of freedom. As shown in Figure I.11, φ measures the rotation around the N-Cα bond, and ψ measures the rotation around the Cα-C bond. Again by convention, φ = 180° and ψ = 180° for the two coplanar trans forms.

Figure I.11: The planarity of a peptide bond is caused by its partial double-bond character. The φ and ψ angles measure rotations around the bonds preceding and succeeding every α-carbon atom.

Ramachandran plot. The conformation of the backbone is completely determined when φ, ψ, and ω are specified for each residue in the chain. A given residue prohibits some angles because of steric hindrances, which are physically prohibited collisions between atoms. A larger residue will generally prohibit a larger range of angles than a smaller one. The realizable angle pairs are visualized as a subset of the square of all (φ, ψ) angle pairs. This so-called Ramachandran plot for glycine is sketched in Figure I.12. The side-chain of glycine is only H, which is the reason that a relatively large portion of the square of angle pairs is realizable. An interesting residue in this respect is proline, which differs from all others because it binds back to the backbone, and in this way restricts the rotational degree of freedom to a small region.

Figure I.12: The square represents all (φ, ψ) angle pairs and the shading indicates the region of disallowed pairs for glycine.

Two common motifs. A motif that is commonly observed in proteins is the α-helix, whose backbone forms a right-handed helix. Contiguous α-carbons are separated by about 100° in the rotation direction and 1.5 Å rise, which is measured along the axis. A rotation takes about 3.6 residues and produces an axial separation of about 5.4 Å. The structure is stabilized by hydrogen bonds between every CO group and the NH group four residues later. All side-chains lie outside the helix structure. The characteristic dihedral angles for a right-handed α-helix are roughly and . Cartoon representations of protein structures usually draw α-helices as tubes. In Figure I.13 the tubes are visible as spiral sections of the ribbon.

Another recurring motif is the β-sheet, which is flat and made up of several strands. A strand can be obtained by stretching the α-helix until the axial distance between two contiguous α-carbons reaches about Å. The stabilizing hydrogen bonds are between neighboring strands, which can run in the same direction (parallel) or in opposite directions (anti-parallel). They combine strands to sheets.
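The rotation angles φ, ψ, and ω introduced under Bond rotation are dihedral angles, each determined by four consecutive backbone atoms. The sketch below computes such an angle from four points in space using a standard atan2 formulation; it is an illustration under stated assumptions, not code from the text. The function name dihedral and the tuple-based helpers are chosen here. With this formulation a planar trans arrangement gives 180° and a planar cis arrangement gives 0°, in agreement with the convention stated for ω.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for the rotation around the bond p1-p2."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Made-up coordinates: the central bond runs along the z-axis.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # planar cis
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # planar trans
```

For a residue, φ would be evaluated on the atoms C (of the previous residue), N, Cα, C, ψ on N, Cα, C, N (of the next residue), and ω on Cα, C, N, Cα (of the next residue); this assignment is the usual one and is assumed here rather than taken from the text.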
Both options are illustrated in Figure I.14.

Figure I.13: Ribbon diagrams visualize proteins by emphasizing the backbone as it winds its way through the structure.

Figure I.14: Two parallel β-strands to the left and two anti-parallel ones to the right. The dotted edges represent stabilizing hydrogen bonds.

Protein architecture and function. It is common to distinguish four levels of organization in the description of protein architecture:

Primary structure refers to the sequence of residues along the oriented polypeptide chain.
Secondary structure refers to the spatial arrangement of residues that are near each other along the chain.
Tertiary structure refers to the spatial arrangement of residues that are far from each other along the chain.
Quaternary structure refers to the spatial arrangement of subunits of a protein.

A single protein may indeed contain more than one polypeptide chain. Each chain forms what we call a subunit, and quaternary structure addresses questions about their relative position and interaction. The description of quaternary structure includes the rather weak van der Waals forces, which affect atoms in short distance (within about Å). Although this force is weak compared to others, its accumulated influence is significant if two subunits have geometrically complementary shapes that permit a large number of atom pairs within the reach of the force. This accumulated effect thus prefers interactions between geometrically complementary shapes. In biology, this fact is expressed by saying that the van der Waals force creates specificity in the interaction. That specificity plays a dominant role also in protein-protein and in protein-ligand interactions. A protein typically has a few regions embedded in its surface, so-called active sites, that are specific to interactions with other molecules. While active sites usually occupy only a small fraction of the surface, they decide protein function. Evidence for that claim can be provided by mutating a protein and distinguishing between mutations that preserve and mutations that change the active sites.

Structure determination. Even though proteins are large molecules that typically consist of a few thousand atoms, they are not visible under an electron microscope. How do we then know anything about the structural organization of proteins? The primary source today is x-ray diffraction from protein crystals, but there are others, most notably images generated from nuclear magnetic resonance (or NMR) experiments. Both methods are complicated and laborious. We only scratch the surface by explaining the principal steps in the reconstruction of protein structures from x-ray diffractions:

1. Prepare a protein crystal.
2. Expose the crystal to x-ray beams and collect the diffractions.
3. Compute the electron density and from it derive the structure.

The x-ray experiment does not determine the element identities of the atoms, which have to be obtained from the known chemical structure threaded into the density. Since there are probably hundreds of thousands of different proteins, it would be desirable to automate the process. It seems that Step 1 is the main obstacle in reaching this goal,
in part because some proteins are not known to form crystals at all. Step 2 requires an x-ray source, a device to rotate the crystal by small angles ( or less), and a detection device. For each angle, we get a two-dimensional picture of diffractions. The three-dimensional electron density is computed from a whole array of such pictures. A typical level surface of an electron density is shown in Figure I.15. The main mathematical tool in the construction of the electron density is the Fourier transform. A fundamental difficulty in this step is that only the amplitudes (intensities) of the waveforms are observable, while the phase information must be obtained by different means.

Figure I.15: The so-called chicken wire representation of a level surface of a three-dimensional density.

Protein data banks. After completing the structural study of a crystallized protein, investigators usually send their results to the Protein Data Bank, which is a public repository of protein structures described in so-called PDB files. At the beginning of each file we find ancillary information, including the header, the name of the protein, the author, the reference to the corresponding journal article, etc. There is also information about non-standard components and about secondary structure elements. The main body of the file lists the coordinates of the observed atoms. They are always given in an orthonormal coordinate system, in which the length unit is one angstrom. Table I.3 illustrates the format by showing a small portion of a PDB file for hemoglobin, listing the atoms of an arginine residue. Note that there are no hydrogen atoms, since they are too small to be resolved by an x-ray experiment.

    record   atom   residue
    ATOM     N      ARG
    ATOM     CA     ARG
    ATOM     C      ARG
    ATOM     O      ARG
    ATOM     CB     ARG
    ATOM     CG     ARG
    ATOM     CD     ARG
    ATOM     NE     ARG
    ATOM     CZ     ARG
    ATOM     NH1    ARG
    ATOM     NH2    ARG

Table I.3: Incomplete records of the atoms that belong to an arginine residue. CA is the α-carbon atom, CB the β-carbon, etc.

Bibliographic notes. The Ramachandran plot for realizable bond rotations goes back to work by Ramachandran and Sasisekharan [6]. The α-helix was suggested as a common motif in proteins by Pauling and collaborators in 1951 [4], and in the same year they also identified the β-sheet [3]. This was a few years before these motifs were observed in x-ray experiments. In the late 1950s, Max Perutz reconstructed the structure of hemoglobin from x-ray diffraction data [5], and John Kendrew did the same for myoglobin. A classic text on the x-ray crystallography method is [2]. The material on x-ray crystallography and PDB files presented in this section is taken from [1].

[1] L. J. Banaszak. Foundations of Structural Biology. Academic Press, San Diego, California, 2000.

[2] T. Blundell and L. Johnson. Protein Crystallography. Academic Press, New York, 1976.

[3] L. Pauling and R. B. Corey. Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets. Proc. Natl. Acad. Sci. USA 37 (1951), 729–740.

[4] L. Pauling, R. B. Corey and H. R. Branson. The structure of proteins: two hydrogen-bonded helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA 37 (1951), 205–211.

[5] M. F. Perutz. X-ray analysis of hemoglobin. Les Prix Nobel, Stockholm, 1963.

[6] G. N. Ramachandran and V. Sasisekharan. Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7 (1963), 95–99.
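As a companion to the paragraph on PDB files in Section I.3, the following minimal sketch shows how the atom name, residue name and coordinates might be read from an ATOM record. It assumes the fixed-column layout of the published PDB format (atom name in columns 13-16, residue name in columns 18-20, coordinates in columns 31-54). The helper names `format_atom` and `parse_atom` and the coordinates in the demo are hypothetical and are not taken from the hemoglobin file mentioned in the text.

```python
def format_atom(serial, name, resname, chain, resseq, x, y, z):
    """Build a fixed-column ATOM record (hypothetical helper for the demo).

    Only atom names of up to three characters are supported here.
    """
    return (f"ATOM  {serial:>5d}  {name:<3s} {resname:>3s} {chain}{resseq:>4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atom(line):
    """Return the main fields of an ATOM record, or None for other records."""
    if not line.startswith("ATOM"):
        return None
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16, e.g. CA
        "resname": line[17:20].strip(),   # columns 18-20, e.g. ARG
        "chain": line[21],                # column 22
        "resseq": int(line[22:26]),       # columns 23-26
        # columns 31-38, 39-46, 47-54: coordinates in angstrom
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    records = [
        format_atom(1, "N", "ARG", "A", 1, 11.0, 8.5, -2.25),
        format_atom(2, "CA", "ARG", "A", 1, 12.5, -3.25, 40.0),
    ]
    for rec in records:
        print(parse_atom(rec))
```

Slicing by column rather than splitting on whitespace is deliberate in this sketch: adjacent fields in a PDB file may touch when a coordinate uses its full width.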
I.4 Molecular Mechanics

After a protein has been created by translation, it folds into a shape, or conformation, that is determined by its sequence of residues. The folding process is a reaction to a multitude of forces that simultaneously act on every part of the protein. This section presents some of the current knowledge and efforts to model these forces. We begin by studying atoms and discuss covalent and non-covalent forces.

Atoms. Each atom has a positively charged massive nucleus, which is surrounded by a cloud of negatively charged electrons. The nucleus consists of protons, each contributing a unit positive charge, and of electrically neutral neutrons. The electrons are held in orbit by electrostatic attraction to the nucleus. Each electron has one unit of negative charge, which exactly neutralizes the positive charge of one proton. In total, we have the same number of protons and electrons and thus an electrically neutral atom, as illustrated in Figure I.16. Different elements consist of atoms with different numbers of protons. The atomic number is by definition the number of protons, which is also the number of electrons. The number of neutrons is usually about the same because too few or too many neutrons destabilize the nucleus. The atomic weight is the ratio of the mass of the atom over the mass of a single hydrogen atom. Because the mass of an electron is negligible, the atomic weight is almost exactly the number of protons plus the number of neutrons.

Figure I.16: A schematic picture of a hydrogen atom to the left and a carbon atom to the right.

Avogadro's number is useful in translating from the minuscule world of single atoms into a humanly more accessible scale. It is the number of hydrogen atoms in one gram of hydrogen, which is roughly $6.02 \times 10^{23}$. The mass of one hydrogen atom is therefore roughly $1.66 \times 10^{-24}$ gram which, by definition, is one dalton. One mole of an element is Avogadro's number of its atoms. In other words, if the mass of one atom of that element is $m$ daltons then the mass of one mole is $m$ grams. Table I.4 lists properties of elements that are commonly found in organic matter.

    element      symbol   #p   #n   electrons per shell
    Hydrogen     H         1    0   1
    Carbon       C         6    6   2, 4
    Nitrogen     N         7    7   2, 5
    Oxygen       O         8    8   2, 6
    Sodium       Na       11   12   2, 8, 1
    Magnesium    Mg       12   12   2, 8, 2
    Phosphorus   P        15   16   2, 8, 5
    Sulfur       S        16   16   2, 8, 6
    Chlorine     Cl       17   18   2, 8, 7
    Potassium    K        19   20   2, 8, 8, 1
    Calcium      Ca       20   20   2, 8, 8, 2

Table I.4: Some elements together with their numbers of protons, neutrons and electrons distributed in the shells around the nucleus.

Covalent bonds. According to the Bohr model, electrons live in shells around the nucleus and populate inner shells before using outer ones. The first three shells from inside out can hold up to 2, 8 and 8 electrons, as indicated in Table I.4. The chemical properties of an atom are defined by the tendency to either empty or complete its partially incomplete shell, if any. One way of doing that is by sharing electrons. The shared electrons complete the outermost non-empty shells of both atoms involved. According to Table I.4, carbon, nitrogen and oxygen need four, three and two electrons to fill their outer shells. As illustrated in Figure I.17, this can for example be done by covalently binding to the same number of hydrogen atoms. We can now define a molecule as a connected component of the graph whose vertices are the atoms and whose edges are the covalent bonds.

Figure I.17: The geometry of covalent bonding for carbon, nitrogen, and oxygen.

When an atom covalently bonds to more than one other atom, then there is a preferred angle between pairs of bonds. For example for carbon, this angle is what we get by connecting the centroid of a regular tetrahedron with two of the vertices. Using elementary geometry we find this angle is $\arccos(-1/3) \approx 109.5^\circ$. Two atoms can also form a covalent double bond, which forces the nuclei closer together and is stronger than the corresponding single bond. It also prevents any torsional rotation around that bond, which is possible for single bonds. We need a sequence of four atoms and three covalent bonds to define the torsional angle of the middle bond. It is generally parametrized relative to the trans (zig-zag) coplanar configuration. For example for H3C-CH3, we have three bonds on each side of the middle bond. There is an energetic preference for staggering the covalent bonds on the two sides, which corresponds to three torsional angles spaced $120^\circ$ apart.

When two atoms that covalently bond are of different type then they generally attract the shared electrons to different degrees. The shared electrons will therefore have a bias towards one end of the structure or the other. We then have a polar structure in which the positive charge is concentrated on one end and the negative charge on the other. Examples of polar covalent bonds are between hydrogen and oxygen and between hydrogen and nitrogen, as illustrated in Figure I.17. In contrast, the bond between hydrogen and carbon has the electrons attracted much more equally and is relatively non-polar.

Non-covalent bonds. An atom can also donate an electron to another atom and thus create a complete outer shell. An example is sodium donating the only electron in its third shell to chlorine, which uses it to complete its third shell. As a result we get positively charged sodium cations and negatively charged chloride anions. Both are attracted to each other by electrostatic force and form a regular grid packing, in which each sodium cation is surrounded by six chloride anions, and vice versa. These arrangements are known as table salt. A weaker interaction, also based on electrostatic force, is generated by polar molecules. A prime example is water, which is partially positively charged at the two hydrogen ends. Water molecules thus tend to aggregate in small semi-regular structures, but this force is weak and bonds of this kind are constantly formed and broken. The polarity of water molecules is the basis for the difference between hydrophilic molecules, which are polar and therefore attract water, and hydrophobic molecules, which are non-polar and do not attract water.

Another non-covalent force is responsible for the van der Waals interaction. Experimental observations point to a potential energy function roughly as graphed in Figure I.18. The corresponding force is the negative derivative, which is interpreted as a balance between an attractive and a repulsive force. The attraction is due to a dispersive force that can be explained using quantum mechanics. The repulsion also has a quantum mechanical explanation in terms of the Pauli principle, which prohibits any two electrons from having the same set of quantum numbers.

Figure I.18 (energy plotted against distance): The van der Waals force is obtained by adding the attractive force (derivative of the dashed curve) and the repulsive force (derivative of the dotted curve).

It is useful to keep the relative strengths of the various forces in mind. Table I.5 gives estimates of the amount of energy necessary to break one mole of bonds.

    bond type        strength in vacuum   strength in water
    covalent               90.0                 90.0
    ionic                  90.0                  3.0
    hydrogen                4.0                  1.0
    van der Waals           0.1                  0.1

Table I.5: Relative strength measured in kilo-calories per mole necessary to break the bonds. Water molecules interfere with ionic and hydrogen bonds, which are therefore considerably weaker in a solution than in a vacuum.

Force field. To get a handle on how molecules move, we define the potential energy of a system of atoms. The general assumption is that the system develops towards a minimum. To model the potential energy accurately, we would have to work with quantum mechanics, which is beyond the scope of this book and also beyond the capabilities of current computations for large organic molecules. The alternative is molecular mechanics, which uses classical mechanics to model the forces that act on atoms. The
simplest such model sums five contributions to the potential energy, three accounting for covalent bonds and two for non-covalent bonds. We use a vector $x \in \mathbb{R}^{3n}$ to describe the state of a system of $n$ atoms and define the potential energy as a function $V : \mathbb{R}^{3n} \to \mathbb{R}$. In its simplest form, that energy is written as

\[
V(x) = \sum_{\text{bonds}} \frac{k_b}{2} (\ell - \ell_0)^2
     + \sum_{\text{angles}} \frac{k_a}{2} (\theta - \theta_0)^2
     + \sum_{\text{torsions}} \frac{k_t}{2} \bigl(1 + \cos(p\,\omega - \gamma)\bigr)
     + \sum_{\text{atoms } i<j} \frac{q_i q_j}{4 \pi \varepsilon\, r_{ij}}
     + \sum_{\text{atoms } i<j} 4 \epsilon_{ij} \left[ \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right].
\]

This formula contains various constants that depend on the type of atom or interaction involved. We briefly look at each one of the five terms.

Bond length. The first sum approximates the energy penalty for differing from the reference length, $\ell_0$, by a quadratic function. The strength, $k_b$, is relatively large, namely several hundred kilo-calories per mole.

Bond angle. The second sum approximates the energy penalty for differing from the reference angle, $\theta_0$, again by a quadratic function. The strength, $k_a$, is considerably less than for bond length, namely about one one-hundredth or even less.

Torsional rotation. The third sum approximates the energy for different torsional angles, $\omega$, around a bond. Angles that lead to staggered arrangements of bonds at both sides are energetically preferred. This preference is modeled by a cosine function with $p$ minima and the same number of maxima.

Electrostatic interaction. The fourth sum adds the electrostatic potential between every pair of atoms in the system. The constants $q_i$ and $q_j$ are the charges, $\varepsilon$ is the dielectric constant of the medium, and $r_{ij}$ is the distance between the two atoms.

Van der Waals interaction. The fifth sum approximates the van der Waals potential by the Lennard-Jones 12-6 function. The collision constant, $\sigma_{ij}$, marks where the function crosses the zero line, and $\epsilon_{ij}$ is the depth of the unique minimum. As before, $r_{ij}$ is the distance between the two atoms.

It is clear that $V$ as defined is only a rough approximation of the real potential energy that drives the behavior of the system. Whether or not that approximation suffices depends on what we use it for.

Molecular dynamics. One of the applications of force fields is the simulation of molecular motion. Let $x : \mathbb{R} \to \mathbb{R}^3$ be the trajectory of a point with mass $m$. Its location at time $t$ is $x(t)$, its velocity is $v(t) = \dot{x}(t)$, and its momentum is $m\,v(t)$. Recall Newton's three laws of motion:

1. A body continues to move in a straight line at constant velocity unless a force acts upon it.
2. The rate of change of the momentum equals the force.
3. To every action there is an equal and opposite reaction.

The rate of change of the velocity is also referred to as the acceleration, $a(t) = \dot{v}(t) = \ddot{x}(t)$. Newton's second law can now be written as $F = m\,a = m\,\ddot{x}$, where $F$ is the force acting upon the point. Suppose we write the force as the negative gradient of a potential function: $F = -\nabla V$, for some $V : \mathbb{R}^3 \to \mathbb{R}$. Using this notation, Newton's second law is expressed by the differential equation $m\,\ddot{x} = -\nabla V(x)$. A trajectory is a solution to this equation. In simple cases, the trajectory can be computed analytically. For example, if the potential is stationary and equal to minus one over the norm, $V(x) = -1/\|x\|$, then $F(x) = -\nabla V(x) = -x/\|x\|^3$. In this case, the generic trajectory is an ellipse with one focus at the origin, as illustrated in Figure I.19. Both the gravitational and the electrostatic potentials have this form.

Figure I.19: A generic trajectory when the magnitude of the attraction to the origin decreases with the square of the distance.

The problem in molecular dynamics is significantly more involved. We have $n$ bodies (atoms) and the energy potential and force depend on the momentary locations of all bodies. As before, we
represent the collection of $n$ atoms by a point $x \in \mathbb{R}^{3n}$. The energy potential is the function $V$ defined earlier, and the force acting on $x$ is $F = -\nabla V(x)$. Newton's second law of motion can now be written as

\[
m\,\ddot{x} = -\nabla V(x),
\]

where the mass vector $m$ multiplies each component of the acceleration vector with the mass of the corresponding atom. The classic two-body problem is the special case in which $n = 2$ and $V$ is the sum of the two corresponding gravitational potentials. In this case, the generic trajectories are again ellipses. Already for three bodies, there is no analytic solution and one has to resort to numerical methods to approximate the trajectories. The problem in molecular dynamics is even more difficult because the potential function is considerably more complicated than a sum of gravitational potentials. The currently available numerical solutions are inadequate to simulate the entire folding process even for small proteins. One of the difficulties in the simulation is the near cancellation of large forces so that relatively weak residuals gain a decisive influence. Even small inaccuracies in the model or the computation can lead to false decisions and possibly spoil the entire remainder of the simulation.

Bibliographic notes. The first half of this section is a highly simplified introduction to atoms and bonds. The material on force fields is taken from Leach [4]. The van der Waals potential derives its name from the work of van der Waals, who quantified the deviation of rare gas from ideal gas behavior. The origin of the force is a fluctuation of electrostatic charge in atoms, and we refer to physics texts such as [1, Chapters 19 and 20] for further details. The explanation of the dispersive contribution in terms of quantum mechanics is due to London [5].

Determining the constants needed to parametrize the mathematical formulation of a force field is far from trivial. The definition of the van der Waals radii used to parametrize the Lennard-Jones functions is just one example. There are various approaches to determine these radii. Bondi [2] looks for the distances of closest approach between atoms to determine van der Waals radii. Jorgensen and Tirado-Rives [3] derive parameters in an attempt to reproduce thermodynamic properties in computer simulations. Finally, Tsai et al. [7] analyse the most common distances between atoms in small molecule crystals in the Cambridge Structural Database. Simulating motion with molecular dynamics is an important topic in computational biology. Numerical algorithms for molecular dynamics can be found in Leach [4] and Schlick [6].

[1] N. W. Ashcroft and N. D. Mermin. Solid State Physics. Harcourt Brace, Orlando, Florida, 1976.

[2] A. Bondi. Molecular Crystals, Liquids and Glasses. Wiley, New York, 1968.

[3] W. L. Jorgensen and J. Tirado-Rives. The OPLS potential functions for proteins. Energy minimization for crystals of cyclic peptides and crambin. J. Amer. Chem. Soc. 110 (1988), 1657–1666.

[4] A. R. Leach. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.

[5] F. London. Zur Theorie und Systematik der Molekularkräfte. Zeitschrift für Physik 63 (1930), 245–279.

[6] T. Schlick. Molecular Modeling and Simulation. Springer-Verlag, New York, 2002.

[7] J. Tsai, R. Taylor, C. Chothia and M. Gerstein. The packing density in proteins: standard radii and volumes. J. Mol. Biol. 290 (1999), 253–266.
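As a companion to the Molecular dynamics paragraph of Section I.4, the following sketch integrates Newton's second law numerically for a single point attracted to the origin. It is an illustration under stated assumptions, not the method of any of the cited texts: the potential is taken to be $V(x) = -1/\|x\|$ in the plane, the integrator is the standard velocity Verlet scheme, and the function names, step size and initial conditions are hypothetical choices made for the demo.

```python
import math


def force(x):
    """F(x) = -grad V(x) for the assumed potential V(x) = -1/|x|."""
    r = math.hypot(x[0], x[1])
    return (-x[0] / r**3, -x[1] / r**3)


def energy(x, v, m=1.0):
    """Kinetic plus potential energy of the point."""
    return 0.5 * m * (v[0]**2 + v[1]**2) - 1.0 / math.hypot(x[0], x[1])


def velocity_verlet(x, v, dt, steps, m=1.0):
    """Integrate m * x'' = F(x) and return the sampled trajectory and final velocity."""
    trajectory = [x]
    f = force(x)
    for _ in range(steps):
        # Half step for the velocity, full step for the position.
        v_half = (v[0] + 0.5 * dt * f[0] / m, v[1] + 0.5 * dt * f[1] / m)
        x = (x[0] + dt * v_half[0], x[1] + dt * v_half[1])
        # Re-evaluate the force at the new position and finish the velocity step.
        f = force(x)
        v = (v_half[0] + 0.5 * dt * f[0] / m, v_half[1] + 0.5 * dt * f[1] / m)
        trajectory.append(x)
    return trajectory, v


if __name__ == "__main__":
    x0, v0 = (1.0, 0.0), (0.0, 0.8)   # bound orbit: total energy is negative
    traj, v_end = velocity_verlet(x0, v0, dt=0.001, steps=4000)
    radii = [math.hypot(px, py) for px, py in traj]
    print("closest and farthest distance from the origin:", min(radii), max(radii))
    print("energy drift:", energy(traj[-1], v_end) - energy(x0, v0))
```

With these initial conditions the analytic orbit is an ellipse with one focus at the origin whose closest approach is about 0.47 and whose farthest distance is 1, so the printed radii give a quick sanity check. For a molecular force field the same loop would apply with `force` replaced by the negative gradient of the full potential, which is where the cost and the sensitivity discussed in the text arise.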
Exercises

1. Palindromic Sequences. Call a single strand of DNA a palindromic sequence if it is the same as the complementary strand read backwards.

(i) Given a strand, how would you determine whether or not it is a palindromic sequence?
(ii) Give an algorithm that finds the longest subsequence that is palindromic.

2. Counting strings. A double-strand of DNA has no preferred direction, but we can orient it so one direction is forward and the other is backward. In either direction, we read the strand in the 5′ to 3′ direction, as usual. Call two linear or cyclic pieces of double-stranded DNA the same if they can be oriented so we read the same string of nucleotides in the two forward directions.

(i) How many different linear pieces of double-stranded DNA of length $n$ are there?
(ii) How many different cyclic pieces of double-stranded DNA of length $n$ are there?

[Beware of palindromic sequences.]

3. Amino Acids. Draw the graph whose nodes are the acyclic amino acids that has an arc connecting two nodes iff one amino acid can be obtained from the other by the replacement or addition of a single atom.

(i) Is the graph connected?
(ii) Does every connected component have a path that passes through every node exactly once?

4. Lattices. The arrangement of atoms in a folded protein is often compared to that in a crystal lattice. Sketch two such lattices by drawing the atoms as points and connecting neighboring atoms by straight edges.

(i) The face-centered cubic (or FCC) lattice consisting of all points $x$ with integer coordinates whose sum is even: $x = (x_1, x_2, x_3) \in \mathbb{Z}^3$ such that $x_1 + x_2 + x_3 \equiv 0 \pmod 2$.
(ii) The body-centered cubic (or BCC) lattice consisting of all points $x$ with all even or all odd integer coordinates: $x = (x_1, x_2, x_3) \in \mathbb{Z}^3$ such that $x_1 \equiv x_2 \equiv x_3 \equiv 0 \pmod 2$ or $x_1 \equiv x_2 \equiv x_3 \equiv 1 \pmod 2$.

5. Structure Repositories. Descriptions of protein structures are publicly available at the Protein Data Bank (www.rcsb.org/pdb) and the Swiss Bioinformatics Center (expasy.hcuge.ch).

(i) Download a PDB file from either database and extract the string of single-letter abbreviations describing the amino acid sequence.
(ii) Is the relative frequency of amino acids you observe related to the relative number of codons that encode them?

6. Ramachandran Plot. Download a PDB file and extract the sequence of $\varphi$ and $\psi$ angles along the backbone. Draw the result in the form of a Ramachandran plot.

7. Regular Tetrahedron. A regular tetrahedron has four equilateral triangles as faces, which meet along six equally long edges.

(i) Determine the dihedral angle formed by two faces meeting along a common edge.
(ii) Determine the solid angle formed by three faces meeting at a common vertex.

[By convention, the full dihedral angle is $2\pi$, which is the length of the unit circle, and the full solid angle is $4\pi$, which is the area of the unit sphere.]

8. Elliptic Trajectory. Let the energy potential $V : \mathbb{R}^3 \to \mathbb{R}$ be defined by $V(x) = -1/\|x\|$. The force it exerts on a point $x$ is $F(x) = -\nabla V(x) = -x/\|x\|^3$. Prove that the generic trajectory in this force field is an ellipse with one focus at the origin.
Chapter II

Geometric Models

A surprising finding in the research on proteins is the importance of geometric shape in their functioning. By and large, the shape seems to determine how proteins interact with each other and with other molecules. This finding is usually expressed as a causal chain of responsibilities:

    SEQUENCE → SHAPE → FUNCTION

A protein is a peptide chain of amino acids that folds up and forms a shape. In a natural environment, like proteins fold up to the same shapes, but this might be a result of evolutionary selection. The details of that shape in terms of its cavities, protrusions, dynamics, and energetics determine how it interacts with other molecules.

At the current stage of our biological knowledge, there is an overwhelming accumulation of sequence information, which is due, in part, to the near completion of several large-scale genome projects. Although the number of proteins for which the three-dimensional structure has been resolved and is stored in the Protein Data Bank is in the thousands, this is only a small fraction of the wealth of available sequence information. The goal of studying the geometry of proteins is therefore two-fold: developing new computational tools to help determine or refine structure information, and understanding the relationship between shape and function.

In this chapter, we introduce some of the basic geometric models useful in representing molecular shape. In Chapter I, we have seen the view of the bio-chemist, who aims at pruning the immense variety by limiting attention to physically or chemically likely configurations. The rest of this book takes a complementary view by concentrating on mathematical models and computational data structures that arise in the study of proteins. In Section II.1, we introduce space-filling diagrams as the primary geometric model of molecules. In Section II.2, we use Voronoi diagrams to decompose space-filling diagrams, and in doing so, we develop a language suitable for studying details of our models. In Section II.3, we introduce alpha shapes, which are dual to space-filling diagrams and are our preferred computational representation. Finally, in Section II.4, we talk about the Alpha Shape software and discuss how it can be used.

II.1 Space-filling Diagrams
II.2 Power Diagrams
II.3 Alpha Shapes
II.4 Alpha Shape Software
Exercises
II.1 Space-filling Diagrams

ter of the circle thus traces out a curve at distance away
obtained by growing every disk

from the boundary. This curve is the boundary of
to radius .

A space-filling diagram associates a molecule with a por-
tion of the three-dimensional space it occupies. The tacit The construction is illustrated in Figure II.2. The front of
assumption in constructing such a diagram is that the loca-
tions of the atoms in three-dimensional space are known.
An atom is represented by a ball (a solid sphere) and a
molecule is the union of balls of its atoms. We study such
unions first in the plane and then in space.

Union of disks. Let be a finite set of disks in the Eu-

we denote as . We specify each

clidean plane, which

and its radius
disk
by its center
. An example is shown in Figure II.1. The union
Figure II.2: On the outside, the boundary of the union of uni-

formly grown disks, and on the inside the rounded boundary of
the original union.
the rolling circle describes the rounded boundary, which

consists of convex and reflex circular arcs. More formally,
this new curve is the boundary of the portion of
that
is not covered by any placement of the open disk bounded
by the rolling circle. We can imagine creating that portion
with a milling machine whose material removing stylus
Figure II.1: Union of disks in the plane. Four of the eight disks has the shape of the rolling circle.
contribute two arcs each to the boundary.
We note that the rounded boundary of is by and
of the disks, , has a boundary that consists of circular large tangent continuous but can have cusps at places
where the rolling circle cannot quite squeeze through two
arcs meeting at common vertices. It is also possible that
an arc is an entire circle, which has no endpoints. A single disks. There are no cusps in Figure II.2, but there would be

disk can contribute any non-negative number of arcs. if the two disks to the lower left were just a little smaller.
In cases where tangent continuity is important, we may

The total number of arcs is however rather limited. If there
turn the cusps into crossings by adding arcs connecting

are disks whose union is a simply connected region, as
in Figure II.1, then the number of arcs cannot exceed .
the cusps. We thus obtain a tangent continuous immersion
of a curve in .
arcs. Hints towards proving the up-

Even if we allow more general configurations, we cannot
get more than @
upper bound is a consequence of

per bound can be found among the exercises at the end
of this chapter. The Union of balls. Let now be a finite set of balls (solid
the relationship between arcs in the boundary of the union
spheres) in three-dimensional Euclidean space, which we
we
denote as . Similar to the two-dimensional case,
and angles in the Delaunay triangulation, which will be
explained in Section II.2.
specify each ball

by its center and
its radius . Figure II.3 shows the union of balls that
represent gramicidin, which is a small protein of barely
Rolling circle. We can make the boundary of the disk more than 300 atoms. To understand the structure of the
union smoother by substituting blending curves for the boundary of the union, , we study the portion con-

vertices where the circular arcs meet. To this end we tributed by a single sphere. The sphere bounding in-
roll a circle of radius on the outside about the bound- tersects the other balls in a finite collection of caps. The
ary. At any moment during the motion, the circle touches interior of each cap lies in the interior of the union, and
the boundary but never intersects the interior. The cen- the portion of the sphere not covered by any cap is the
cally tight. However, the numbers for well packed sets of

spheres, which are common for proteins, are much smaller
and typically only a constant times .

Rolling sphere. We can again get a smoother bound-
ary by rolling a sphere of radius about
. The cen-
ter of that sphere moves along the boundary of the union
of grown balls, , and its front sweeps out blend-
ing surfaces that cover cusps and crevices of the origi-
nal boundary. Figure II.4 shows such a rounded surface
spheres in Figure II.3 have radii

representation of gramicidin. Relative to that surface, the
. There are convex
sphere patches that correspond to faces of
, reflex
torus patches that correspond to arcs of
, and reflex
sphere patches that correspond to vertices of
. The
Figure II.3: A union of balls representation of the gramicidin union of convex patches is sometimes referred to as the
contact surface because that is where the rolling sphere
protein.
touches . Similarly, the union of reflex patches (tori
and spheres) is referred to as the re-entrant surface. When
we look carefully, we can detect a self-intersection of the
contribution of the sphere to the boundary of the union. surface in Figure II.4. There is a hole whose rounded sur-
The caps form the same structure as the disks discussed face penetrates through the outer surface roughly in the

earlier, only that they live on a (two-dimensional) sphere middle of the picture. This happens because the tunnel
instead of . The structural description of a finite union connecting the hole to the outside is slightly too narrow
of balls is thus recursive in the dimension. The same type for the rolling sphere to squeeze through.
of symmetry can also be observed in dimensions beyond
three.
and vertices in the boundary of a

The number of arcs

union of balls in can be quite a bit higher than the
same numbers for a union of disks in . To count the

faces, arcs and vertices, we first note that a single sphere
intersects the other balls in fewer than caps. By analogy

to disks in the plane, the number of arcs in the bound-
ary of the union of caps is less than . Since each arc
has at most two endpoints (if it is a full circle then it has
no endpoints) and each endpoint belongs to two arcs, we also have no more than a constant times $n$ vertices, where $n$ is the number of balls. To count the faces contributed by our sphere, we recall that these are the connected components of the complement of the union of caps. We will see that these components are related to the triangles of the Delaunay triangulation, which implies that the number of faces on this one sphere is also at most a constant times $n$. To get bounds on the total number of faces, arcs and vertices, we multiply by $n$ and note that each arc belongs to at least two and each vertex belongs to at least three spheres. We conclude that the numbers of faces, arcs, and vertices are each at most a constant times $n^2$. It can be shown that for each value of $n$, there are configurations of $n$ balls with at least some constant times $n^2$ faces, edges and vertices. This shows that the upper bounds are asymptotically tight.

Figure II.4: A molecular surface representation of the gramicidin protein.

In the application of space-filling diagrams to biology, the radii of the balls are usually the van der Waals radii of the atoms, and the boundary of the union of these balls is referred to as the van der Waals surface. The radius of the rolling sphere is chosen so that it approximates a water molecule, and the boundary of the union of the balls grown by that radius is referred to as the solvent accessible surface. The rounded surface is usually referred to as the molecular surface.

Uniform growth. The boundary of the union of the original balls and that of the union of the grown balls do not necessarily have the same combinatorial structure. We can understand structural changes by observing how they are introduced while we continuously grow the balls. Each face of the boundary sweeps out a (three-dimensional) cell in $\mathbb{R}^3$, each arc sweeps out a (two-dimensional) membrane separating two cells, and each vertex sweeps out a curved edge in the common boundary of generically three membranes and three cells.

We describe the same complex as a Voronoi diagram of the set of points $z_i$ with weights $r_i$, where $z_i$ is the center and $r_i$ the radius of the ball $b_i$. Define the weighted distance of a point $x$ from $b_i$ equal to the Euclidean distance minus the weight: $d_i(x) = \|x - z_i\| - r_i$. The Voronoi cell of $b_i$ is the set of points at least as close to $b_i$ as to any other weighted point,

$$V_i = \{ x \in \mathbb{R}^3 \mid d_i(x) \leq d_j(x), \ \forall j \}.$$

Figure II.5 illustrates the definition in two dimensions.

Figure II.5: Two-dimensional Voronoi diagram generated by uniformly growing the disks.

Consider the case of two weighted points, $b_i$ and $b_j$, and let $V_{ij}$ be the set of points $x$ with $d_i(x) \leq d_j(x)$. If one ball is contained in the interior of the other then its cell is empty. Otherwise, we have two non-empty cells separated by a two-dimensional membrane. The points $x$ of this membrane satisfy

$$\|x - z_i\| - \|x - z_j\| = r_i - r_j,$$

which is the equation of one sheet of a two-sheeted hyperboloid. Observe that for every point $x \in V_{ij}$, the line segment connecting $x$ and $z_i$ lies entirely in $V_{ij}$. In geometry, this property is expressed by saying that $V_{ij}$ is star-shaped and that $z_i$ lies in its kernel. Since $V_i$ is the common intersection of the $V_{ij}$, $j \neq i$, this implies that $V_i$ is also star-shaped and that $z_i$ lies also in its kernel. It follows in particular that $V_i$ is a connected cell.

Since the membranes bounding the $V_{ij}$ are all sheets of two-sheeted hyperboloids, the boundary of $V_i$ consists of patches of such hyperboloids. All these patches are visible in their entirety if viewed from $z_i$.

We get the boundary of the union by drawing the sphere bounding each ball $b_i$ only inside its own Voronoi cell, which is $V_i$. By construction, the arcs of the patches meet up in pairs along the membranes and in triplets along the curved edges of the Voronoi diagram. The same is true after growing all radii by any common amount. We can now see how structural differences between the union of the original balls and the union of the grown balls arise: when we grow the balls, the boundary of the union sweeps out the Voronoi diagram, and we get a structural re-arrangement whenever we sweep over a vertex of the Voronoi diagram.

Bibliographic notes. Space-filling diagrams have a long tradition in biochemistry and are similar to the CPK mechanical models named after Corey, Pauling and Koltun [5, chapter 1]. The variations of these models discussed in this section were introduced by Lee and Richards [6, 7]. The molecular surface is sometimes referred to as the Connolly surface, named after Michael Connolly who wrote early software constructing this surface [3]. The solvent accessible surface in Figure II.3 and the molecular surface in Figure II.4 are computed using the software described in [1].

Increasing all radii of a set of circles or spheres continuously and at the same rate is referred to as the Johnson-Mehl model of growth [4]. It leads to the Voronoi diagram of this section, which is sometimes referred to as the additively weighted Voronoi diagram. We refer to Aurenhammer [2] for a survey of Voronoi diagrams, their algorithms and applications. An algorithm that computes cells of the additively weighted Voronoi diagram in $\mathbb{R}^3$ has been developed and implemented by Will [8].

[1] N. AKKIRAJU, H. EDELSBRUNNER, P. FU AND J. QIAN. Viewing geometric protein structures from inside a CAVE. IEEE Comput. Graphics Appl. 16 (1996), 58–61.
[2] F. AURENHAMMER. Voronoi diagrams: a survey of a fundamental geometric data structure. ACM Comput. Surveys 23 (1991), 345–405.
[3] M. L. CONNOLLY. Analytic molecular surface calculation. J. Appl. Crystallogr. 16 (1983), 548–558.
[4] W. A. JOHNSON AND R. F. MEHL. Reaction kinetics in processes of nucleation and growth. Trans. Am. Inst. Mining Metall. AIMME 135 (1939), 416–458.
[5] A. R. LEACH. Molecular Modeling. Principles and Applications. Longman, Harlow, England, 1996.
[6] B. LEE AND F. M. RICHARDS. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55 (1971), 379–400.
[7] F. M. RICHARDS. Areas, volumes, packing and protein structures. Ann. Rev. Biophys. Bioeng. 6 (1977), 151–176.
[8] H.-M. WILL. Computation of Additively Weighted Voronoi Cells for Applications in Molecular Biology. Diss. ETH 13188, ETH Zürich, Switzerland, 1999.
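The weighted distance of Section II.1 (Euclidean distance minus radius) can be illustrated with a minimal Python sketch. The function names and the two example balls are assumptions made for illustration only; they are not part of the text or of any software it mentions. The sketch checks numerically that growing all radii by the same amount leaves the assignment of points to Voronoi cells unchanged.

```python
import math

def weighted_distance(x, ball):
    """Euclidean distance from x to the ball center minus the ball radius."""
    center, radius = ball
    return math.dist(x, center) - radius

def closest_ball(x, balls):
    """Index of a ball whose additively weighted Voronoi cell contains x."""
    return min(range(len(balls)), key=lambda i: weighted_distance(x, balls[i]))

def grow_uniformly(balls, r):
    """Increase every radius by the same amount r (uniform growth)."""
    return [(center, radius + r) for center, radius in balls]

# Hypothetical example: two balls given as (center, radius).
balls = [((0.0, 0.0, 0.0), 1.0), ((4.0, 0.0, 0.0), 2.0)]
points = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]

cells_before = [closest_ball(p, balls) for p in points]
# 1.4 is the water radius increment (in Angstrom) used in Section II.4.
cells_after = [closest_ball(p, grow_uniformly(balls, 1.4)) for p in points]
```

Both lists agree because every weighted distance decreases by the same amount, so the comparison between any two balls is unaffected.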
II.2 Power Diagrams

If we grow the square radii of a finite collection of spheres or balls, we get a decomposition of space into convex polyhedra. This decomposition is known as the power diagram and has a variety of applications in molecular modeling.

Growing square radii. As in Section II.1, we let $B$ be a finite set of balls $b_i = (z_i, r_i)$. The square of the radius, $r_i^2$, is sometimes referred to as the weight of the point $z_i$. We grow each ball to radius $\sqrt{r_i^2 + t}$ at time $t$. The set of balls at time $t$ is denoted as $B(t)$. The Taylor series expansion of the radius as a function of time is

$$\sqrt{r_i^2 + t} = r_i + \frac{t}{2 r_i} - \frac{t^2}{8 r_i^3} + \ldots$$

The first order approximation of the growth is one half the inverse of the radius. Hence, larger balls grow slower than smaller ones. Of course, smaller balls never really catch up except in the limit:

$$\lim_{t \to \infty} \frac{\sqrt{r_i^2 + t}}{\sqrt{r_j^2 + t}} = 1.$$

We are interested in the surface swept out by the intersection of the spheres bounding $b_i$ and $b_j$ and claim it is a plane. The points $x$ that belong to both spheres at time $t$ satisfy

$$\|x - z_i\|^2 - r_i^2 - t = 0, \qquad \|x - z_j\|^2 - r_j^2 - t = 0.$$

Varying $t$ has the same effect as dropping the requirement that the two expressions vanish. Instead we just require that they both be equal, so we get

$$\|x - z_i\|^2 - r_i^2 = \|x - z_j\|^2 - r_j^2,$$
$$2 \langle x, z_j - z_i \rangle = \|z_j\|^2 - r_j^2 - \|z_i\|^2 + r_i^2.$$

We see the circle at which the two spheres intersect sweeps out a plane. It follows that the membranes swept out by the arcs of the boundary of $\bigcup B(t)$ are pieces of planes.

Power distance. We can describe the decomposition of space implied by the square radius growth model as a Voronoi diagram for yet another weighted distance function. The appropriate function in this case is the power distance of a point $x$ from a ball $b_i$ defined as the square distance from the center minus the weight, $\pi_i(x) = \|x - z_i\|^2 - r_i^2$. We have

$$\pi_i(x) \begin{cases} < 0 & \text{if } x \text{ lies inside } b_i, \\ = 0 & \text{if } x \text{ lies on the boundary of } b_i, \\ > 0 & \text{if } x \text{ lies outside } b_i. \end{cases}$$

If $x$ lies outside $b_i$, the power distance of $x$ is the square length of a tangent line segment from $x$ to the bounding sphere. Using the same algebraic manipulations as above, we can show that the set of points with equal power distance from two balls forms a plane. The two planes are indeed the same. As indicated in Figure II.6, this plane may separate the two bounding spheres, intersect both, or lie on the same side of both. Think of the three configurations as snap-shots in an animation in which the center of the small circle moves towards the center of the large circle. At first, the line moves in the same direction but then comes to a halt and reverses its direction moving away from the center of the large circle.

Figure II.6: The line of equal power distance separates the two circles if they are disjoint and not nested, it passes through their intersection if that is non-empty, and it passes outside if the two circles are nested.

Power diagram. The power or (weighted) Voronoi cell of a ball $b_i$ under the power distance is the set of points at least as close to $b_i$ as to any other ball,

$$V_i = \{ x \in \mathbb{R}^3 \mid \pi_i(x) \leq \pi_j(x), \ \forall j \}.$$

If we denote by $H_{ij}$ the set of points whose power distance from $b_i$ is at most as large as the power distance from $b_j$ then $V_i = \bigcap_j H_{ij}$. In words, $V_i$ is the intersection of a finite number of half-spaces and thus a convex polyhedron. This polyhedron may be bounded or unbounded, and it is even possible that it is empty. The power or (weighted) Voronoi diagram of $B$ is the collection of cells $V_i$ together with the polygons, edges, and vertices shared by the cells. Every polygon is shared by two cells, and in the generic case every edge is shared by exactly three and every vertex is shared by exactly four cells. Figure II.7 illustrates the definitions in two dimensions by showing the Voronoi diagram of the same eight disks used in earlier figures.
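As a small illustration of the power distance and of the claim that growing the square radii does not move the plane of equal power distance, here is a minimal Python sketch. The helper names, the two example balls and the test point are assumptions chosen for illustration; they are not taken from the text.

```python
import math

def power_distance(x, ball):
    """Square distance from the ball center minus the weight (square radius)."""
    center, radius = ball
    return sum((a - c) ** 2 for a, c in zip(x, center)) - radius ** 2

def grow_square_radius(ball, t):
    """Add t to the square radius; assumes radius**2 + t >= 0."""
    center, radius = ball
    return (center, math.sqrt(radius ** 2 + t))

def power_cell(x, balls):
    """Index of a ball whose power cell contains x."""
    return min(range(len(balls)), key=lambda i: power_distance(x, balls[i]))

# Hypothetical example balls, given as (center, radius).
b0 = ((0.0, 0.0, 0.0), 1.0)
b1 = ((4.0, 0.0, 0.0), 2.0)

# For these two balls the plane of equal power distance is 8 * x1 = 13.
p = (1.625, 5.0, -2.0)
gap_before = power_distance(p, b0) - power_distance(p, b1)

# Growing both square radii by t = 3 subtracts t from both power distances.
g0, g1 = grow_square_radius(b0, 3.0), grow_square_radius(b1, 3.0)
gap_after = power_distance(p, g0) - power_distance(p, g1)
```

Both gaps are zero up to rounding, so the point `p` stays on the separating plane while the balls grow.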
Figure II.7: Power or weighted Voronoi diagram of eight disks in the plane.

Delaunay triangulation. The (weighted) Delaunay triangulation of $B$ is dual to the (weighted) Voronoi diagram. It is obtained by connecting $z_i$ and $z_j$ by an edge if the cells $V_i$ and $V_j$ share a common polygon. Similarly, $z_i$, $z_j$ and $z_k$ are connected by a triangle if $V_i$, $V_j$ and $V_k$ share a common edge, and $z_i$, $z_j$, $z_k$ and $z_\ell$ are connected by a tetrahedron if $V_i$, $V_j$, $V_k$ and $V_\ell$ share a common vertex. Assuming the balls in $B$ are in general position, this exhausts all possible types of overlap among the Voronoi cells. Since complexes of tetrahedra are difficult to draw, we illustrate the definitions by showing a two-dimensional Delaunay triangulation in Figure II.8. If the balls are not in general position, we can perturb them ever so slightly to move them into general position.

Figure II.8: Delaunay triangulation drawn over the dual Voronoi diagram of eight disks in the plane. The Delaunay triangles are transparent so they do not obstruct the structure of the Voronoi diagram underneath.

Observe that we reverse dimensions when we go from the Voronoi diagram to the Delaunay triangulation: cells become vertices, polygons become edges, edges become triangles, and vertices become tetrahedra. Similarly, we reverse the inclusion direction. For example, a Voronoi polygon belongs to a Voronoi cell iff the corresponding Delaunay edge contains the corresponding Delaunay vertex.

Number of simplices. We refer to an element of a Delaunay triangulation as a simplex, which can be a vertex, an edge, a triangle or a tetrahedron. We can count the simplices using the Euler relation, which says that the alternating sum of simplices is always equal to 1. Writing $v$, $e$, $f$ and $t$ for the numbers of vertices, edges, triangles and tetrahedra, we have

$$v - e + f - t = 1.$$

Before counting the simplices in three dimensions, let us warm up to the challenge by counting the simplices of a two-dimensional Delaunay triangulation. The Euler relation here is $v - e + f = 1$. Observe that every triangle has three edges and every edge belongs to at most two triangles, hence $3f \leq 2e$. Combining this inequality with the Euler relation implies $e \leq 3v - 3$ and $f \leq 2v - 2$. The number of vertices is at most the number of disks, hence $v \leq n$, $e \leq 3n - 3$, and $f \leq 2n - 2$.

In three dimensions, we note that each tetrahedron has four triangles and each triangle belongs to at most two tetrahedra, hence $2t \leq f$. Combining this with the Euler relation implies $t \leq e - v + 1$ and $f \leq 2e - 2v + 2$. The number of vertices is at most the number of balls, $v \leq n$, and the number of edges is at most the number of pairs of vertices, $e \leq \binom{n}{2}$. Hence

$$f \leq n^2 - 3n + 2 \quad \text{and} \quad t \leq \frac{n^2 - 3n + 2}{2}.$$

There are Delaunay triangulations that have almost this many simplices, but they require a placement of the balls that would be rather unlike the configurations we observe for proteins. Typically, each atom is surrounded by its neighbors in the Delaunay triangulation. The neighbors are near the central atom and are therefore packed in a small amount of space, implying there can only be a small constant number of them. It follows that the number of edges in the Delaunay triangulation is at most some constant times $n$, and as a consequence, also the numbers of triangles and tetrahedra are at most some constant times $n$.
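The counting argument based on the Euler relation can be turned into a few lines of Python. This is only a sketch under assumptions: the symbols $v$, $e$, $f$, $t$ for the numbers of vertices, edges, triangles and tetrahedra are chosen here for illustration, and the constants are the ones that follow from the two relations stated in the text (Euler relation, and each triangle or tetrahedron sharing its faces with at most one neighbor).

```python
def delaunay_bounds_2d(n):
    """Worst-case counts (v, e, f) for n disks.

    Uses v - e + f = 1 and 3f <= 2e, which give e <= 3v - 3 and f <= 2v - 2.
    """
    v = n
    e = 3 * v - 3
    f = 2 * v - 2
    return v, e, f

def delaunay_bounds_3d(n):
    """Worst-case counts (v, e, f, t) for n balls.

    Uses v - e + f - t = 1 and 2t <= f, which give t <= e - v + 1 and
    f <= 2(e - v + 1), together with e <= n(n - 1)/2.
    """
    v = n
    e = n * (n - 1) // 2      # every pair of vertices forms an edge
    t = e - v + 1
    f = 2 * t
    return v, e, f, t

bounds_2d = delaunay_bounds_2d(8)   # eight disks, as in the figures
bounds_3d = delaunay_bounds_3d(8)
```

The two-dimensional bounds grow linearly in the number of disks, while the three-dimensional worst case grows quadratically, which is why the packing argument for proteins matters in practice.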
Orthospheres. Suppose for a moment that the balls all have zero radius. Then each Voronoi vertex is equally far from four points and coincides with the center of the circumsphere of these points. We will use the concept of orthogonality to generalize this property to the case where the $b_i$ have not necessarily zero and not necessarily equal radii. Two spheres or balls $b_i = (z_i, r_i)$ and $b_j = (z_j, r_j)$ are orthogonal if

$$\|z_i - z_j\|^2 = r_i^2 + r_j^2.$$

The name is justified because the two tangent planes defined at any point common to the bounding spheres of $b_i$ and $b_j$ form a right angle between them.

Let now $x$ be a vertex of the Voronoi diagram of $B$. Assuming the generic case, $x$ has equal power distance from four balls, $b_i$, $b_j$, $b_k$ and $b_\ell$, and larger power distance from all others. Let $s$ be the sphere with center $x$ and weight $w = \pi_i(x) = \pi_j(x) = \pi_k(x) = \pi_\ell(x)$. Algebraically, there is no difficulty at all if $w$ is negative and the radius $\sqrt{w}$ is therefore imaginary. That sphere is orthogonal to $b_i$, $b_j$, $b_k$ and $b_\ell$, and we refer to it as the orthosphere of the four balls. If the four balls had zero radius, $s$ would be their circumsphere. Note that $s$ is further than orthogonal from all other balls, that is, $\|x - z_m\|^2 > w + r_m^2$ for all $m \neq i, j, k, \ell$. This property can be used to characterize Delaunay tetrahedra for a generic set of balls. Specifically, a tetrahedron connecting points $z_i$, $z_j$, $z_k$ and $z_\ell$ belongs to the Delaunay triangulation of $B$ iff the orthosphere of $b_i$, $b_j$, $b_k$ and $b_\ell$ is further than orthogonal from all other balls in $B$.

Acyclicity. Given a fixed viewpoint, we can order two tetrahedra if one lies in front of the other one, as seen from the viewpoint. We call this the visibility ordering with respect to the given viewpoint. It turns out that this relation can in general have cycles but is acyclic for Delaunay triangulations. We need some notation. Let $x \in \mathbb{R}^3$ be the viewpoint and write $\sigma \prec \tau$ if there is a half-line that emanates from $x$ and passes through the interior of the Delaunay tetrahedron $\sigma$ before it passes through the interior of the Delaunay tetrahedron $\tau$. We use orthospheres to prove that the relation is acyclic.

ACYCLICITY LEMMA. The visibility ordering of the Delaunay tetrahedra with respect to any fixed viewpoint is acyclic.

PROOF. Consider a half-line that emanates from $x$ and passes through the interiors of $\sigma$ and $\tau$. We may assume that it does not intersect any edge of the Delaunay triangulation. The half-line passes through a sequence of Delaunay tetrahedra, $\sigma_0, \sigma_1, \ldots, \sigma_k$, and we have $\sigma = \sigma_p$ and $\tau = \sigma_q$ for some $p < q$. Any two consecutive tetrahedra, $\sigma_i$ and $\sigma_{i+1}$, share a triangle. It follows that the orthospheres of $\sigma_i$ and of $\sigma_{i+1}$ are orthogonal to the three balls whose centers span that triangle. The plane of points with equal power distance from the two orthospheres thus contains the shared triangle. The viewpoint is on $\sigma_i$'s side of that plane, which implies that the power distance of $x$ from the orthosphere of $\sigma_i$ is less than that from the orthosphere of $\sigma_{i+1}$. By transitivity, the power distance of $x$ from the orthosphere of $\sigma_i$ is less than its power distance from the orthosphere of $\sigma_j$, whenever $i < j$, and the same is true for $\sigma$ and $\tau$. In other words, the power distance increases along chains of the relation $\prec$. Since real numbers are totally ordered, we conclude that $\prec$ is acyclic.

Bibliographic notes. Power diagrams of discrete sets of weighted points were studied by Carl Friedrich Gauss more than 150 years ago in the context of quadratic forms [6]. In reference to subsequent work by Dirichlet [3] and Voronoi [8], these diagrams are often referred to as Dirichlet tessellations or Voronoi diagrams. The dual triangulations were introduced considerably later by Boris Delaunay (also Delone) [2]. It is common to reserve the name Delaunay triangulation for unweighted points and to refer to the duals of power diagrams as regular triangulations [1] or coherent triangulations [7]. We prefer to be economical with terms and refer to them as (weighted) Delaunay triangulations. Algorithms for constructing weighted Delaunay triangulations in $\mathbb{R}^2$ and $\mathbb{R}^3$ are discussed in [4, Chapters I and V]. That reference also explains how to computationally cope with ambiguities in the construction caused by non-generic input sets. Upper bounds on the number of Delaunay simplices for “well-spaced” points in $\mathbb{R}^3$ can be found in [5].

[1] L. J. BILLERA AND B. STURMFELS. Fiber polytopes. Ann. Math. 135 (1992), 527–549.
[2] B. DELAUNAY. Sur la sphère vide. Izv. Akad. Nauk SSSR, Otdelenie Matematicheskikh i Estestvennykh Nauk 7 (1934), 793–800.
[3] P. G. L. DIRICHLET. Über die Reduktion der positiven quadratischen Formen mit drei unbestimmten ganzen Zahlen. J. Reine Angew. Math. 40 (1850), 209–227.
[4] H. EDELSBRUNNER. Geometry and Topology for Mesh Generation. Cambridge Univ. Press, England, 2001.
[5] J. ERICKSON. Dense point sets have sparse Delaunay triangulations. In “Proc. 13th Ann. ACM-SIAM Sympos. Discrete Alg., 2002”, 125–134.
[6] C. F. GAUSS. Recension der Untersuchungen über die Eigenschaften der positiven ternären quadratischen Formen von Ludwig August Seeber. J. Reine Angew. Math. 20 (1840), 312–320.
[7] I. M. GELFAND, M. M. KAPRANOV AND A. V. ZELEVINSKY. Discriminants, Resultants and Multidimensional Determinants. Birkhäuser, Boston, 1994.
[8] G. VORONOI. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. J. Reine Angew. Math. 133 (1907), 97–178, and 134 (1908), 198–287.
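The orthosphere construction of Section II.2 can be tried out in the plane, where three disks determine an orthocircle. The following Python sketch is an illustration under assumptions: the function names and the example disks are hypothetical, and the center is found by solving the two linear equations that express equal power distance from the three disks.

```python
def orthocircle(d0, d1, d2):
    """Center x and weight w of the circle orthogonal to three disks.

    Each disk is ((cx, cy), r). The weight w is the square radius of the
    orthocircle and may be negative (imaginary radius).
    """
    (z0, r0), (z1, r1), (z2, r2) = d0, d1, d2

    def lift(z, r):
        return z[0] ** 2 + z[1] ** 2 - r ** 2

    # Equal power distance from d0 and dj gives one linear equation each:
    # 2 <x, zj - z0> = (|zj|^2 - rj^2) - (|z0|^2 - r0^2)
    a11, a12 = 2 * (z1[0] - z0[0]), 2 * (z1[1] - z0[1])
    a21, a22 = 2 * (z2[0] - z0[0]), 2 * (z2[1] - z0[1])
    c1 = lift(z1, r1) - lift(z0, r0)
    c2 = lift(z2, r2) - lift(z0, r0)
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("disk centers are collinear")
    x = ((c1 * a22 - a12 * c2) / det, (a11 * c2 - c1 * a21) / det)
    w = (x[0] - z0[0]) ** 2 + (x[1] - z0[1]) ** 2 - r0 ** 2
    return x, w

def orthogonality(x, w, disk):
    """Zero if orthogonal, positive if further, negative if closer than orthogonal."""
    z, r = disk
    return (x[0] - z[0]) ** 2 + (x[1] - z[1]) ** 2 - w - r ** 2

# Hypothetical example disks.
d0 = ((0.0, 0.0), 1.0)
d1 = ((4.0, 0.0), 2.0)
d2 = ((0.0, 3.0), 1.0)
d3 = ((5.0, 5.0), 1.0)

x, w = orthocircle(d0, d1, d2)
keeps_triangle = orthogonality(x, w, d3) > 0
```

In this example the fourth disk is further than orthogonal from the orthocircle, so it does not prevent the triangle spanned by the first three centers from being Delaunay.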
II.3 Alpha Shapes

Recall that the Delaunay triangulation is the dual of the Voronoi diagram. In this section, we generalize this construction and consider the dual of the Voronoi diagram restricted to within the union of the defining balls.

Dual complex. Observe that the Voronoi cells decompose the union of balls in $B$ into convex cells $R_i = b_i \cap V_i$. Let $X$ be a subset of the index set. The dual complex records the non-empty common intersections among these cells,

$$K = \{ \sigma_X \mid \bigcap_{i \in X} R_i \neq \emptyset \},$$

where $\sigma_X$ is the convex hull of the centers of the balls with index in $X$. Equivalently, $\sigma_X \in K$ iff the common intersection of Voronoi cells has a non-empty intersection with the union of balls: $\bigcup B \cap \bigcap_{i \in X} V_i \neq \emptyset$. Note that this is just a more formal way of explaining the duality transformation we used in the last section to construct the Delaunay triangulation from the Voronoi diagram. The underlying space is the set of points contained in simplices of $K$. In this context, we refer to it as the dual shape of $B$. Figure II.9 illustrates the definition for the set of disks used in many of the previous figures.

Figure II.9: The dual complex is drawn on top of the Voronoi decomposition of the union of disks. The nine edges correspond to the pairwise intersections and the two triangles to the triple-wise intersections of the clipped Voronoi cells.

In the special case, in which the balls have non-empty pairwise but no non-empty triple-wise intersections, $K$ looks like the ball-and-stick diagram common in chemistry and biology. There, each stick represents a covalent bond, while here, it represents the geometric overlap between two balls.

Independence. Recall that a simplex belongs to the dual complex iff the corresponding clipped balls (the $R_i$) have a non-empty common intersection. This condition has an interesting consequence on how the $b_i$ themselves may intersect. In a nut-shell, there can be at most four balls (one more than the dimension of the space), and they can form only one combinatorially distinct intersection pattern. We first discuss this pattern for general sets that are not necessarily balls. Call a collection $A$ of sets independent if for every subcollection $A' \subseteq A$ there is a point inside every set in $A'$ and outside every set not in $A'$:

$$\bigcap A' - \bigcup (A - A') \neq \emptyset.$$

A collection of size $k$ has $2^k$ subcollections. For this collection to be independent, there must be $2^k$ points whose patterns of inclusion in the $k$ sets are pairwise different. We use the pigeonhole principle to show that the maximum number of independent disks in the plane is three. Let $m_k$ be the maximum number of regions we can get by drawing $k$ circles in the plane. We have $m_1 = 2$ and $m_{k+1} \leq m_k + 2k$ because the $(k+1)$-st circle intersects the other $k$ circles in at most two points each. These points cut the $(k+1)$-st circle into at most $2k$ arcs, and each arc cuts at most one region into two. The number of regions is therefore

$$m_k \leq 2 + \sum_{i=1}^{k-1} 2i = k^2 - k + 2.$$

Hence, $m_4 \leq 14 < 16 = 2^4$, which implies that at most three disks can be independent. For each $k \leq 3$ there is a (combinatorially) unique independent configuration shown in Figure II.10. The same argument also works in three dimensions, where it can be used to show that the maximum number of independent balls is four. Again, there is only one possible intersection pattern for four independent balls.

Figure II.10: The independent configurations of one, two, and three disks in the plane.
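The pigeonhole argument for disks can be checked with a few lines of Python. This is a sketch with assumed function names: it iterates the recurrence in which each new circle is cut into at most twice as many arcs as there are earlier circles, and compares the resulting number of regions with the number of subcollections that an independent collection would have to realize.

```python
def max_regions(k):
    """Upper bound on the number of regions formed by k >= 1 circles.

    One circle gives 2 regions; the (j+1)-st circle is cut into at most
    2j arcs, and each arc splits at most one region into two.
    """
    regions = 2
    for j in range(1, k):
        regions += 2 * j
    return regions

def max_independent_disks(limit=10):
    """Largest k <= limit for which k circles can still form 2**k regions."""
    return max(k for k in range(1, limit + 1) if max_regions(k) >= 2 ** k)

bounds = [max_regions(k) for k in range(1, 5)]
largest = max_independent_disks()
```

The bound agrees with the closed form $k^2 - k + 2$, and four circles give at most 14 regions, which is fewer than the 16 inclusion patterns four independent disks would need.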

Independent simplices. Recall that each simplex in the Delaunay triangulation is spanned by the centers of a small collection of balls, four for a tetrahedron, three for a triangle, and so on. In discussions of combinatorial properties, we sometimes forget the difference and think of the simplex as this collection of balls. In this spirit, we call the simplex independent if the collection of balls is independent. We will prove shortly that all simplices in the dual complex are independent. This is a fairly strong statement since it limits the balls to a single intersection pattern. The following lemma is the key to proving that all simplices in the dual complex are independent. The lemma holds in any dimension, and can be proved by induction over the dimension. To avoid the complications of a discussion for general dimensions, we assume the lemma for disks (or rather for caps on a sphere) and prove it for balls in $\mathbb{R}^3$.

INDEPENDENCE LEMMA. A collection $B$ of four balls in $\mathbb{R}^3$ is independent iff the (unique) vertex $x$ of the corresponding Voronoi diagram is contained in the union: $x \in \bigcup B$.

PROOF. Assume first that $x \notin \bigcup B$, so that $x$ lies outside each of the four balls, for example outside the ball $b$. The sphere bounding $b$ intersects the other balls in three caps. The circles bounding these caps lie in the three planes bounding the Voronoi cell of $b$, and because $x$ lies outside $b$, the three caps are not independent. A particular such configuration is illustrated in Figure II.11. So there exists a subset of the other three balls that is not represented by any point on the sphere. It can still be that there is a point outside $b$ that represents this subset, but then there is no such point inside $b$. In other words, $B$ is not independent.

Figure II.11: The planes bounding the Voronoi cell intersect the sphere in three circles. The three planes meet at the Voronoi vertex, and because it lies outside the sphere, the three caps are not independent.

To prove the reverse, we assume that $B$ is not independent. Then the sphere bounding one of the balls, $b$, intersects the other three balls in three non-independent caps. But this implies that the Voronoi vertex lies outside the sphere: $x \notin b$.

As mentioned above, the Independence Lemma also holds for three disks in the plane. Given three balls, we get three disks of maximum size by intersecting them with the plane that passes through the centers. This plane intersects the Voronoi diagram of the balls in the Voronoi diagram of the disks. This implies that three balls are independent iff the (unique) line in the corresponding Voronoi diagram has a non-empty intersection with the union of the three balls. Similarly, two balls are independent iff the (unique) plane in the corresponding Voronoi diagram has a non-empty intersection with their union. But this is exactly the criterion for a simplex to belong to the dual complex. It follows that each simplex in $K$ is independent, as claimed.

Filtration. We return to the idea of growing the balls continuously and watch how the union changes. We let time go from $-\infty$ to $\infty$ and grow the weight of each ball $b_i$ to $r_i^2 + t$ at time $t$. Each $b_i$ has zero weight at time $t = -r_i^2$ and negative weight and therefore imaginary radius before that time. By construction, the Voronoi cells of the balls are unchanged at all times. It follows that the dual complexes that arise throughout time are subcomplexes of one and the same Delaunay triangulation. Furthermore, since the portions of the Voronoi cells covered by the balls can only grow, the dual complexes can also only get larger in time.

Instead of time, we use the square root, $\alpha = \sqrt{t}$, as the index for time varying sets. The main reason for this convention is that for $r_i = 0$, the radius of the ball at time $t = \alpha^2$ is $\alpha$. We need some notation. Let $B_\alpha$ be the collection of balls and $K_\alpha$ the dual complex of $B_\alpha$ at time $t = \alpha^2$. We refer to $K_\alpha$ as the $\alpha$-complex and to its underlying space as the $\alpha$-shape of $B$. For small enough (large enough negative) time, all radii are imaginary, $\bigcup B_\alpha = \emptyset$, and the dual complex is empty. For large enough time, $\bigcup B_\alpha$ covers all Voronoi vertices, and the dual complex is equal to the Delaunay triangulation, which we denote as $D$. We thus have a sequence of complexes that begins with the empty complex and ends with the Delaunay triangulation, $\emptyset \subseteq K_\alpha \subseteq K_\beta \subseteq D$ for every $\alpha^2 \leq \beta^2$. There are only finitely many simplices and therefore only finitely many subcomplexes of $D$ that arise as dual complexes during the growth process. We refer to this sequence as a filtration of the Delaunay triangulation,

$$\emptyset = K^0 \subset K^1 \subset \ldots \subset K^m = D.$$

Figure II.12 illustrates the construction by showing three complexes in the filtration generated by eight disks in the plane. To translate between continuous time and discrete
rank, we define a function $\rho$ such that $\rho(t) = i$ if $K^i$ is the dual complex at time $t$.

Figure II.12: Three unions of disks and the corresponding dual complexes. The first complex contains all vertices but only two edges and no triangles. From the first to the third complex, the edges become thinner and the triangles become lighter.

Ordering simplices. We can sort the Delaunay simplices in the order in which they enter the dual complex. Define the birth-time of a simplex $\sigma$ as the minimum time $t$ such that $\sigma \in K_\alpha$ for all $\alpha^2 \geq t$. The difference between two contiguous complexes in the filtration consists of all simplices whose birth-time coincides with the creation of the second complex.

Often two contiguous complexes $K^i$ and $K^{i+1}$ differ by only one simplex, $\sigma$. In this case, the birth-time of $\sigma$ coincides with the time it becomes independent. Let the orthosphere of $\sigma$ be the smallest sphere orthogonal to all balls whose centers are vertices of $\sigma$. The time $\sigma$ becomes independent is also the time the orthosphere of $\sigma$ dies or shrinks to a point. Geometrically, this case is characterized by a non-empty common intersection between the affine hull of $\sigma$ and the Voronoi cells of its vertices. Sometimes, however, the difference between $K^i$ and $K^{i+1}$ consists of two or more simplices. In the generic case, all these simplices are faces of a single simplex, $\tau$, that also belongs to the difference. All these simplices are born at the same time, $t$. In the absence of any degeneracy, their orthospheres die at different times, with the orthosphere of $\tau$ dying last at time $t$. Figure II.13 illustrates this case. The triangle connecting all three centers and the edge connecting the centers of the two larger disks are born at the same time, namely when all three disks reach the shared Voronoi vertex. This is also the time when the three disks become independent, but the pair of larger disks became independent earlier.

Figure II.13: The two larger disks are independent, but the dual edge does not belong to the dual complex because their common intersection is disjoint from the corresponding Voronoi edge.

We represent the filtration by sorting the Delaunay simplices by birth-time, and in case of a tie by dimension. Remaining ties are broken arbitrarily. Every dual complex $K^i$ is a prefix of this ordering, and because of the tie breaking rule, every prefix is a complex, even if it does not coincide with a dual complex. This property of the ordering will be crucial for the algorithm in Chapter IV that computes the connectivity of the $K^i$.

Bibliographic notes. Alpha shapes and alpha complexes were introduced by Edelsbrunner, Kirkpatrick and Seidel [3] in 1983 for finite sets of points in the plane. About a decade later, the concept was generalized to three dimensions and made available as a software package with graphical user interface [4]. The unexpected popularity of that software in structural biology triggered the development of further geometric concepts useful in structural biology, some of which are explained in this book. The main reason for the popularity is the duality between space-filling diagrams and alpha shapes as explained in this and the two preceding sections. To fully develop that duality, alpha shapes had to be extended to take into account weights, and this has been described in complete generality in [2]. That generalization benefitted from adopting the language of simplicial complexes, which had been developed decades earlier in the area of combinatorial topology [1, 5].

[1] P. S. ALEXANDROV. Combinatorial Topology. Dover, New York, 1998 (republication of translation of the original Russian edition from 1947).
[2] H. EDELSBRUNNER. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
[3] H. EDELSBRUNNER, D. G. KIRKPATRICK AND R. SEIDEL. On the shape of a set of points in the plane. IEEE Trans. Inform. Theory IT-29 (1983), 551–559.
[4] H. EDELSBRUNNER AND E. P. MÜCKE. Three-dimensional alpha shapes. ACM Trans. Graphics 13 (1994), 43–72.
[5] P. J. GIBLIN. Graphs, Surfaces and Homology. Second edition, Chapman and Hall, London, 1981.
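The ordering of simplices described in Section II.3 (sort by birth-time, break ties by dimension) can be sketched in Python as follows. The representation of a simplex as a sorted tuple of vertex indices, the function names, and the birth-times in the example are assumptions made for illustration; in particular, the remaining ties are broken here by vertex indices, which is one arbitrary choice among many.

```python
from itertools import combinations

def filtration_order(birth):
    """Sort simplices by birth-time, then by dimension, then by vertex indices."""
    return sorted(birth, key=lambda s: (birth[s], len(s), s))

def is_complex(simplices):
    """True if every proper non-empty face of every simplex is also present."""
    present = set(simplices)
    return all(face in present
               for s in present
               for k in range(1, len(s))
               for face in combinations(s, k))

# Hypothetical birth-times for three disks in the plane.
birth = {
    (0,): 0.0, (1,): 0.0, (2,): 0.0,
    (0, 1): 1.0, (1, 2): 2.0,
    (0, 2): 3.0, (0, 1, 2): 3.0,   # edge and triangle born at the same time
}

order = filtration_order(birth)
prefixes_ok = all(is_complex(order[:k]) for k in range(len(order) + 1))
```

Because the edge `(0, 2)` precedes the triangle `(0, 1, 2)` in spite of the equal birth-times, every prefix of the ordering is a complex, which is the property the text relies on.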
II.4 Alpha Shape Software

This section introduces the basic Alpha Shape software and explains how to go from a standard description of protein structures to the visualization of their alpha shapes. The discussion is more descriptive and less analytical than in the previous three sections. Given a pdb-file, name.pdb, we take four steps to construct and visualize alpha shapes in an interactive graphical user interface:

    > pdb2alf name.pdb name
    > delcx name
    > mkalf name
    > alvis name

The details of the discussion apply to Version 4.1 of the Alpha Shape software executed on an SGI workstation running under the UNIX operating system and may differ for other versions and platforms.

Data format. The main public source for structural protein data is the Protein Data Bank (pdb) mentioned in Section I.3. Only a fraction of the information is needed to construct alpha shapes. Specifically, for each atom we only need its coordinates in three-dimensional space and its radius. The coordinates are explicitly given in the file, but the radius must be inferred from the atom type. This is done according to published translation tables that map atoms to van der Waals radii. Unfortunately, there is no universally agreed upon table. Some differences are due to different methods used to derive radii, including measurements of closest approach, molecular mechanics calculations, etc. One of the most problematic elements is hydrogen (H), which accounts for almost 50% of the number of atoms found in organic matter. Hydrogen atoms sometimes donate their electrons to complete the shells of other atoms and thus can exist without any shell and radius to speak of. Hydrogen atoms are generally not represented in pdb-files, but can be inferred to some accuracy from the types and relative positions of the other atoms in the protein. In the common unified atom model, the van der Waals radii of larger atoms are adjusted to include the bonded hydrogen atoms.

We can extract the coordinates and the radii using software that is part of the Alpha Shapes distribution. Specifically, we call

    > pdb2alf -r 1.4 name.pdb name

to read name.pdb and create a new file name that contains a line for each atom listing its three coordinates and the van der Waals radius. The -r option allows for the specification of a radius increment that is applied to every atom in the file. In our example, this radius increment is 1.4 Å, which is the most common approximation used for the size of water molecules. The resulting set of balls thus defines the solvent accessible diagram representing the interaction with the surrounding water; see Section II.1.

Delaunay triangulation. The first step towards computing alpha shapes is to construct the Delaunay triangulation of the set of balls. This is accomplished by the command

    > delcx name

The delcx (Delaunay complex) program creates a file name.dt that represents the Delaunay triangulation. The efficient and robust construction of the Delaunay triangulation in $\mathbb{R}^3$ is not entirely straightforward. We briefly mention the algorithmic ingredients used. The basic strategy is incremental, adding one ball at a time to the triangulation. Using an arbitrary ordering of the balls, we write $B_i$ for the set of the first $i$ balls and $D_i$ for the Delaunay triangulation of $B_i$, for $0 \leq i \leq n$. With this notation, the algorithm can be written as follows.

    $D_0 = \emptyset$;
    for $i = 1$ to $n$ do
        $D_i = \mathrm{INSERT}(b_i, D_{i-1})$
    endfor.

The $i$-th ball is inserted through a sequence of flip operations. The flips are performed depending on the outcomes of only two types of primitive tests needed in the construction of the Delaunay triangulation:

ORTHOGONALITY: decide whether a ball is closer or further than orthogonal to the orthosphere of four other balls.
ORIENTATION: decide whether a ball center is on the positive or negative side of the oriented plane spanned by three other ball centers.

Both tests reduce to the sign of the determinant of a small matrix and can be decided without computing intermediate geometric information. The operations are ambiguous if the balls are in non-generic position, and so is the Delaunay triangulation. To cope with the related robustness problem, we use exact arithmetic and simulated perturbation. Exact arithmetic guarantees the correct execution of flips in all generic and therefore unambiguous cases, and
simulated perturbation reduces ambiguous cases in a consistent manner
to unambiguous ones. The use of exact rather than floating-point
arithmetic poses a challenge to the efficiency of the code. A common
remedy is to use so-called floating-point filters: calculate in
floating-point arithmetic, bound the error, and redo the computation
in exact arithmetic if the error is too large to guarantee a correct
decision.

Another challenge to the efficiency of the code is the inherent size
of the Delaunay triangulation. As mentioned in Section II.2, the
Delaunay triangulation in $\mathbb{R}^3$ can have a number of
simplices that is quadratic in $n$. For example, if the centers of the
balls lie on the moment curve $(t, t^2, t^3)$ and all radii are equal,
then every pair of vertices forms an edge in the Delaunay
triangulation, as shown in Figure II.14. Fortunately, the balls of
organic molecules are usually well packed and have Delaunay
triangulations of size at most proportional to $n$. The danger remains
that one of the intermediate triangulations is large. Then we spend a
lot of time constructing that triangulation, only to destroy most of
it before arriving at the final triangulation. This danger is quite
real as systematic enumerations of the data tend to generate
subconfigurations with relatively large Delaunay triangulations. The
remedy here is to add the balls in a random sequence. In other words,
we apply a random permutation to the input sequence and construct the
Delaunay triangulation following this permutation.

Figure II.14: Edge-skeleton of the Delaunay triangulation of twenty
one points on the moment curve in $\mathbb{R}^3$.

Filtration. As explained in Section II.3, dual complexes obtained by
growing the square radii form a nested sequence of subcomplexes of the
Delaunay triangulation, $K_1 \subseteq K_2 \subseteq \cdots \subseteq K_m$.
This is the filtration of $\alpha$-complexes, taken over all values of
$\alpha$. We represent the filtration by the sequence of Delaunay
simplices ordered by birth-time. The sequence is generated by calling

> mkalf name

The alpha shape filtration program reads the Delaunay triangulation in
name.dt and generates a new file, name.alf, that stores the filtration
along with some auxiliary data structures.

The software refers to the sorted sequence of simplices as the
'masterlist'. It stores each simplex $\sigma$ several times, marking
when $\sigma$ is born, when $\sigma$ becomes a face of another
simplex, and when $\sigma$ becomes interior to the alpha complex.
Suppose the three events happen at times $t_1 \le t_2 \le t_3$. Then
$\sigma$ is

  not in $K_\alpha$   if $\alpha < t_1$,
  singular            if $t_1 \le \alpha < t_2$,
  regular             if $t_2 \le \alpha < t_3$,
  interior            if $t_3 \le \alpha$.

The combinatorial topology term for being singular is principal and
means that $\sigma$ is not a face of any other simplex. The simplex is
regular if it belongs to the boundary but is not principal, and it is
interior if it is completely surrounded by other simplices. Some of
the three events may coincide. For example, a tetrahedron is interior
as soon as it is born, so $t_1 = t_2 = t_3$. A simplex in the boundary
of the Delaunay triangulation can never become interior, so
$t_3 = \infty$. Finally, a simplex whose orthosphere dies strictly
before the simplex is born is never singular, so $t_1 = t_2$. The main
reason for recording all this information is to determine how to draw
$\sigma$ in the graphical interface, but there are others. Figure
II.15 shows four alpha complexes of the relatively small gramicidin
protein. In each case, we only show the singular simplices together
with the regular triangles. Given a value of $\alpha$, we need quick
access to the simplices of the various types in $K_\alpha$. For this
purpose, we store the existence intervals in a number of interval
trees. Each such tree stores some number $m$ of intervals in space
O($m$), and for a given moment $\alpha$, it enumerates the $k$
simplices whose intervals contain $\alpha$ in time O($k + \log m$).

Visualization. We finally discuss the visualization interface of the
Alpha Shapes software. The necessary support structures are computed
and the graphics user interface is opened by executing

> alvis name

The alpha shape visualization program uses both the Delaunay
triangulation file, name.dt, and the filtration file, name.alf. The
interface consists of a visualization panel, a scene panel, and a
signature panel. All alpha
complexes are shown in the first but which complex is shown and how it
is shown is decided in the other two panels. The visualized complex is
selected in the signature panel. To support that selection, the panel
displays a variety of functions (or signatures) that illustrate how
the complexes change with time. For example, the three default
signatures map each index $i$ to the number of singular edges, the
area of the boundary, and the volume of the underlying space of $K_i$.
Figure II.16 shows the signature panel and the three default
signatures for gramicidin. All signatures that count rather than
measure are displayed in log-scale. Instead of mapping the time
$\alpha$ to a property of $K_\alpha$, the signatures map the index $i$
to the property of $K_i$. To facilitate the reconstruction of the map
from time, the panel contains a signature that maps the index to time.
Specifically, it shows the log-scale graph of the time associated with
each index. A particular index, $i$, is selected by the position of a
vertical bar in the signature panel and by clicking the Alpha Shape
button in the scene panel, as shown in Figure II.17. The buttons in
the middle of the scene panel provide control over how simplices are
drawn: colored, shaded, in wireframe, seamless, or with gaps created
through a slow explosion. The matrix on the right hand side can be
used to select the types of displayed simplices. By default, only the
singular vertices, edges, triangles and the regular triangles are
shown. Different settings can be used to highlight different aspects
of an alpha complex. For example, the 1-skeleton of the Delaunay
triangulation shown in Figure II.14 is obtained by drawing all edges
of the last alpha complex while suppressing the display of all
triangles and tetrahedra.

Figure II.15: Four alpha complexes of gramicidin.

Figure II.16: Signature panel of the Alpha Shape visualizer.

Figure II.17: Scene panel of the Alpha Shape visualizer.

Bibliographic notes. The Alpha Shape software was created by Ernst
Mücke as part of his doctoral work at Urbana-Champaign. The best
documentation of the algorithm and data structures used in the
software is still his thesis [6] together with the original paper on
the topic [4]. After a period of rapid development directed by Ping Fu
at the National Center for Supercomputing Applications, the software
reached version 4.1 in 1996, which is still the most recent version
distributed on the web [7]. The Delaunay triangulation software in the
Alpha Shapes distribution is based on a variety of algorithmic
techniques described in a recent text by Edelsbrunner [3]. The
interval tree used for fast retrieval of simplices is explained in
[2].

As mentioned earlier, the largest resource for structural protein data
is the Protein Data Bank [1], which can be accessed via the web [8]. A
survey of geometric measurements of proteins including a discussion of
different tables for van der Waals radius assignment can be found in
[5].
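The case analysis that this section describes for the masterlist (a simplex is born, later becomes a face of another simplex, and possibly later becomes interior) can be sketched in a few lines of Python. This is only an illustration: the function name `classify`, the argument names, and the half-open interval convention are assumptions and are not taken from the Alpha Shapes code.

```python
import math


def classify(alpha, born, attached, interior):
    """Return the status of a simplex at time alpha.

    born     -- time at which the simplex enters the alpha complex
    attached -- time at which it becomes a face of another simplex
    interior -- time at which it becomes interior (math.inf if never)
    The three times must be non-decreasing; intervals are half-open
    (an assumed convention, chosen for this sketch).
    """
    if not (born <= attached <= interior):
        raise ValueError("event times must be non-decreasing")
    if alpha < born:
        return "not in complex"
    if alpha < attached:
        return "singular"
    if alpha < interior:
        return "regular"
    return "interior"


# A tetrahedron is interior as soon as it is born: all three times agree.
tetrahedron = (2.0, 2.0, 2.0)
# A triangle on the boundary of the triangulation never becomes interior.
hull_triangle = (1.0, 3.0, math.inf)

statuses = [classify(a, *hull_triangle) for a in (0.5, 2.0, 5.0)]
```

Storing the three times per simplex is what makes each status an interval of time, which is the form of data an interval tree can index.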
[1] H. M. BERMAN, J. WESTBROOK, Z. FENG, G. GILLILAND, T. N. BHAT,
    H. WEISSIG, I. N. SHINDYALOV AND P. E. BOURNE. The Protein Data
    Bank. Nucleic Acids Res. 28 (2000), 235–242.
[2] H. EDELSBRUNNER. A new approach to rectangle intersections – part
    I. Internat. J. Comput. Math. 13 (1983), 209–219.
[3] H. EDELSBRUNNER. Geometry and Topology for Mesh Generation.
    Cambridge Univ. Press, England, 2001.
[4] H. EDELSBRUNNER AND E. P. MÜCKE. Three-dimensional alpha shapes.
    ACM Trans. Graphics 13 (1994), 43–72.
[5] M. GERSTEIN AND F. M. RICHARDS. Protein geometry: distances,
    areas, and volumes. Chapter 22 in The International Tables for
    Crystallography, Vol. F, M. G. Rossmann and E. Arnold (eds.),
    Kluwer, Dordrecht, the Netherlands, 2001, 531–539.
[6] E. P. MÜCKE. Shapes and Implementations in Three-dimensional
    Geometry. Rept. UIUCDCS-R-93-1836, Dept. Comput. Sci., Univ.
    Illinois, Urbana, 1993.
[7] Alpha Shapes web-site at www.alpha-shapes.org; see also the
    software collection in biogeometry.duke.edu.
[8] Protein Data Bank web-site at www.rcsb.org/pdb.
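The floating-point filter described in this section (calculate in floating-point arithmetic, bound the error, and fall back to exact arithmetic when the bound does not settle the sign) can be sketched for the ORIENTATION test as follows. This is a minimal illustration and not the code of the Alpha Shapes distribution: the function names, the crude error bound `rel_bound`, and the use of Python's `fractions.Fraction` for the exact step are all assumptions. The sketch also assumes no overflow or underflow, and it returns 0 in degenerate cases instead of simulating a perturbation.

```python
from fractions import Fraction


def det3(a, b, c):
    """Determinant of the 3-by-3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def orientation(p, q, r, s, rel_bound=1e-12):
    """Sign (+1, 0 or -1) of det[p - s, q - s, r - s] for points in 3D.

    rel_bound is a deliberately generous error bound relative to the
    cube of the largest matrix entry (an assumption, not a tight
    analysis).
    """
    rows = [[x - y for x, y in zip(v, s)] for v in (p, q, r)]
    approx = det3(*rows)
    scale = max(abs(x) for row in rows for x in row)
    if abs(approx) > rel_bound * scale ** 3:
        # The filter succeeds: the floating-point sign can be trusted.
        return 1 if approx > 0 else -1
    # The filter fails: repeat the computation with exact rationals.
    exact_rows = [[Fraction(x) - Fraction(y) for x, y in zip(v, s)]
                  for v in (p, q, r)]
    exact = det3(*exact_rows)
    return (exact > 0) - (exact < 0)


p, q, r = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
far_sign = orientation(p, q, r, (0.0, 0.0, 1.0))        # decided by the filter
near_sign = orientation(p, q, r, (0.5, 0.5, 1e-30))     # needs exact arithmetic
```

Most inputs are decided by the cheap first branch, so the cost of exact arithmetic is paid only for nearly degenerate configurations.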
Exercises

1. Tree-like sequences. Given an alphabet of $n$ letters, form a
   sequence but refrain from placing any letter twice in a row. The
   sequence is tree-like if there are no two letters that alternate
   more than twice. In other words, subsequences of the form
   $\ldots a \ldots b \ldots a \ldots b \ldots$ and
   $\ldots b \ldots a \ldots b \ldots a \ldots$ are prohibited.
   (i) Prove that a tree-like sequence over an alphabet of $n$ letters
       has length at most $2n - 1$. Is this bound tight?
   (ii) Define a tree-like cyclic sequence by prohibiting cyclic
       subsequences of the form
       $\ldots a \ldots b \ldots a \ldots b \ldots$. Prove that a
       tree-like cyclic sequence over an alphabet of $n$ letters has
       length at most $2n - 2$. Is this bound tight?

2. Number of arcs. Let $B$ be a set of $n$ disks in the plane. The
   boundary of the union of the disks consists of circular arcs
   contributed by the circles.
   (i) Assuming the boundary of the union is a single closed curve,
       use tree-like cyclic sequences to prove that it consists of at
       most $2n - 2$ (maximal) circular arcs. Is this bound tight?
   (ii) Prove that in general the number of (maximal) circular arcs in
       the boundary of the union is at most $6n - 12$. Is this bound
       tight?

3. Empty Voronoi cell. Call a disk $b$ in a finite collection of disks
   redundant if its Voronoi cell is empty.
   (i) Prove that if there are disks $b_1$, $b_2$ and $b_3$ in the
       collection such that
       (a) $\pi_b(x) > \pi_{b_1}(x) = \pi_{b_2}(x) = \pi_{b_3}(x)$ for
           the orthocenter $x$ of $b_1, b_2, b_3$, and
       (b) the center of $b$ lies in the triangle spanned by the
           centers of $b_1, b_2, b_3$,
       then $b$ is redundant.
   (ii) Prove that the sufficient conditions given in (i) are also
       necessary. In other words, prove that if $b$ is redundant then
       there exist disks $b_1$, $b_2$ and $b_3$ that satisfy
       Conditions (a) and (b).

4. Binomial coefficients. Let $k \le n$ be two positive integers and
   recall that the binomial coefficient $\binom{n}{k}$ is the number
   of ways we can choose $k$ elements from a collection of $n$
   elements. Recall also that
   (i) Show that
   (ii) Show that
   [We note that the relation in (ii) neatly generalizes the formula
   . The generalization is not quite as neat if we sum powers rather
   than binomial coefficients.]

5. Sphere arrangements. Let be the maximum number of cells we get by
   drawing spheres in .
   (i) Show that unless .
   (ii) Give a formula for that works for all positive .
   [You might consider answering question (ii) before question (i).]

6. Independent half-spaces. A half-plane is the set of points on or on
   one side of a line in $\mathbb{R}^2$. Similarly, a half-space is
   the set of points on or on one side of a plane in $\mathbb{R}^3$,
   and a cap is the intersection of a sphere with a half-space. What
   is the maximum number of independent
   (i) half-planes in $\mathbb{R}^2$,
   (ii) half-spaces in $\mathbb{R}^3$,
   (iii) caps on a sphere in $\mathbb{R}^3$?

7. The filtration of water. A water molecule consists of one oxygen
   and two hydrogens: H$_2$O.
   (i) Look up the standard geometric model (determined by radii, bond
       length and bond angle).
   (ii) Describe the Voronoi diagram and the sequence of alpha
       complexes of the model.

8. Barycentric subdivision. The barycentric subdivision of a simplex
   $\sigma$ is obtained by adding the barycenter of $\sigma$ (also
   known as the centroid or center of mass) as a new vertex and
   connecting it to the simplices in the barycentric subdivisions of
   the faces.
   (i) How many vertices, edges, triangles and tetrahedra are in the
       barycentric subdivision of a tetrahedron?
   (ii) Use the Alpha Shape software to create the barycentric
       subdivision of a regular tetrahedron.
   [You will need to use weights to make the barycentric subdivision
   of the tetrahedron the Delaunay triangulation of the points.]


Chapter III
Surface Meshing
Recall the different types of space-filling diagrams we discussed in
Chapter II. The van der Waals and the solvent accessible models are
both unions of finitely many balls in three-dimensional space and
differ only in the radii. We have also discussed the molecular surface
model that is obtained by rolling a sphere about the van der Waals
model. Corners and crevices are filled up and the surface consists of
spheres connected by blending torus patches and inverted sphere
patches.

In this chapter, we introduce a model that is similar to the molecular
surface. Its surface consists of spheres connected by blending
hyperboloid patches and inverted sphere patches. We call this the
molecular skin model. The surface is piecewise quadratic and has a
number of attractive properties not shared by the other space-filling
models. One is the continuity of the normal direction, another the
continuity of the maximum principal curvature. Both properties are
crucial for the construction of good quality meshes, which may be used
to support numerical computations over the surface. Another
interesting property is an inside-outside symmetry that implies the
existence of locally perfectly complementary molecular skin models. In
other words, for each cavity we may construct a molecular skin
representation whose boundary matches that of the molecule. The
molecular skin also lends itself to representing deformations, and
some of the possibilities along these lines will be discussed in
Chapter VIII.

This chapter is organized in four sections. In Section III.1, we give
the geometric definition of the molecular skin and show how it can be
decomposed into quadratic patches. In Section III.2, we discuss
various notions of curvature of a surface, and we show that the
maximum principal curvature is a continuous map over the molecular
skin. In Section III.3, we describe the algorithm that constructs a
molecular skin in terms of a triangle mesh. Finally in Section III.4,
we present software for constructing molecular skin in two- and
three-dimensional space, and we use that software to illustrate some
of the properties of these curves and surfaces.

III.1 Molecular Skin
III.2 Curvature
III.3 Adaptive Meshing
III.4 Skin Software
Exercises
III.1 Molecular Skin

Almost everything we will say in this section applies equally well to
spheres of any fixed dimension. Even though the case of spheres in
$\mathbb{R}^3$ is most relevant for the study of molecules, there is
sufficient pedagogical advantage to first talk about circles in
$\mathbb{R}^2$.

Circles and paraboloids. Recall that the weighted square distance
function of a circle $a = (z_a, r_a)$ is the map
$\pi_a : \mathbb{R}^2 \to \mathbb{R}$ defined by
$\pi_a(x) = \|x - z_a\|^2 - r_a^2$. As illustrated in Figure III.1,
its graph is a paraboloid of revolution in $\mathbb{R}^3$ that
intersects $\mathbb{R}^2$ in the circle. In other words, the circle is
the zero-set of the weighted square distance function,
$a = \pi_a^{-1}(0)$. All paraboloids that arise as weighted square
distance functions have the form
$\pi(x) = x_1^2 + x_2^2 + u_1 x_1 + u_2 x_2 + u_3$. The three
parameters correspond to the three degrees of freedom represented by
the center and the radius.

Figure III.1: A circle in $\mathbb{R}^2$ is the zero-set of its
weighted square distance function.

Functions form a vector space under the usual notions of scaling and
addition. We will use only a subspace of that vector space, namely the
one consisting of functions of the above form. Given a collection of
such functions $\pi_i$, we can generate another such function by
affine combination, $\pi = \sum_i \gamma_i \pi_i$, where the
$\gamma_i$ are real numbers with $\sum_i \gamma_i = 1$. The new
function is a convex combination of the $\pi_i$ if all $\gamma_i$ are
non-negative. Given a collection of circles, $A$, the affine hull is
the set of zero-sets of affine combinations of the corresponding
weighted square distance functions, and similarly the convex hull is
the subset of zero-sets of convex combinations,

  $\mathrm{aff}\,A = \{\pi^{-1}(0) \mid \pi = \sum_i \gamma_i \pi_i,\ \sum_i \gamma_i = 1\}$,
  $\mathrm{conv}\,A = \{\pi^{-1}(0) \mid \pi = \sum_i \gamma_i \pi_i,\ \sum_i \gamma_i = 1,\ \gamma_i \ge 0\}$.

Pencils. It is possibly easier to develop an intuition for combining
circles than for combining paraboloids. Given two intersecting circles
$a$ and $b$, the affine hull consists of all circles that pass through
the same two intersection points. Indeed, if
$\pi_a(x) = \pi_b(x) = 0$ then
$(1 - \gamma)\,\pi_a(x) + \gamma\,\pi_b(x) = 0$ for all coefficients
$1 - \gamma$ and $\gamma$. We call the resulting family a pencil of
circles. If $a$ and $b$ are disjoint then the affine hull is again a
pencil but this time of pairwise disjoint circles, like the vertical
family sketched in Figure III.2. We compute the center and radius of
the zero-set of $\pi = (1 - \gamma)\,\pi_a + \gamma\,\pi_b$. We have

  $\pi(x) = (1 - \gamma)(\|x - z_a\|^2 - r_a^2) + \gamma(\|x - z_b\|^2 - r_b^2)$
  $\phantom{\pi(x)} = \|x\|^2 - 2\langle x, (1 - \gamma) z_a + \gamma z_b \rangle + (1 - \gamma)(\|z_a\|^2 - r_a^2) + \gamma(\|z_b\|^2 - r_b^2)$.

The center is therefore $z = (1 - \gamma) z_a + \gamma z_b$ and the
square radius is

  $r^2 = \|z\|^2 - (1 - \gamma)(\|z_a\|^2 - r_a^2) - \gamma(\|z_b\|^2 - r_b^2)$.

The centers of the circles in the affine hull are therefore the points
on the line that passes through $z_a$ and $z_b$. If instead of the
affine hull we take the convex hull, then we get the subset of circles
whose centers are the points on the line segment with endpoints $z_a$
and $z_b$.

Figure III.2: Circles sampled from a coaxal system consisting of two
orthogonal pencils.

Recall that a circle $c$ is orthogonal to $a$ if
$\|z_a - z_c\|^2 = r_a^2 + r_c^2$, or equivalently
$\pi_a(z_c) = r_c^2$. If $c$ is orthogonal to $a$ and to $b$ then it
is also orthogonal to every circle in the affine hull of $a$ and $b$.
To see this elementary fact, note that

  $\pi(z_c) - r_c^2 = (1 - \gamma)\,\pi_a(z_c) + \gamma\,\pi_b(z_c) - r_c^2$,

which is

  $(1 - \gamma)(\pi_a(z_c) - r_c^2) + \gamma(\pi_b(z_c) - r_c^2)$

and thus vanishes as required.

Suppose we are now given two circles $a$ and $b$ and two more circles
$c$ and $d$ both orthogonal to $a$ and $b$. Then every circle in the
affine hull of $a$ and $b$ is orthogonal to both $c$ and $d$ and thus
to every circle in the affine hull of $c$ and $d$. In other words, we
have two pencils in which each circle in the first pencil is
orthogonal to each circle in the second pencil. Such a configuration
is illustrated in Figure III.2 and is referred to as a coaxal system.

Envelopes. The convex hull of two circles is an infinite family of
circles, but the union of their disks is just the union of the two
original disks. We introduce a shrinking operation that reduces small
circles less than big ones and this way generates a smooth envelope.
Specifically, we define $\sqrt{a} = (z_a, r_a/\sqrt{2})$. Similarly,
for a family of circles we define
$\sqrt{A} = \{\sqrt{a} \mid a \in A\}$. An example can be seen in
Figure III.3, which sketches a shrunken pencil of circles.

Figure III.3: The dotted circles belong to the affine hull and the
solid circles are reduced.

We are interested in the envelope of a shrunken pencil. Suppose $A$ is
a pencil and all its circles pass through the points $(0, 1)$ and
$(0, -1)$. We parametrize by the $x_1$-coordinate $t$ of the circle
centers. The corresponding radius is $\sqrt{t^2 + 1}$. The same
parametrization of the family of reduced circles, $\sqrt{A}$, gives
the radius $\sqrt{(t^2 + 1)/2}$. The reduced circle with center
$(t, 0)$ is the zero-set of

  $f(x_1, x_2, t) = (x_1 - t)^2 + x_2^2 - \frac{t^2 + 1}{2}$

for fixed value of $t$. The collection of all reduced circles is the
projection of the entire zero-set, $f^{-1}(0)$. It can be visualized
as a leaning hour-glass of circles, as in Figure III.4. The envelope
of $\sqrt{A}$ is the projection of the silhouette of $f^{-1}(0)$ as
viewed along the $t$ direction. It is the set of points for which
$\partial f / \partial t$ vanishes. From
$\partial f / \partial t = -2(x_1 - t) - t = 0$ we get $t = 2 x_1$.
The envelope is therefore the zero-set of $x_2^2 - x_1^2 - \frac{1}{2}$,
which is a hyperbola.

Figure III.4: Sections of the zero-set of $f$ viewed from the positive
$t$ direction.

Skin and body. More general curves than just hyperbolas can be
constructed by taking the convex hull of a finite collection of
circles, then shrinking every circle in the family, and finally taking
the envelope. Formally, the skin of the collection of circles $A$ is
the envelope of the reduced circles,
$\mathrm{skin}\,A = \mathrm{env}\,\sqrt{\mathrm{conv}\,A}$. The body
is the union of disks bounded by circles in
$\sqrt{\mathrm{conv}\,A}$. It is the region in $\mathbb{R}^2$ bounded
by the skin, and symmetrically, the skin is the boundary of the body.
The smallest non-trivial example is the skin of two circles. If these
circles intersect in two points then the skin is a dumbbell, as shown
in Figure III.5. It consists of two circles connected by a blending
hyperbola arc.

Figure III.5: The skin of two intersecting circles is the envelope of
a reduced line segment of circles.

The skin of three circles is already more difficult to understand, at
least directly. We thus take an indirect approach and first study what
happens when orthogonal circles shrink.

Orthogonality and complementarity. Let $a$ and $b$ be two orthogonal
circles. We thus have

  $\|z_a - z_b\|^2 = r_a^2 + r_b^2 \ge \frac{(r_a + r_b)^2}{2} = \left(\frac{r_a}{\sqrt{2}} + \frac{r_b}{\sqrt{2}}\right)^2$.

Taking roots left and right implies that the radii of $\sqrt{a}$ and
$\sqrt{b}$ add up to at most the distance between the two centers.
Furthermore, we have equality iff $r_a = r_b$. In other words, the
reduced versions of any two orthogonal circles
touch if they are of the same size and they are disjoint in all other
cases.

We apply this result to the coaxal system consisting of orthogonal
pencils $A$ and $B$. Suppose $A$ contains only circles with real
radii, or equivalently, $A$ is the affine hull of two intersecting
circles. As shown earlier, the envelope of $\sqrt{A}$ is a hyperbola.
We claim that the envelope of $\sqrt{B}$ is the exact same hyperbola.
To see this, we first note that a circle in $\sqrt{B}$ can at most
touch the hyperbola, for if it crossed, we would have two crossing
reduced circles contradicting the orthogonality of the two
corresponding original circles. Furthermore, every circle in
$\sqrt{B}$ for which there is an equally large circle in $\sqrt{A}$
touches the hyperbola because it touches that circle. The two
envelopes are therefore the same hyperbola. As shown in Figure III.6,
the two asymptotic lines of the hyperbola intersect at a right angle.
The smallest separating circle that touches both branches belongs to
$\sqrt{A}$ and has the same size as the two osculating circles that
both belong to $\sqrt{B}$. These circles touch the hyperbola and have
the same curvature as the hyperbola at that point.

Figure III.6: Hyperbola with orthogonal asymptotic lines, smallest
separating circle, and two osculating circles.

The complementarity of the bodies extends from the case of two
orthogonal pencils to the case in which $A$ consists of a single
circle and $B$ contains all circles orthogonal to $A$. The set $B$ is
a two-parameter family spanned by three circles. The skin of $A$ is
trivially a circle, which implies that the skin of $B$ is the same
circle.

Decomposition. The skin of any finite set of circles $A$ can be
decomposed into simple pieces, each defined by at most three of the
circles. A single circle defines a (smaller) circle, a pair of circles
defines a hyperbola, and a triplet of circles defines an inverted
circle. We thus claim that the skin of $A$ consists of circles,
connected to each other by blending hyperbola and inverted circle
arcs. We will not prove this claim and instead give an explicit
construction of the decomposition, which is facilitated by a complex
assembled from Voronoi and Delaunay polyhedra.

As usual, we let $X$ be an index set and use it to denote the Voronoi
polyhedron $\nu_X = \bigcap_{i \in X} \nu_i$. The corresponding
Delaunay simplex is $\delta_X = \mathrm{conv}\,\{z_i \mid i \in X\}$.
The corresponding mixed cell is the Minkowski sum of shrunken copies
of both, $\mu_X = \frac{1}{2}(\nu_X + \delta_X)$. If
$\mathrm{card}\,X = 1$ the mixed cell is the shrunken and translated
copy of a two-dimensional Voronoi cell. If $\mathrm{card}\,X = 2$ then
$\mu_X$ is the Minkowski sum of two orthogonal edges and therefore a
rectangle. If $\mathrm{card}\,X = 3$ then $\mu_X$ is a shrunken and
translated copy of a Delaunay triangle. The mixed complex consists of
all mixed cells and their faces. Figure III.7 illustrates the
construction by showing the mixed complex decomposing the skin into
circle and hyperbola arcs. A rather intuitive explanation of the
construction can be obtained by drawing the Voronoi diagram and the
Delaunay triangulation on two parallel planes in $\mathbb{R}^3$. We
decompose the slab between the two planes into pyramids and
tetrahedra, which are the convex hulls of corresponding Voronoi
polyhedra and Delaunay simplices. The mixed complex is then obtained
by intersecting the pyramids and tetrahedra with the plane parallel to
and halfway between the other two planes, as sketched in Figure III.8.

Figure III.7: The mixed complex and the skin of four circles.

Symmetry. Note that the construction of the mixed complex is symmetric
in the Voronoi diagram and the Delaunay triangulation. In other words,
the mixed complex of $A$ is the same as the mixed complex of the
collection of circles $B$ introduced in Section V.1. [The order of the
chapters on skin and pockets has changed now, which requires a local
rewrite here and in Section III.4.] As explained there, $B$ contains a
circle centered at each
Voronoi vertex (including those at infinity) with the radius chosen so
that the circle is orthogonal to the circles that define the vertex.
The Voronoi diagram of $A$ is then the Delaunay triangulation of $B$,
the Delaunay triangulation of $A$ is the Voronoi diagram of $B$, and
the mixed complexes of $A$ and $B$ are the same. We have seen that the
skins of two orthogonal pencils are the same hyperbola. Similarly, the
skins of one circle and the affine hull of three orthogonal circles
are the same circle. Since the mixed complex decomposes the entire
skin of $A$ into such cases, it follows that the skin of $A$ is the
same as that of $B$. Note however that the two bodies are not the same
but rather complementary,

  $\mathrm{body}\,A \cup \mathrm{body}\,B = \mathbb{R}^2$ and
  $\mathrm{body}\,A \cap \mathrm{body}\,B = \mathrm{skin}\,A = \mathrm{skin}\,B$.

Figure III.8: The top, middle, and bottom planes carry the Delaunay
triangulation, the mixed complex, and the Voronoi diagram.

Bibliographic notes. There is another interpretation of the vector
space of circles exploited in this section. It identifies each circle
$a = (z_a, r_a)$ in $\mathbb{R}^2$ with the point
$(z_a, \|z_a\|^2 - r_a^2)$ in $\mathbb{R}^3$. Under this
interpretation, the convex hull of a set of circles corresponds to the
usual convex hull of points in $\mathbb{R}^3$, and the symmetry
between $A$ and $B$ can be explained as a polarity between two convex
polyhedra. This interpretation is prominently used in the geometry
text by Pedoe [5]. It was discovered in the nineteenth century and
published at more or less the same time in three different languages
by Clifford [1], Darboux [2], and Frobenius [4].

The material of this section is taken from [3], where skin surfaces
are introduced as orientable 2-manifolds in $\mathbb{R}^3$. That paper
also proves that the body of a finite collection of spheres has the
same homotopy type as the dual complex.

[1] W. K. CLIFFORD. Problem 1748. Mathematical Questions and Solutions
    from the Educational Times 44 (1865), 144.
[2] M. G. DARBOUX. De points, de cercles et de spheres. Annales de
    L'Ecole Normale, Series 2 (1872), 323–392.
[3] H. EDELSBRUNNER. Deformable smooth surface design. Discrete
    Comput. Geom. 21 (1999), 87–115.
[4] G. FROBENIUS. Anwendungen der Determinantentheorie auf die
    Geometrie des Masses. J. Reine Angew. Math. 79 (1875), 185–247.
[5] D. PEDOE. Geometry: a Comprehensive Course. Dover, New York, 1988.
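The pencil computations of Section III.1 can be checked numerically with a small sketch. It assumes that a circle is stored as a pair of a center and a square radius, that orthogonality of two circles means the square distance between the centers equals the sum of the square radii, and that shrinking divides the radius by the square root of two. The function names and the sample circles are hypothetical and chosen only for illustration.

```python
import math


def norm2(v):
    """Square length of a vector."""
    return sum(t * t for t in v)


def affine_combination(a, b, gamma):
    """Zero-set of (1 - gamma) * pi_a + gamma * pi_b as (center, square radius)."""
    (za, ra2), (zb, rb2) = a, b
    z = tuple((1 - gamma) * p + gamma * q for p, q in zip(za, zb))
    r2 = norm2(z) - (1 - gamma) * (norm2(za) - ra2) - gamma * (norm2(zb) - rb2)
    return z, r2


def orthogonality_defect(a, c):
    """||z_a - z_c||^2 - r_a^2 - r_c^2, which is zero iff a and c are orthogonal."""
    (za, ra2), (zc, rc2) = a, c
    return norm2([p - q for p, q in zip(za, zc)]) - ra2 - rc2


def reduced_gap(a, b):
    """Distance between the centers minus the sum of the reduced radii."""
    (za, ra2), (zb, rb2) = a, b
    return math.dist(za, zb) - math.sqrt(ra2 / 2) - math.sqrt(rb2 / 2)


# Two circles through (0, 1) and (0, -1), and a circle orthogonal to both.
a = ((-1.0, 0.0), 2.0)
b = ((1.0, 0.0), 2.0)
c = ((0.0, 2.0), 3.0)

member = affine_combination(a, b, 0.25)    # another circle of the pencil
defect = orthogonality_defect(member, c)   # stays zero within the pencil
gap = reduced_gap(a, c)                    # positive: reduced circles are disjoint
```

Working with square radii keeps the combination formula linear and also admits circles with negative square radius, which arise in pencils of disjoint circles.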
III.2 Curvature

The skin curves introduced in Section III.1 generalize
straightforwardly to surfaces in $\mathbb{R}^3$. In this section, we
study the curvature of these surfaces. The Curvature Variation Lemma
proved at the end of this section will play a major role in the
meshing algorithm to be discussed in Section III.3. There are several
notions of curvature of a surface, and all are obtained by considering
the curvature of curves drawn on the surface.

Curves. A closed space curve is a map of a circle to three-dimensional
space, $\gamma : \mathbb{S}^1 \to \mathbb{R}^3$. It is smooth if the
derivatives of all orders exist. Usually we need only a small number
of derivatives, and the assumption of the existence of infinitely many
is convenient but not necessary. Note that a curve has a
parametrization and the counter-clockwise orientation of the circle
gives a sense of direction. The velocity vector at the point
$\gamma(s)$ is $\dot\gamma(s)$ and the speed is the length of that
vector, $\|\dot\gamma(s)\|$. The tangent vector is the normalized
velocity vector, $T(s) = \dot\gamma(s) / \|\dot\gamma(s)\|$, which is
defined as long as the speed is non-zero. We can think of $T$ as the
Gauss map from $\mathbb{S}^1$ to $\mathbb{S}^2$, as illustrated in
Figure III.9.

Figure III.9: A closed space curve to the left and its Gauss map to
the right.

It is often convenient to assume unit speed. In this case
$\|\dot\gamma(s)\| = 1$, and the second derivative, $\ddot\gamma(s)$,
is normal to the first. The curvature is the length of that second
derivative, $\kappa(s) = \|\ddot\gamma(s)\|$. The normal vector is the
normalized second derivative,
$N(s) = \ddot\gamma(s) / \|\ddot\gamma(s)\|$, which is defined as long
as $\kappa(s) \ne 0$. Geometrically, the curvature is one over the
radius of the osculating circle at $\gamma(s)$, which is the circle in
the plane spanned by the tangent vector and the normal vector.

Surfaces. Let $\mathbb{M}$ be a smooth surface or 2-manifold in
$\mathbb{R}^3$. For a point $x \in \mathbb{M}$, we let $U$ be a
neighborhood, $V$ an open set in $\mathbb{R}^2$, and $f : V \to U$ a
parametrization. Derivatives are taken along curves on the surface.
For example, to compute the tangent plane at $x$, we take the tangent
vectors of two curves that cross at $x$. They span the tangent plane,
as illustrated in Figure III.10. Similarly, we define the curvature at
$x$ in sections. For each curve $\gamma$ in the plane we consider the
space curve $f \circ \gamma$. It is a geodesic at $x$ if its normal
agrees with the surface normal at $x$. The curvature of
$f \circ \gamma$ consists of a portion forced by how the surface
curves in space and another portion accounting for how
$f \circ \gamma$ curves within the surface. The second contribution
vanishes for geodesics, and if it does we call the curvature the
normal curvature of $\mathbb{M}$ at $x$ in the direction of the
tangent vector. There is a circle of tangent vectors, and for each one
we get a normal curvature. The principal curvatures at $x$ are the
minimum and maximum normal curvatures, $\kappa_1$ and $\kappa_2$. Let
$t_1$ and $t_2$ be the corresponding tangent directions. By a result
of Euler, the principal curvatures determine all other normal
curvatures at $x$.

Figure III.10: Construction of tangent plane from two tangent vectors.

EULER'S THEOREM. The directions $t_1$ and $t_2$ are orthogonal, and if
  $t = t_1 \cos\varphi + t_2 \sin\varphi$ then
  $\kappa(t) = \kappa_1 \cos^2\varphi + \kappa_2 \sin^2\varphi$.

This implies that if $\kappa_1 \ne \kappa_2$ then all other normal
curvatures are strictly between the two principal curvatures, which
are therefore unique. If $\kappa_1 = \kappa_2$ then all normal
curvatures are the same and the point is an umbilic point of the
surface. Two other common notions of curvature are the mean curvature,
$\frac{1}{2}(\kappa_1 + \kappa_2)$, and the Gaussian curvature,
$\kappa_1 \kappa_2$. In contrast to the other notions, the Gaussian
curvature is intrinsic. In other words, it is preserved by isometries,
which are transformations that preserve the distance between points
measured as lengths of connecting paths. This is a famous result of
Gauss.

THEOREMA EGREGIUM. The Gaussian curvature is an isometric invariant.
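Euler's formula for the normal curvature in terms of the two principal curvatures is easy to experiment with. The following sketch assumes the standard form of the formula, with the angle measured from the first principal direction, and it takes the mean curvature to be the average of the principal curvatures. The function names and the cylinder example are illustrative assumptions.

```python
import math


def normal_curvature(k1, k2, phi):
    """Normal curvature at angle phi from the first principal direction."""
    return k1 * math.cos(phi) ** 2 + k2 * math.sin(phi) ** 2


def mean_curvature(k1, k2):
    """Average of the two principal curvatures."""
    return (k1 + k2) / 2


def gaussian_curvature(k1, k2):
    """Product of the two principal curvatures."""
    return k1 * k2


# A circular cylinder of radius 2: flat along the axis, curved around it.
k1, k2 = 0.0, 0.5
samples = [normal_curvature(k1, k2, math.pi * i / 12) for i in range(13)]
flat = gaussian_curvature(k1, k2) == 0.0
```

The vanishing Gaussian curvature of the cylinder is consistent with the Theorema Egregium, since a cylinder can be unrolled isometrically into the plane while its mean curvature is non-zero.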
Skin surfaces. Recall that the skin defined by a finite set of circles
in $\mathbb{R}^2$ is the envelope of the infinite family of circles in
the convex hull, each reduced by a factor $1/\sqrt{2}$. Furthermore,
the mixed complex defined by the circles decomposes the skin into
circle and hyperbola arcs. Similarly, the skin of a finite set $A$ of
spheres in $\mathbb{R}^3$ is
$\mathrm{skin}\,A = \mathrm{env}\,\sqrt{\mathrm{conv}\,A}$. The mixed
complex that decomposes the surface consists of the four types of
cells illustrated in Figure III.11. Within each mixed cell, we have a
sphere or a hyperboloid patch. The hyperboloid can either be
one-sheeted (an hour-glass) or two-sheeted. The cases are summarized
in Table III.1.

Figure III.11: Typical mixed cells. From left to right we have
$\mathrm{card}\,X = 1, 2, 3$ and 4.

  card X   dim Voronoi   dim Delaunay   mixed cell          skin patch
  1        3             0              convex polyhedron   sphere
  2        2             1              polygonal prism     hyperboloid
  3        1             2              triangular prism    hyperboloid
  4        0             3              tetrahedron         sphere

Table III.1: The cardinality of $X$ listed in the first column
determines the dimensions of the corresponding Voronoi polyhedron and
Delaunay simplex as well as the type of the mixed cell and of the skin
patch.

The two sphere cases are symmetric and differ from each other by the
surface orientation: in the case $\mathrm{card}\,X = 1$, the body lies
locally inside, and in the case $\mathrm{card}\,X = 4$, it lies
locally outside the sphere. Similarly, the two hyperboloid cases are
symmetric and differ from each other by the surface orientation. In
the case $\mathrm{card}\,X = 2$, the symmetry axis of the hyperboloid
is the affine hull of the Delaunay edge and the (orthogonal) symmetry
plane is the affine hull of the Voronoi polygon. We have a one-sheeted
hyperboloid if the two spheres intersect in a circle and a two-sheeted
one if they are disjoint. The common limiting case is a double-cone
defined by two touching spheres. Either way, the body is on the side
of the infinite ends of the symmetry axis. In the case
$\mathrm{card}\,X = 3$, the symmetry plane is the affine hull of the
Delaunay triangle and the symmetry axis is the affine hull of the
Voronoi edge. Whether the hyperboloid is one-sheeted, a double-cone,
or two-sheeted depends on whether the two spheres orthogonal to the
three spheres with indices in $X$ intersect in a circle, touch in a
point, or are disjoint. Either way, the body lies on the side of the
infinite circle in the symmetry plane.

Maximum normal curvature. We can translate and rotate every sphere and
hyperboloid to standard form, which we define as

  $x_1^2 + x_2^2 + x_3^2 = R^2$,
  $x_1^2 + x_2^2 - x_3^2 = \pm R^2$.

The second equation defines a hyperboloid with the apex at the origin,
the symmetry axis along $x_3$, and the symmetry plane $x_3 = 0$. We
have a one-sheeted hyperboloid for $+R^2$ and a two-sheeted one for
$-R^2$, as illustrated in Figure III.12. For the sphere, the normal
curvature at every point is $1/R$ in every tangent direction. The
situation is more complicated for the hyperboloid. Consider the
hyperbola in standard form in $\mathbb{R}^2$, as shown in Figure
III.13, and note that both the one-sheeted and the two-sheeted
hyperboloid can be obtained by rotating the hyperbola about a symmetry
axis. In either case, the maximum normal curvature at a point $x$,
$\kappa(x)$, is one over the radius of

Figure III.12: The sphere, the one-sheeted hyperboloid, and the
two-sheeted hyperboloid.

Figure III.13: Every point of the hyperbola is sandwiched between two
equally large circles.
the largest sphere that passes through and touches but * [3] H. E DELSBRUNNER . Deformable smooth surface design.

does not cross the hyperboloid. As shown in Figure III.13,
*
* / *
this radius is the same as the distance of from the ori-

gin. In short,
[4] B. O’N EILL . Elementary Differential Geometry. Second
for every point of a sphere or
edition, Academic Press, San Diego, 1997.
hyperboloid in standard form.
Curvature variation. The maximum normal curvature

varies continuously over the skin because the common
radius of the sandwiching spheres varies continuously.
We strengthen the result by showing that varies rather

In fact, we extend to a function defined on all
slowly.
of and show that has Lipschitz constant one. We

have seen that within a mixed cell, is simply the dis-

tance to the center, . By the definition of the mixed com-
plex, this is a continuous function on . Within the mixed
cell, the triangle inequality gives the Lipschitz bound,
F* F F F
*

FH* F
*
By applying this to the pieces of the line segment from
to contained in different mixed cells, we obtain the
result.
C URVATURE VARIATION L EMMA . For all points

*
we have
FH* F

*

We note that the extension of to a function

describes the maximal normal function of all skin surfaces
in the family defined by the power growth model of the
spheres, as introduced in Section II.2.
Bibliographic notes. The books by Bruce and Giblin

[1] and by O’Neill [4] are good introductory texts to

curves and surfaces and other topics in differential geome-
try. The skin surfaces in

are obtained by extending the
results of Section III.1 by one dimension, from to .
A more direct treatment of the general-dimensional case
can be found in [3]. The specific results on the curvature
and the curvature variation of skin surfaces are taken from
[2].
[1] J. W. B RUCE AND P. J. G IBLIN . Curves and Singularities.

Second edition, Cambridge Univ. Press, England, 1992.
[2] H.-L. C HENG , T. K. D EY, H. E DELSBRUNNER AND J.

S ULLIVAN . Dynamic skin triangulation. Discrete Comput.
Geom. 25 (2001), 525–568.
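The curvature formula and the Curvature Variation Lemma of this section can be checked numerically. The following is a minimal sketch, assuming the standard forms $x_1^2 + x_2^2 + x_3^2 = R^2$ and $x_1^2 + x_2^2 - x_3^2 = \pm R^2$ with center at the origin; the function names and the parametrization of the hyperboloid are illustrative choices and do not come from the text.

```python
import math
import random

def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))

def max_normal_curvature(x):
    """kappa(x) = 1 / ||x|| for x on a sphere or hyperboloid in standard form."""
    return 1.0 / norm(x)

def length_scale(x):
    """1 / kappa(x), the distance of x from the center at the origin."""
    return norm(x)

def hyperboloid_point(R, sheets, u, phi):
    """Point on x1^2 + x2^2 - x3^2 = +R^2 (sheets == 1) or -R^2 (sheets == 2).

    The hyperbolic parametrization is an assumption made for this sketch.
    """
    if sheets == 1:
        r, z = R * math.cosh(u), R * math.sinh(u)
    else:
        r, z = R * math.sinh(u), R * math.cosh(u)
    return (r * math.cos(phi), r * math.sin(phi), z)

def lipschitz_violations(points, tol=1e-9):
    """Count ordered pairs with |1/kappa(x) - 1/kappa(y)| > ||x - y||."""
    bad = 0
    for x in points:
        for y in points:
            if abs(length_scale(x) - length_scale(y)) > math.dist(x, y) + tol:
                bad += 1
    return bad

random.seed(0)
samples = [hyperboloid_point(1.0, random.choice((1, 2)),
                             random.uniform(-2.0, 2.0),
                             random.uniform(0.0, 2.0 * math.pi))
           for _ in range(200)]
print(lipschitz_violations(samples))
```

Within a single mixed cell the bound is just the triangle inequality, so the count of violations should be zero up to rounding.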
III.3 Adaptive Meshing

In this section, we focus on constructing an explicit representation of a molecular skin surface. We choose a triangle mesh realized in $\mathbb{R}^3$ that is a good approximation of the surface and has good numerical properties.

Triangulations. Recall that a triangulation of a surface $F$ is a simplicial complex $K$ whose underlying space is homeomorphic to $F$. Since $F$ is a 2-manifold, it follows that the simplicial complex is the closure of its triangle set, every edge belongs to exactly two triangles, and the star of every vertex forms a disk. Note that the last property implies the first two. We construct a triangulation by first selecting points on $F$ and second connecting these points with edges and triangles. Given the Delaunay triangulation of $B$, we have sufficient information to sample points and to compute their maximum normal curvature values. Specifically, for each Delaunay simplex $\delta_X$ we construct the mixed cell $\mu_X$. The center of this cell is the point $z_X$ at which the affine hull of $\delta_X$ intersects the affine hull of $\nu_X$. It is also the center of the corresponding sphere or the apex of the corresponding hyperboloid. Next, we translate the mixed cell so its center moves to the origin. Furthermore, if $\delta_X$ or $\nu_X$ is an edge then we rotate it into vertical position. The sphere or hyperboloid defined by $X$ is then in standard form, which can be sampled. For each sampled point we compute the maximum normal curvature from its distance to the origin and we obtain the corresponding point on $F$ by the inverse rigid motion.

Figure III.14: Local decomposition into restricted Voronoi cells and dotted dual restricted Delaunay triangulation.

Let $S$ be the set of points sampled on $F$. We use it as the vertex set of the triangulation, which we construct as the dual of a decomposition of $F$. Specifically, for each point $a \in S$, the restricted Voronoi cell is

\[ \nu_a \cap F = \{ x \in F \mid \|x - a\| \leq \|x - b\| \text{ for all } b \in S \}, \]

where distance is measured in $\mathbb{R}^3$, as usual. It is the intersection of $F$ with the Voronoi polyhedron $\nu_a$ of $a$ in $\mathbb{R}^3$. The restricted cells decompose $F$ into closed regions that overlap along common pieces of their boundaries. Locally the picture is rather similar to that of a Voronoi diagram in $\mathbb{R}^2$. The restricted Delaunay triangulation, $D_F$, is the collection of simplices $\mathrm{conv}\, T$, $T \subseteq S$, with non-empty common intersection of the corresponding restricted Voronoi cells, $F \cap \bigcap_{a \in T} \nu_a \neq \emptyset$. The construction is illustrated in Figure III.14. We note that $D_F$ is a subcomplex of the (unrestricted) Delaunay triangulation of $S$ in $\mathbb{R}^3$.

Closed ball property. One trouble with the restricted Delaunay triangulation is that it may not be homeomorphic to $F$ and thus not triangulate the surface. Indeed, it is easy to come up with cases where $D_F$ is not even a 2-manifold. A sufficient condition for $D_F$ to triangulate $F$ is what we call the closed ball property. It requires that each common intersection of restricted Voronoi cells is topologically a closed ball of the appropriate dimension. We formulate this condition in terms of the three-dimensional Voronoi polyhedra defined by $S$. Assuming general position, the Voronoi polyhedron $\nu_T = \bigcap_{a \in T} \nu_a$ has dimension $4 - \mathrm{card}\, T$, and we require that $F \cap \nu_T$ is either empty or homeomorphic to a closed ball of dimension $3 - \mathrm{card}\, T$. Depending on the cardinality of $T$ we have a closed disk, a closed interval, or a single point.

Figure III.15: To the left a barycentric subdivision of a portion of a Voronoi diagram drawn with solid lines. To the right the isomorphic barycentric subdivision of the corresponding portion of the dual Delaunay triangulation drawn with dashed lines.

Proving that the closed ball property implies $D_F$ triangulates $F$ is not difficult. Decompose the restricted Voronoi diagram by adding a point in the middle of each arc and inside each cell and connect each point to the points on the boundary. The star of every point inside a restricted cell is a triangular decomposition of that cell. The star of every restricted Voronoi vertex consists of six triangular regions that can be homeomorphically mapped to the six triangles in the barycentric subdivision of the dual restricted Delaunay triangle. By construction of $D_F$, the triangles in the two barycentric subdivisions are connected the same way so we have a homeomorphism between $F$ and the underlying space of $D_F$, which is illustrated in Figure III.15.

$\varepsilon$-sampling. The question remains how we sample the points such that the restricted Voronoi diagram has the closed ball property. Since $F$ is smooth, small neighborhoods are fairly flat and the restricted Voronoi diagram behaves locally similar to the (unrestricted) Voronoi diagram of a set of points in the plane. In other words, a dense enough sample of points should have the closed ball property. This intuition can be made precise by formalizing the concept of density. Recall that $\kappa(x)$ is the maximum normal curvature at a point $x \in F$. Around $x$ we would spread points at distance roughly proportional to $1/\kappa(x)$. We therefore define $\varrho(x) = 1/\kappa(x)$ and call it the length scale at $x$. The Curvature Variation Lemma of Section III.2 states that for any two points $x, y \in F$, the difference in length scale is at most the distance between them in $\mathbb{R}^3$,

\[ |\varrho(x) - \varrho(y)| \leq \|x - y\| . \]

An $\varepsilon$-sampling is a subset $S \subseteq F$ such that for each point $x \in F$ there exists a point $a \in S$ at distance $\|x - a\| \leq \varepsilon \varrho(x)$. Showing that a sufficiently small $\varepsilon$ implies the closed ball property for the restricted Voronoi diagram is rather tedious and we omit the proof.

HOMEOMORPHISM THEOREM. If $S$ is an $\varepsilon$-sampling of $F$ with $0 \leq \varepsilon \leq \varepsilon_0$, then the restricted Delaunay triangulation of $S$ is homeomorphic to $F$.

The precise upper bound $\varepsilon_0$ for $\varepsilon$ is a root of a function which arises in the proof of the Homeomorphism Theorem.

Even sampling. The points of an $\varepsilon$-sampling cannot locally be too far apart, but they can be arbitrarily close together. In other words, on a microscopic scale, the points can be placed any way one likes and the mesh can be arbitrarily ugly. To improve the mesh, we impose conditions on the size of edges and triangles that imply both upper and lower bounds on the spacing between sampled points.

Let the size of an edge $ab$ be half its length, $R_{ab} = \|a - b\| / 2$, and the size of a triangle $abc$ be the radius of its circumcircle, $R_{abc}$. For edges we worry about them getting too short, so we compare size with the larger length scale at the endpoints, $\varrho_{ab} = \max\{\varrho(a), \varrho(b)\}$. For triangles we worry about them getting too large, so we compare size with the minimum length scale at the vertices, $\varrho_{abc} = \min\{\varrho(a), \varrho(b), \varrho(c)\}$. We use two constants, $C$ and $Q$, to express the conditions on the size. The constant $C$ controls how closely the triangulation approximates $F$, and $Q$ controls the quality of the triangles. We refer to the two conditions as the Lower and Upper Size Bounds,

  [L]  $R_{ab} / \varrho_{ab} > C / Q$ for every edge $ab \in K$,
  [U]  $R_{abc} / \varrho_{abc} < C Q$ for every triangle $abc \in K$.

It is not necessary to bound the edge lengths from above because an edge $ab$ with $R_{ab} / \varrho_{ab} \geq CQ$ would belong to two triangles that both violate [U]: the circumradius of a triangle is at least the size of each of its edges, and $\varrho_{abc} \leq \varrho_{ab}$. Symmetrically, we do not need to bound the triangle sizes from below because a triangle $abc$ with $R_{abc} / \varrho_{abc} \leq C/Q$ would have three edges that violate [L].

Mesh quality. The constants $C$ and $Q$ have to be chosen judiciously. For example $Q < 1$ would immediately lead to irreconcilable requirements on edge and triangle sizes. Furthermore, $C$ cannot be too large, else we would contradict the $\varepsilon$-sampling condition stated in the Homeomorphism Theorem. Without going into details, we state that feasible choices of $C$ and $Q$ exist. In particular, these constants imply that $S$ is an $\varepsilon$-sampling for a sufficiently small value of $\varepsilon$. More precisely, they imply that $S$ is either an $\varepsilon$-sampling or it grossly violates the condition for $\varepsilon$-sampling. An example of such a gross violation is four points close together on a sphere. The points form a tetrahedron whose edges and triangles may very well satisfy the Size Bounds, but the boundary of the tetrahedron is a miserable approximation of the much larger sphere. Fortunately, such a gross violation of the condition cannot be created from an $\varepsilon$-sampling without the intermediate generation of triangles that grossly violate [U]. The algorithm discussed below is unable to generate such triangles.

The two Size Bounds together imply a reasonably large lower bound on the angles inside triangles of the restricted Delaunay triangulation.
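The $\varepsilon$-sampling condition can be made concrete on the simplest skin surface, a single sphere of radius $r$, where the length scale is $r$ at every point. The following is a rough sketch under that assumption: it estimates the smallest $\varepsilon$ for which a given sample is an $\varepsilon$-sampling by testing a finite set of probe points rather than all of $F$. The Fibonacci spiral used to spread points and all function names are illustrative choices, not part of the text.

```python
import math

def fibonacci_sphere(n, radius=1.0):
    """n roughly evenly spread points on a sphere centered at the origin."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        pts.append((radius * r * math.cos(theta),
                    radius * r * math.sin(theta),
                    radius * z))
    return pts

def estimated_epsilon(sample, probes, length_scale):
    """Largest ratio (distance to nearest sample point) / (length scale).

    This is only a lower estimate of the true epsilon because the
    maximum is taken over finitely many probe points.
    """
    return max(min(math.dist(x, a) for a in sample) / length_scale(x)
               for x in probes)

radius = 5.0
probes = fibonacci_sphere(2000, radius)
for n in (50, 400):
    eps = estimated_epsilon(fibonacci_sphere(n, radius), probes,
                            lambda x: radius)
    print(n, round(eps, 3))
```

Because the length scale grows with the radius, the estimate does not change when the sphere and the sample are scaled together, which is the sense in which the sampling density adapts to curvature.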
MINIMUM ANGLE LEMMA. A triangle $abc$ that satisfies [U] and whose edges satisfy [L] has minimum angle larger than $\arcsin (1/Q^2)$.

PROOF. Let $abc$ be the triangle and $R_{abc}$ its circumradius. Assuming the angle $\alpha$ at $a$ is the smallest angle, we have $bc$ of length $2 R_{bc} = 2 R_{abc} \sin \alpha$ as the shortest edge. We have $\varrho_{bc} \geq \varrho_{abc}$ by definition of length scale. Using [L] and [U] we thus get

\[ R_{bc} > \frac{C}{Q}\, \varrho_{bc} \geq \frac{C}{Q}\, \varrho_{abc} > \frac{R_{abc}}{Q^2} . \]

Hence $\sin \alpha = R_{bc} / R_{abc} > 1/Q^2$. The minimum angle is thus larger than $\arcsin (1/Q^2)$, and because the angles sum to $\pi$, the maximum angle is smaller than $\pi - 2 \arcsin (1/Q^2)$.

Density modification. Given an $\varepsilon$-sampling, we can enforce the Size Bounds by contracting short edges and inserting points near the circumcenters of large triangles. Given a triangle $abc$ that violates [U], we add the dual restricted Voronoi vertex $x$ as a new point to $S$. The insertion may cause new violations of [U] and thus trigger new point insertions.

  void VERTEX INSERTION:
    while there is a triangle abc violating [U] do
      add the dual restricted Voronoi vertex x of abc to S
    endwhile.

The details of the algorithm that modifies the restricted Delaunay triangulation to reflect the addition of $x$ are omitted. A vertex insertion may cause other vertex insertions, but this cannot go on forever because we will eventually violate the Lower Size Bound. Given an edge $ab$ that violates [L], we contract it by removing one of its endpoints. We are not able to exclude the possibility that the removal creates new violations of [L], and it certainly can create new violations of [U].

  void EDGE CONTRACTION:
    while there is an edge ab violating [L] do
      select one endpoint of ab and remove it;
      VERTEX INSERTION
    endwhile.

The details of the algorithm are again omitted. An edge contraction may perhaps cause other edge contractions, but this cannot go on forever because we will eventually violate the Upper Size Bound. It is possible that an edge contraction causes a vertex insertion, but a vertex insertion cannot create edges of size below the allowed threshold. This is what prevents infinite loops in spite of the algorithm's partially conflicting efforts to simultaneously avoid short edges and large triangles. To prove this claim, we consider a triangle $abc$ that causes the addition of its dual restricted Voronoi vertex $x$.

NO-SHORT-EDGE LEMMA. Every edge $xy$ created during the addition of $x$ has ratio $R_{xy} / \varrho_{xy} > C/Q$.

PROOF. We have $R_{abc} \geq C Q \varrho_{abc}$ because $abc$ violates [U]. The sphere with center $x$ that passes through $a$, $b$, and $c$ has radius $R \geq R_{abc}$ and contains no vertices inside. Every new edge $xy$ has therefore length $\|x - y\| \geq R$. Assume without loss of generality that $\varrho(a) = \varrho_{abc}$. We use the Curvature Variation Lemma to derive upper bounds for the length scales at $x$ and $y$:

\[ \varrho(x) \leq \varrho(a) + \|x - a\| \leq \Big( \frac{1}{CQ} + 1 \Big) \|x - y\| , \]
\[ \varrho(y) \leq \varrho(x) + \|x - y\| \leq \Big( \frac{1}{CQ} + 2 \Big) \|x - y\| . \]

Hence

\[ \frac{R_{xy}}{\varrho_{xy}} \geq \frac{CQ}{2 + 4CQ} . \]

For constants $C$ and $Q$ with $Q^2 > 2 + 4CQ$, the right-hand side exceeds $C/Q$, and therefore $R_{xy} / \varrho_{xy} > C/Q$, as claimed.

Scheduling. [Summarize the results on scheduling edge contractions and vertex insertions described in [5].]

Bibliographic notes. The restricted Delaunay triangulation is a generalization of the dual complex of a ball union. It can be used to triangulate surfaces and other spaces embedded in a Euclidean space. Besides the dual complex literature, there are several other partially dependent roots of the idea, namely the surface meshing method by Chew [3], the neural network research by Martinetz and Schulten [6], the formulation of the closed ball property by Edelsbrunner and Shah [4], and the surface reconstruction algorithm by Amenta and Bern [1]. The last of the four papers also introduces $\varepsilon$-samplings of surfaces, although in a slightly different formulation in which the distance to the medial axis replaces the length scale.

All results that are specific to skin surfaces are taken from [2]. The algorithm in that paper is more general than what is explained in this section and maintains the surface mesh while it moves in space.
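The Size Bounds and the Minimum Angle Lemma of this section lend themselves to a direct numerical check. The sketch below is an illustration only: the values `C = 0.1` and `Q = 1.5` are hypothetical placeholders and not the constants used in the text, the length scales at the vertices are drawn at random instead of being computed from a skin surface, and all function names are chosen for this example.

```python
import math
import random

def edge_size(a, b):
    """Size of an edge: half its length."""
    return 0.5 * math.dist(a, b)

def circumradius(a, b, c):
    """Size of a triangle: the radius of its circumcircle."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    s = 0.5 * (la + lb + lc)
    area_sq = s * (s - la) * (s - lb) * (s - lc)  # Heron's formula
    if area_sq <= 0.0:
        return math.inf  # degenerate triangle
    return la * lb * lc / (4.0 * math.sqrt(area_sq))

def min_angle(a, b, c):
    """Smallest interior angle in radians, via the law of cosines."""
    la, lb, lc = math.dist(b, c), math.dist(a, c), math.dist(a, b)
    def angle(opp, s1, s2):
        cos_t = (s1 * s1 + s2 * s2 - opp * opp) / (2.0 * s1 * s2)
        return math.acos(max(-1.0, min(1.0, cos_t)))
    return min(angle(la, lb, lc), angle(lb, la, lc), angle(lc, la, lb))

def satisfies_size_bounds(tri, rho, C, Q):
    """True if all edges satisfy [L] and the triangle satisfies [U]."""
    (a, b, c), (ra, rb, rc) = tri, rho
    edges = ((a, b, ra, rb), (b, c, rb, rc), (a, c, ra, rc))
    lower = all(edge_size(p, q) / max(rp, rq) > C / Q for p, q, rp, rq in edges)
    upper = circumradius(a, b, c) / min(rho) < C * Q
    return lower and upper

def check_lemma(trials, C, Q, seed=1):
    """Return (accepted, violations) over random planar triangles."""
    rng = random.Random(seed)
    bound = math.asin(1.0 / (Q * Q))
    accepted = violations = 0
    for _ in range(trials):
        tri = tuple((rng.uniform(0, 0.3), rng.uniform(0, 0.3), 0.0) for _ in range(3))
        rho = tuple(rng.uniform(0.9, 1.1) for _ in range(3))
        if satisfies_size_bounds(tri, rho, C, Q):
            accepted += 1
            if min_angle(*tri) <= bound:
                violations += 1
    return accepted, violations

C, Q = 0.1, 1.5  # hypothetical constants for illustration
print(check_lemma(5000, C, Q))
```

Every triangle that passes both bounds should have its smallest angle above $\arcsin(1/Q^2)$, so the second number printed should be zero.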
[1] N. AMENTA AND M. BERN. Surface reconstruction by Voronoi filtering. Discrete Comput. Geom. 22 (1999), 481–504.
[2] H.-L. CHENG, T. K. DEY, H. EDELSBRUNNER AND J. SULLIVAN. Dynamic skin triangulation. Discrete Comput. Geom. 25 (2001), 525–568.
[3] L. P. CHEW. Guaranteed-quality mesh generation for curved surfaces. In "Proc. 9th Ann. Sympos. Comput. Geom., 1993", 274–280.
[4] H. EDELSBRUNNER AND N. R. SHAH. Triangulating topological spaces. Internat. J. Comput. Geom. Appl. 7 (1997), 365–378.
[5] H. EDELSBRUNNER AND A. ÜNGÖR. Relaxed scheduling in dynamic skin triangulation. In "Japanese Conf. Comput. Geom., 2002", to appear.
[6] T. MARTINETZ AND K. SCHULTEN. Topology representing networks. Neural Networks 7 (1994), 507–522.
III.4 Skin Software

In this section, we use two pieces of software to visualize the various geometric concepts introduced earlier in this chapter.

Skin curves. The Morfi software is two-dimensional and constructs skin curves from finite sets of circles. In Figure III.16 we see seven disks whose union is decomposed into convex regions by the Voronoi diagram. Superimposed on this decomposition is the skin curve with shaded body and the dual complex. Note that the disk union contains the body and the body contains the dual complex. Furthermore, the disk union, the body, and the dual complex all have the same homotopy type. This is always true. The skin shrinks the arcs in the boundary of the disk union and smoothly blends between the shrunken arcs using pieces of hyperbolas and inverted circles. Most striking is the blending for the quadrangular hole roughly in the middle of the figure, which is converted into an almost entirely circular hole in the body.

Figure III.16: Voronoi decomposition of disk union with superimposed skin, body, and dual complex.

Mixed complex. Using the Morfi software, we can visualize concepts that are difficult if not impossible to show in $\mathbb{R}^3$. An example is the mixed complex illustrated in Figure III.17. It decomposes the skin into circular and hyperbolic arcs. As explained in Section III.1, it consists of shrunken Voronoi polygons, rectangles, and shrunken Delaunay polygons. The collection of circles generating the diagram in Figure III.17 is degenerate, which can be seen from the fact that there are three shrunken Delaunay triangles but also two shrunken Delaunay quadrangles. One of the quadrangles contains most of the hole in the body. The portion of the hole boundary inside that quadrangle is circular while the portions outside the quadrangle are hyperbolic. Observe also that the five Delaunay polygons visible within the mixed complex apparently have eight vertices (not double-counting the shared ones). We see only seven of them in Figure III.16 because one of the eight radii is imaginary. Where is its center in Figure III.16?

Figure III.17: Decomposition of the skin and body by the mixed complex.

Simulated smoothing. We return to an issue left open in Section V.1, where we considered the minimum weighted square distance function $f$ of a collection of circles $B$. The zero-set of $f$ is the envelope of the circles $B$, and the preimage of any real value $t$ is the envelope of the circles $B_t$. Following the notation in Section II.3, we think of $t$ as time and denote the collection of circles at time $t$ by $B_t$. In Section V.1 we claimed that there is an infinite family of smooth approximations of $f$ that all have the same critical points, namely the points where dually corresponding Voronoi and Delaunay polyhedra intersect. We index the family by a parameter $s$ and construct it such that $f_s$ approaches $f$ as $s$ goes to 1. One function in this family is the trajectory of the skin curve, which maps each point $x \in \mathbb{R}^2$ to the moment in time $t$ at which $x$ belongs to the skin of $B_t$. We generalize this construction to any $s$ by letting $f_s$ be the trajectory of the modified skin curves. Specifically, the $s$-skin is the envelope of the circles in the convex hull that are reduced by a factor $s$. Note that the $s$-skin for $s = 1/2$ is the skin as defined in Section III.1, and the $s$-skin for $s = 1$ is the envelope of the original disks. Figure III.18 illustrates the construction by showing the modified skins for several values of $s$. Observe that the bodies bounded by the $s$-skins are nested. As it turns out, the innermost $s$-skin, defined for $s = 0$, is also the envelope of the orthogonal circles as defined in Section III.1. The function $f_s$ maps every point $x$ to the moment in time $t$ at which $x$ belongs to the $s$-skin of $B_t$, as usual. For $s$ with $0 < s < 1$, the height function $f_s$ is differentiable and, assuming non-degeneracy of the input circles, it is twice differentiable at the critical points. This is sufficient to justify the Morse theoretic reasoning about the non-smooth function $f$ used in Section V.1 to define pockets.

Figure III.18: From inside out the sequence of $s$-skins for several values of $s$.

Meshed skin surfaces. In $\mathbb{R}^3$, we compute triangulated skin surfaces using the Skin Meshing software. It takes as input a set $B$ of spheres and constructs a mesh by maintaining a triangulation of the skin of the set of spheres $B_t$, with the time $t$ continuously increasing from minus infinity to zero. At the beginning, all spheres are imaginary, the skin is the empty surface, and the mesh is the empty complex. As time increases, the surface moves and the software updates the mesh accordingly. At time $t = 0$, we have the mesh of the skin of $B$. Figure III.19 shows a portion of this mesh for a small molecule. The image is created by slicing the surface with a plane and removing the front portion of the surface. The complete surface has genus one, and the slicing plane is chosen to cut right through the narrow part of the tunnel. The image of the mesh in Figure III.19 should be compared with the rendering of the same surface in Figure III.20. The apparent smoothness is an illusion created by Gouraud shading, which is a graphics technique that interpolates shading values computed from the normal directions at the vertices to generate the smooth impression. Note that highly curved areas detectable in Figure III.20 correspond to high density regions in Figure III.19.

Figure III.19: Cut-away view of the mesh of a small molecule of about forty atoms. Only the edges of the mesh and the cut boundary are shown.

Growing the mesh. As mentioned earlier, the mesh is constructed by maintaining it while growing the spheres. The algorithm thus reduces to executing a sequence of elementary operations. We classify the operations according to the adaptation purpose they serve.

  Shape adaptation. The growth of the spheres implies a deformation of the surface, which is facilitated by a motion of the mesh vertices in $\mathbb{R}^3$. The algorithm moves vertices normal to the surface, along the integral lines of the skin trajectory. We use edge flips to maintain the mesh as the restricted Delaunay triangulation of the moving vertices.

  Curvature adaptation. Recall that the conditions [L] and [U] given in Section III.3 guarantee that the mesh adapts its local density to the maximum normal curvature. We use edge contractions to eliminate edges that violate [L] and vertex insertions to eliminate triangles that violate [U].
  Topology adaptation. There are four types of topological changes that occur, and they correspond to the four types of generic critical points of three-dimensional Morse functions. A component is born at a minimum, a handle is created at an index-1 saddle, a tunnel is closed at an index-2 saddle, and a void is filled at a maximum. We use metamorphoses to change the mesh connectivity accordingly.

Figure III.20: Smoothly shaded rendering of the mesh in Figure III.19.

Two of the four types of metamorphoses can be seen at work in Figure III.21. From the first snapshot to the second, we see two new handles appear. Each handle creates a tunnel in the complement. From the second snapshot to the third, we see both tunnels disappear again. By closing a tunnel we also remove the handle that forms it. Observe that the surface around a handle is the same as that around a tunnel, namely a two-sheeted hyperboloid that flips over to a one-sheeted hyperboloid, or vice versa. The only difference is the reversal of inside and outside.

Figure III.21: Three snap-shots of the deforming triangulation of a molecular skin defined by continuously growing spheres. From left to center, we note two metamorphoses that each add a handle in the front. From center to right, we note a metamorphosis that closes a tunnel on the left.

Quantification. The Skin Meshing software comes with a quantification panel that displays parameters used in the meshing algorithm, provides various measurements of mesh quality, and indicates the number of operations executed during the construction. The two most important parameters are $C$, which controls the numerical approximation of the surface, and $Q$, which controls the size of the angles. The three other parameters shown in the panel control how the metamorphoses are performed. The correctness of the algorithm is guaranteed only if the inequalities referred to as Conditions (I) to (V) are all satisfied. The software permits other parameter settings since a violation of the inequalities does not necessarily imply a failure of the algorithm. In our experience, the software works fine for small violations but breaks down for moderate ones.

Figure III.22: The quantification panel of the Skin Meshing software. The quality measures do not include the special edges and triangles that facilitate topological changes and purposely violate some of the properties required for the rest of the mesh. [This panel needs to be updated to fit the text.]

Figure III.22 shows the panel after the construction of a mesh. It displays measurements of mesh quality, including size versus length scale ratios of edges and triangles and the angles inside and between triangles. Note that in Figure III.22, the ratios all lie inside the allowed interval, which is $(C/Q, CQ)$. As proved in Section III.3, the algorithm guarantees that the smallest angle inside any (non-special) triangle in the mesh is larger than $\arcsin (1/Q^2)$. For the standard setting of $Q$, the smallest angle observed in the mesh indeed exceeds this bound.

Bibliographic notes. The two-dimensional Morfi software has been developed by Ka-Po (Patrick) Lam, and is described in his master's thesis [4]. The software has been used in [2] to explain two-dimensional skin geometry and its application to deforming two-dimensional shapes into each other. The three-dimensional Skin Meshing software has been developed by Ho-Lun Cheng [1, 5]. Computer graphics techniques used in displaying shapes, including Gouraud shading, can be found in [3].
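The effect of the topological changes on a triangle mesh can be tracked with the Euler characteristic, $\chi = V - E + F$, which for a connected, closed, orientable surface equals $2 - 2g$, where $g$ is the genus. Adding a handle lowers $\chi$ by two, and closing a tunnel raises it by two. The following is a minimal sketch of that bookkeeping; the mesh representation as a list of vertex triples and the test meshes are assumptions made for the example and are unrelated to the data structures of the Skin Meshing software.

```python
def euler_characteristic(triangles):
    """V - E + F for a mesh given as triples of hashable vertex labels."""
    vertices = {v for t in triangles for v in t}
    edges = {frozenset(e)
             for (a, b, c) in triangles
             for e in ((a, b), (b, c), (a, c))}
    return len(vertices) - len(edges) + len(triangles)

def genus(triangles):
    """Genus of a connected, closed, orientable surface: chi = 2 - 2g."""
    return (2 - euler_characteristic(triangles)) // 2

def tetrahedron_boundary():
    """Boundary of a tetrahedron, a triangulated sphere."""
    return [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def torus_mesh(n, m):
    """Triangulated torus from an n-by-m grid with wrap-around (n, m >= 3)."""
    tris = []
    for i in range(n):
        for j in range(m):
            a = (i, j)
            b = ((i + 1) % n, j)
            c = (i, (j + 1) % m)
            d = ((i + 1) % n, (j + 1) % m)
            tris.append((a, b, d))  # lower triangle of the grid cell
            tris.append((a, d, c))  # upper triangle of the grid cell
    return tris

print(genus(tetrahedron_boundary()), genus(torus_mesh(6, 4)))
```

A surface of genus one, like the one in Figure III.19, has Euler characteristic zero.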
[1] H.-L. CHENG. Dynamic and Adaptive Surface Meshing under Motion. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, 2001.
[2] S.-W. CHENG, H. EDELSBRUNNER, P. FU AND K. P. LAM. Design and analysis of planar shape deformation. Comput. Geom. Theory Appl. 19 (2001), 205–218.
[3] J. FOLEY, A. VAN DAM, S. FEINER AND J. HUGHES. Computer Graphics. Principles and Practice. Second edition, Addison-Wesley, Reading, Massachusetts, 1990.
[4] K. P. LAM. Two-dimensional geometric morphing. Master's thesis, Dept. Comput. Sci., Hong Kong University of Science and Technology, 1996.
[5] Molecular Skin web-site in the software collection at biogeometry.duke.edu.
Exercises

1. Pencils of spheres. Let us extend the concept of a coaxal system of circles to three dimensions. For this purpose assume $A_1$ and $A_2$ are two spheres that are both orthogonal to the spheres $B_1$, $B_2$ and $B_3$.
   (i) Prove that every affine combination of $A_1$ and $A_2$ is orthogonal to $B_1$, $B_2$ and $B_3$.
   (ii) Prove that every affine combination of $B_1$, $B_2$ and $B_3$ is orthogonal to $A_1$ and $A_2$.
   (iii) In the light of (i) and (ii), what is the analog of a coaxal system in $\mathbb{R}^3$?

2. Curvature in the plane. Note that the curvature $\kappa$ of a molecular skin curve in $\mathbb{R}^2$ is not continuous.
   (i) Give an example illustrating that $\kappa$ is not continuous.
   (ii) Introduce a new function (perhaps similar to $\kappa$) that is continuous over the curve.

3. Total curvature. Define the total curvature of a surface $F$ as the integral of the maximum principal curvature:
   \[ K(F) = \int_{x \in F} \kappa(x) \, dx . \]
   (i) Calculate $K(F)$ for a sphere $F$.
   (ii) Calculate $K(F)$ for the portion of a double-cone within a unit-sphere around its apex.

4. Total square curvature. Define the total square curvature of a surface $F$ as the integral of the maximum principal curvature squared:
   \[ K_2(F) = \int_{x \in F} \kappa(x)^2 \, dx . \]
   (i) Calculate $K_2(F)$ for a sphere $F$.
   (ii) Let $F$ be the portion of a hyperboloid of revolution within a unit sphere around the apex. Show that $K_2(F)$ goes to infinity as the hyperboloid approaches its asymptotic double-cone.
   (iii) Prove that the number of points in a minimal $\varepsilon$-sampling of $F$ (as defined in Section III.3) is proportional to the total square curvature of $F$.

5. Something about triangles. Let $abc$ be a triangle in the plane. We write $h_a$ for the height of $a$, defined as the distance of $a$ from the closest point on the line passing through $b$ and $c$. Similarly, we write $h_b$ and $h_c$ for the heights of $b$ and $c$. Prove that the radius of the circumcircle satisfies [formula missing].
Chapter IV

Connectivity

Given a shape or a space, we can ask whether or how it is connected. Although it might not be immediately obvious what this question means, we can draw on precise definitions developed in topology to answer it. However, we need to be aware that there are perfectly well-defined and reasonable but different precise notions that correspond to the intuitive idea of connectivity. For example, for two spaces $X$ and $Y$ to be "connected the same way" could mean they are topologically equivalent ($X \approx Y$), they are homotopy equivalent ($X \simeq Y$), or they have isomorphic homology groups ($H_p(X) \cong H_p(Y)$ for all $p$). The three notions are progressively weaker:

\[ X \approx Y \;\Longrightarrow\; X \simeq Y \;\Longrightarrow\; H_p(X) \cong H_p(Y) \text{ for all } p . \]

In words, the classification of spaces by homology groups is coarser than that by homotopy equivalence, which in turn is coarser than that defined by topological equivalence. [We should stress that homology in this topological context has a precise algebraic meaning, which is in sharp contrast to how the term is used in biology (e.g. homology modeling of proteins), where it indicates a vague notion of similarity.]

Given two triangulated spaces, there is a polynomial-time algorithm that computes and compares their homology groups. If the groups are not isomorphic then we know that the two spaces are different, meaning they are neither homotopy equivalent nor topologically equivalent. However, if their homology groups are isomorphic then we still do not know whether the two spaces are the same also under the two stricter definitions of sameness. In spite of the apparent weakness, homology is the most important tool to study connectivity. In this chapter, we focus on algorithms computing the homology groups of molecules represented by space-filling diagrams. In Section IV.1, we prove that space-filling diagrams are homotopy equivalent to their dual alpha shapes, which implies the two have isomorphic homology groups. In Section IV.2, we formally define homology groups and their ranks, the Betti numbers. In Section IV.3, we describe an incremental algorithm for Betti numbers, which is fast but limited to complexes in three dimensions. In Section IV.4, we present the classic matrix algorithm for Betti numbers, which is significantly slower but not limited to three-dimensional space.

IV.1 Equivalence of Spaces
IV.2 Homology Groups
IV.3 Incremental Algorithm
IV.4 Matrix Algorithm
Exercises
IV.1 Equivalence of Spaces

The space-filling diagram of a molecule is a subset of $\mathbb{R}^3$, and with the induced subspace topology it is a topological space. We study the connectivity of this space by considering equivalence classes defined by continuous maps between spaces.

Topological spaces. Recall that a map $f: \mathbb{X} \to \mathbb{Y}$ is continuous if for every $x \in \mathbb{X}$ and every $\varepsilon > 0$ there is a $\delta > 0$ such that every $y \in \mathbb{X}$ at distance less than $\delta$ from $x$ has its image $f(y)$ at distance less than $\varepsilon$ from $f(x)$. To check whether or not $f$ is continuous, we thus have to be able to measure the distance between points in both sets. According to a more general definition, $f$ is continuous if the preimage of every open set in $\mathbb{Y}$ is open in $\mathbb{X}$. Here we only need to distinguish between open and non-open sets. This distinction is the motivation for the following definition. A topological space is a set $\mathbb{X}$ together with a system $\mathcal{X}$ of subsets of $\mathbb{X}$ such that

(i) $\emptyset \in \mathcal{X}$ and $\mathbb{X} \in \mathcal{X}$,
(ii) $\bigcup \mathcal{Z} \in \mathcal{X}$ for every subsystem $\mathcal{Z} \subseteq \mathcal{X}$, and
(iii) $\bigcap \mathcal{Z} \in \mathcal{X}$ for every finite subsystem $\mathcal{Z} \subseteq \mathcal{X}$.

The system $\mathcal{X}$ is called the topology of $\mathbb{X}$ and the sets in $\mathcal{X}$ are the open sets of $\mathbb{X}$. If $\mathbb{Y} \subseteq \mathbb{X}$, we can induce the subspace topology, which is the system $\mathcal{Y} = \{\mathbb{Y} \cap A \mid A \in \mathcal{X}\}$. The space $\mathbb{Y}$ together with the system $\mathcal{Y}$ is a topological subspace of the pair $(\mathbb{X}, \mathcal{X})$.

To get comfortable with these abstract ideas requires a number of concrete examples. Here is one. Let $\mathbb{X} = \mathbb{R}^3$ be the three-dimensional Euclidean space. An open ball is the set of points at distance less than some $r > 0$ from a fixed point, and an open set is a union of open balls. Note that the common intersection of finitely many open sets is again open, but this is not necessarily true for infinitely many open sets. For example, the common intersection of the open balls of points at distance less than $r$ from the origin, for all $r > 0$, is just the origin itself, which is not an open set. We thus see that the restriction to finite subsystems in condition (iii) is necessary. The two-dimensional sphere,

$\mathbb{S}^2 = \{x \in \mathbb{R}^3 \mid \|x\| = 1\}$,

is a subset of $\mathbb{R}^3$, and if we choose its intersections with open sets in $\mathbb{R}^3$ as the open sets in its topology, then it is a topological subspace of $\mathbb{R}^3$. Another topological subspace of $\mathbb{R}^3$ is the two-dimensional Euclidean plane, $\mathbb{R}^2$.

Topological equivalence. Now that we know what a topological space is, we can define when two are the same. A homeomorphism is a bijective map $f: \mathbb{X} \to \mathbb{Y}$ that is continuous and whose inverse is continuous. We write $\mathbb{X} \approx \mathbb{Y}$ if a homeomorphism exists and say that $\mathbb{X}$ and $\mathbb{Y}$ are homeomorphic, topologically equivalent, and that they have the same topological type. Note that the identity is a homeomorphism, the inverse of a homeomorphism is a homeomorphism, and the composition of two homeomorphisms is a homeomorphism. In other words, being homeomorphic is reflexive, symmetric and transitive, so $\approx$ is indeed an equivalence relation for topological spaces.

As suggested by Figure IV.1, there are spaces that have the same topological type and look vastly different, and there are spaces that look quite similar and do not have the same topological type.

Figure IV.1: The circle on the left is topologically equivalent to the trefoil knot in the middle, but both are not topologically equivalent to the annulus on the right.

An interesting example of a pair of non-homeomorphic spaces is the sphere and the plane. After embedding both in $\mathbb{R}^3$, we can map points from the sphere to the plane by stereographic projection from the north-pole, $N$, as illustrated in Figure IV.2. This map between $\mathbb{S}^2 - \{N\}$ and $\mathbb{R}^2$ is indeed a homeomorphism, but there is no homeomorphism between $\mathbb{S}^2$ and $\mathbb{R}^2$.

Figure IV.2: The stereographic projection maps the sphere (minus the north-pole) to the plane. The lower hemisphere maps to the shaded disk and the upper hemisphere to the complement of that disk.
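The stereographic projection mentioned in this section can be written down explicitly. The sketch below assumes the unit sphere centered at the origin, the north-pole $N = (0, 0, 1)$, and the plane $z = 0$ as the image; the function names are illustrative and not taken from the text. With these choices the lower hemisphere lands inside the unit disk, in agreement with Figure IV.2.

```python
import math

def sphere_to_plane(p):
    """Project a point p = (x, y, z) of the unit sphere, other than the
    north-pole (0, 0, 1), along the line through the north-pole onto z = 0."""
    x, y, z = p
    if math.isclose(z, 1.0):
        raise ValueError("the north-pole has no image in the plane")
    return (x / (1.0 - z), y / (1.0 - z))

def plane_to_sphere(q):
    """Inverse map: send a point q = (u, v) of the plane back to the sphere."""
    u, v = q
    s = u * u + v * v
    return (2.0 * u / (s + 1.0), 2.0 * v / (s + 1.0), (s - 1.0) / (s + 1.0))
```

Both maps are given by continuous formulas and each undoes the other, which is the content of the claim that the sphere minus its north-pole is homeomorphic to the plane.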
Homotopy equivalence. Next we introduce an equivalence relation that is less sensitive to the local dimension of spaces than topological equivalence. We begin by comparing maps between the same spaces. Two continuous maps $h, k: \mathbb{X} \to \mathbb{Y}$ are homotopic if there is a continuous map $H: \mathbb{X} \times [0,1] \to \mathbb{Y}$ with $H(x, 0) = h(x)$ and $H(x, 1) = k(x)$, for all $x \in \mathbb{X}$. We write $h \simeq k$ and call $H$ a homotopy between $h$ and $k$. This definition is illustrated in Figure IV.3. We may think of the parameter $t \in [0,1]$ as time and sweep out the image of $H$ by the images of the maps $H_t(x) = H(x, t)$. The only requirements $H$ has to satisfy are that it starts with $H_0 = h$, ends with $H_1 = k$, and that it is a continuous map. For example, $H$ is not required to be injective, which is the same as saying that the image of $H$ may be self-intersecting.

Figure IV.3: In this example, $h$ and $k$ both map the circle into three-dimensional space, and $H$ maps the circle times $[0,1]$ to the cylinder connecting the two images of the circle.

Two spaces $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent if there are continuous maps $f: \mathbb{X} \to \mathbb{Y}$ and $g: \mathbb{Y} \to \mathbb{X}$ such that $g \circ f$ is homotopic to the identity on $\mathbb{X}$ and $f \circ g$ is homotopic to the identity on $\mathbb{Y}$. We write $\mathbb{X} \simeq \mathbb{Y}$ and say that the two spaces have the same homotopy type. Note that $\simeq$ is reflexive, symmetric, and transitive and is therefore indeed an equivalence relation for topological spaces. It is easy to show that two topologically equivalent spaces are also homotopy equivalent. To see that the reverse is not true we note that the annulus in Figure IV.1 is homotopy equivalent to the circle, but the two are not topologically equivalent.

Deformation retraction. If $\mathbb{Y}$ is a topological subspace of $\mathbb{X}$ then we may prove that the two spaces are homotopy equivalent by constructing a map that retracts $\mathbb{X}$ to $\mathbb{Y}$. A deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ is a continuous map $R: \mathbb{X} \times [0,1] \to \mathbb{X}$ with

(i) $R(x, 0) = x$, for all $x \in \mathbb{X}$,
(ii) $R(x, 1) \in \mathbb{Y}$, for all $x \in \mathbb{X}$, and
(iii) $R(y, t) = y$, for all $y \in \mathbb{Y}$ and all $t \in [0,1]$.

Note that $R$ is a homotopy between $R(\cdot, 0)$, which is the identity on $\mathbb{X}$, and $R(\cdot, 1)$, which maps $\mathbb{X}$ to $\mathbb{Y}$. As illustrated in Figure IV.4, there is a deformation retraction from the double annulus to the figure-8 curve, but there is no deformation retraction to the circle. (Why not?)

Figure IV.4: The arrows indicate a deformation retraction from the double annulus to the figure-8 curve.

DEFORMATION RETRACTION LEMMA. If $R$ is a deformation retraction from $\mathbb{X}$ to $\mathbb{Y}$ then $\mathbb{X}$ and $\mathbb{Y}$ are homotopy equivalent.

PROOF. We construct maps $f: \mathbb{X} \to \mathbb{Y}$ and $g: \mathbb{Y} \to \mathbb{X}$ with the required properties. Define $f(x) = R(x, 1)$ and $g(y) = y$. Then $g \circ f$ is homotopic to the identity on $\mathbb{X}$ because $R$ is a homotopy between the two maps. Furthermore, $f \circ g$ is equal to the identity on $\mathbb{Y}$ and therefore certainly homotopic to it.

The simplest homotopy type is that of a point. A space is contractible if it is homotopy equivalent to a point. For example, a disk is contractible but a circle is not. Similarly, a ball is contractible but a sphere is not.

Decomposition into joins. We construct a deformation retraction between a union of balls and its dual complex using a decomposition into joins. In general, a join between two sets $X$ and $Y$ in some Euclidean space is the union of closed line segments that connect points in $X$ with points in $Y$,

$X * Y = \{(1-t)x + ty \mid x \in X,\ y \in Y,\ 0 \le t \le 1\}$,

and it is defined iff any two such line segments are either disjoint or meet at a common endpoint. Figure IV.5 uses two kinds of joins to decompose the difference between the union and the dual complex of a set of disks, namely triangles and disk sectors. A triangle is the join between a
point and an edge and a sector is the join between a circular arc and a vertex.

Figure IV.5: The union of disks is decomposed into the underlying space of the dual complex and two types of joins connecting that complex to the boundary of the union.

Let $B$ be a finite collection of closed balls in $\mathbb{R}^3$. We assume general position and construct a deformation retraction from the union, $\bigcup B$, to the underlying space of the dual complex, $|K|$. Recall that the boundary of $\bigcup B$ consists of sphere patches separated by circular arcs connecting corners. To be specific, we define a patch as the contribution of the sphere bounding a ball $b \in B$ to the boundary of $\bigcup B$. It does not have to be connected or simply connected. Similarly, we define an arc and a corner as the contribution of the intersection of two and of three spheres to the boundary of $\bigcup B$. An arc may be a full circle, or any number of intervals along the circle. A corner may be empty, a point, or a pair of points. The decomposition is constructed by forming the join between every patch, arc, and corner and its dual vertex, edge, and triangle. Figure IV.5 illustrates the construction in the plane. There are four corners that are point pairs, and they correspond to the four principal edges of the dual complex. (As defined in Section II.4, an edge is principal if it is not a face of any other simplex in the complex. In the Alpha Shape software, such an edge is referred to as singular.) There are also four arcs that consist of more than one component each, and they correspond to the vertices on the boundary of the dual complex that are exposed to the outside in more than one interval of directions.

Shrinking joins. We get a deformation retraction $R: \bigcup B \times [0,1] \to \bigcup B$ from $\bigcup B$ to $|K|$ by shrinking joins from outside in. Each join is the union of line segments $xy$ with $x$ on the boundary of $\bigcup B$ and $y$ on the boundary of $|K|$. We shrink $xy$ by defining $x_t = (1-t)x + ty$ and

$R(p, t) = p$ if $p \in x_t y$, and $R(p, t) = x_t$ otherwise,

for every point $p$ on the line segment $xy$. A triangle in the decomposition shrinks from its outer vertex towards the opposite edge, which belongs to the dual complex. It turns into a trapezium whose height decreases and reaches zero at time $t = 1$. A disk sector shrinks from its outer arc towards its center, which is a vertex of the dual complex. It maintains its shape while getting smaller until it reaches the size of a point. The deformation retraction is obtained by shrinking all joins simultaneously. It is illustrated in Figure IV.6, which shows the image of the retraction at time $t = 1/2$. Figure IV.7 shows an entire sequence of shapes during the deformation retraction visualized for the model of gramicidin also shown in Figure II.3.

Figure IV.6: The decomposition after shrinking the joins half way to zero.

There is a technical problem at the very beginning of the shrinking process that arises already in two dimensions. Specifically, the outer vertex of each triangle join belongs to more than one line segment and thus retracts towards more than one point of the dual complex. To finesse this difficulty, we choose $\varepsilon > 0$ and move the points differently in the time interval $[0, \varepsilon]$. In the assumed case in which $B$ is in general position, this initial motion needs to bridge the non-zero gap between the boundary of $\bigcup B$ and the boundary of the image of $R$ at time $\varepsilon$. By choosing $\varepsilon$ small, we can make the gap arbitrarily small and easy to bridge.

Figure IV.7: Six snap-shots of the deformation retraction from the union of balls representation of gramicidin to the dual complex.

Bibliographic notes. Homeomorphisms, homotopies, and deformation retractions are covered in most texts of algebraic topology, including Seifert and Threlfall [6] and Munkres [5]. Subtleties of the definitions of a topology and of a topological space are discussed in texts on general topology, including Kelley [2] and Munkres [4]. The particular deformation retraction used to prove the homotopy equivalence between a union of balls and its dual complex is taken from Edelsbrunner [1]. That equivalence can also be derived from general theorems about coverings. The Nerve Lemma says that a space is homotopy equivalent to the nerve of a finite open cover whose sets have either empty or contractible common intersections. We can turn the Voronoi cells of a union of balls into such a cover and get the homotopy equivalence result from that lemma. The history of the Nerve Lemma is complicated because different versions have been discovered independently by different people. Maybe the paper by Leray [3] is the first publication on that topic.

[1] H. EDELSBRUNNER. The union of balls and its dual shape.
[2] J. E. KELLEY. General Topology. Springer-Verlag, New York, 1955.
[3] J. LERAY. Sur la forme des espaces topologiques et sur les points fixes des représentations. J. Math. Pure Appl. 24 (1945), 95–167.
[4] J. R. MUNKRES. Topology. A First Course. Prentice Hall, Englewood Cliffs, New Jersey, 1975.
[5] J. R. MUNKRES. Elements of Algebraic Topology. Addison-Wesley, Redwood City, 1984.
[6] H. SEIFERT AND W. THRELFALL. A Textbook of Topology. Academic Press, San Diego, California, 1980.
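As a small numerical companion to the definition of a deformation retraction in this section, the sketch below checks the three defining conditions for a standard example that is not the construction of the text: the radial retraction of a planar annulus onto the unit circle, $R(x, t) = (1-t)x + t\,x/\|x\|$. The annulus radii ($1/2$ and $2$) and the function name are assumptions made only for illustration.

```python
import math

def retract(x, t):
    """Radial deformation retraction R(x, t) = (1 - t) x + t x / |x| of the
    annulus 1/2 <= |x| <= 2 (an assumed example) onto the unit circle."""
    r = math.hypot(x[0], x[1])
    scale = (1.0 - t) + t / r          # factor applied to x at time t
    return (scale * x[0], scale * x[1])

# Condition (i): at time 0 every point stays where it is.
start = retract((2.0, 0.0), 0.0)
# Condition (ii): at time 1 every point lies on the unit circle.
end = retract((0.0, -0.5), 1.0)
# Condition (iii): points of the circle never move.
fixed = retract((0.6, 0.8), 0.37)
```

By the Deformation Retraction Lemma this map witnesses that the annulus and the circle are homotopy equivalent, the example used earlier to separate homotopy equivalence from topological equivalence.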

IV.2 Homology Groups

This section introduces homology groups as an algebraic means to characterize the connectivity of a topological space. To keep the discussion reasonably elementary, we restrict it to triangulated spaces and to addition modulo 2.

Triangulations. In the preceding chapters, we have talked about triangulations in an intuitive geometric sense. In topology, the term has a precise meaning, which we now develop. A simplex $\sigma$ is the convex hull of an affinely independent point set, $\sigma = \mathrm{conv}\, S$. If $S$ has cardinality $k+1$ then $\sigma$ has dimension $k$ and is also referred to as a $k$-simplex. A face of $\sigma$ is the convex hull $\tau$ of a subset $T \subseteq S$, and we write $\tau \le \sigma$. Since $S$ has $2^{k+1}$ subsets, $\sigma$ has the same number of faces, including the empty set and $\sigma$ as its two improper faces. A simplicial complex $K$ is a finite collection of simplices with pairwise proper intersections that is closed under the face relation, that is,

(i) if $\sigma \in K$ and $\tau \le \sigma$ then $\tau \in K$, and
(ii) if $\sigma, \sigma' \in K$ then $\sigma \cap \sigma'$ is either empty or a face of both.

Recall that the underlying space of $K$ is the union of all simplices, $|K| = \bigcup_{\sigma \in K} \sigma$. A simplicial complex can be used to represent a topological space, and we have seen an example in Section II.3, where the dual complex of a space-filling diagram was used to represent a molecule. We proved in Section IV.1 that the underlying space of the dual complex is homotopy equivalent to the space-filling diagram. A topologically more accurate representation would have a homeomorphic underlying space. We thus define a triangulation of a topological space $\mathbb{X}$ as a simplicial complex $K$ whose underlying space is topologically equivalent, $|K| \approx \mathbb{X}$. The remainder of this section introduces the algebraic concepts we will use to define homology groups of triangulated spaces.

Abelian groups. A group is a set $G$ together with an associative operation $+: G \times G \to G$ for which there is a zero and an inverse for every group element. The group is abelian if the operation is commutative. Examples are the infinite group of integers with addition, $(\mathbb{Z}, +)$, and the finite cyclic group of $n$ elements, $(\mathbb{Z}_n, + \bmod n)$. A subset $H \subseteq G$ forms a subgroup if $(H, +)$ is a group. Suppose $(G, +)$ is abelian and $(H, +)$ is a subgroup. For each $x \in G$ we have the coset $x + H = \{x + y \mid y \in H\}$, and because $x + y = x + z$ implies $y = z$ there is a bijection between $H$ and each coset $x + H$. The quotient $G$ divided by $H$, denoted as $G/H$, is the collection of cosets. Addition in the quotient group is defined by

$(x + H) + (y + H) = (x + y) + H$.

We note that it does not matter which representatives we choose in computing the sum of the two cosets. The resulting coset is always the same, so addition is indeed well defined. Observe that $y \in x + H$ implies $x + H = y + H$. So if $z \in x + H$ and $z \in y + H$ then $x + H = y + H$. In words, two cosets are either disjoint or the same. If $G$ is finite this implies that all cosets have the same cardinality and $|G/H| = |G| / |H|$.

Figure IV.8: Partition of $G$ into cosets defined by $H$ for the case in which $H$ contains a quarter of the elements.

A homomorphism between groups $G$ and $G'$ is a function $f: G \to G'$ that commutes with addition, $f(x + y) = f(x) + f(y)$. The kernel of $f$ is the subset of $G$ whose elements map to $0 \in G'$, and the image is the subset of $G'$ whose elements have preimages in $G$:

$\ker f = \{x \in G \mid f(x) = 0\}$,
$\mathrm{im}\, f = \{y \in G' \mid \text{there is } x \in G \text{ with } f(x) = y\}$.

An isomorphism is a bijective homomorphism. Its kernel is the zero element of $G$ and its image is the entire $G'$.

Chain complex. Let $K$ be a simplicial complex. We construct groups by defining what it means to add sets of simplices. Call a set of $k$-simplices a $k$-chain. By definition, the sum of two $k$-chains is the symmetric difference of the two sets,

$c + d = (c \cup d) - (c \cap d)$.

This is like adding modulo 2 where $1 + 1 = 0$, since a simplex belongs to $c + d$ iff it belongs to exactly one of the two chains. Suppose $C_k$ is the set of $k$-chains and $(C_k, +)$ is the group of $k$-chains. The zero of this chain group is the empty set. We connect chain groups of different dimensions by
homomorphisms that map chains to their boundary. For this purpose we define the boundary of a $k$-simplex as the set of its $(k-1)$-faces, $\partial_k \sigma = \{\tau \le \sigma \mid \dim \tau = k-1\}$. The boundary of a chain is the sum of boundaries of its simplices, $\partial_k c = \sum_{\sigma \in c} \partial_k \sigma$. Observe that the boundary of the sum of two chains is the sum of their boundaries, $\partial_k (c + d) = \partial_k c + \partial_k d$. This assumes of course that $c$ and $d$ have the same dimension, else $c + d$ would not be defined. We thus have a boundary homomorphism $\partial_k: C_k \to C_{k-1}$ for every $k$. The sequence of chain groups connected by boundary homomorphisms is the chain complex of $K$,

$\ldots \to C_{k+1} \to C_k \to C_{k-1} \to \ldots$, with $\partial_{k+1}: C_{k+1} \to C_k$ and $\partial_k: C_k \to C_{k-1}$.

Figure IV.9 illustrates the sequence but contains information about subgroups that will be introduced shortly.

Figure IV.9: The chain complex and the groups of cycles and boundaries contained in the chain groups.

Cycles and boundaries. There are two types of chains that are particularly important to us: the ones without boundary and the ones that bound. A $k$-cycle is a $k$-chain $c$ with $\partial_k c = \emptyset$. The set of $k$-cycles is the kernel of the $k$-th boundary homomorphism, $Z_k = \ker \partial_k$. Two $k$-cycles add up to another $k$-cycle, which implies that $(Z_k, +)$ is a subgroup of $(C_k, +)$. A $k$-boundary is a $k$-chain $c$ for which there exists a $(k+1)$-chain $d$ with $c = \partial_{k+1} d$. The set of $k$-boundaries is the image of the $(k+1)$-st boundary homomorphism, $B_k = \mathrm{im}\, \partial_{k+1}$. Two $k$-boundaries add up to another $k$-boundary, which implies that $(B_k, +)$ is a subgroup of $(C_k, +)$. We prove that $(B_k, +)$ is a subgroup of $(Z_k, +)$. Equivalently, the boundary of every boundary is empty.

FUNDAMENTAL LEMMA OF HOMOLOGY. $\partial_k \partial_{k+1} d = \emptyset$ for every $(k+1)$-chain $d$.

PROOF. Note that $\partial_k \partial_{k+1} \sigma = \emptyset$ for every $(k+1)$-simplex $\sigma$. This is because every $(k-1)$-face of $\sigma$ belongs to exactly two $k$-faces of $\sigma$. The rest follows because taking boundary commutes with adding:

$\partial_k \partial_{k+1} d = \partial_k \sum_{\sigma \in d} \partial_{k+1} \sigma = \sum_{\sigma \in d} \partial_k \partial_{k+1} \sigma$,

which is the empty set, as required.

We can therefore draw the relationship between the sets of chains, cycles, and boundaries as sketched in Figure IV.9.

Homology groups. The $k$-th homology group is the quotient of the $k$-th cycle group divided by the $k$-th boundary group, $H_k = Z_k / B_k$. If $Z_k = B_k$ then $H_k$ is the trivial group consisting only of one element. The size of $H_k$ is a measure of how many $k$-cycles are not $k$-boundaries. The cosets are the elements of $H_k$ and are referred to as homology classes.

As an example consider a triangulated torus, as sketched in Figure IV.10. All 0-chains are 0-cycles and half of them are 0-boundaries, namely the ones with even cardinality. Hence $H_0 \cong \mathbb{Z}_2$. The two non-bounding 1-cycles labeled $a$ and $b$ generate a first homology group of four elements, as shown in Figure IV.10. It is isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_2$, which is the group of elements $(0,0)$, $(0,1)$, $(1,0)$, $(1,1)$ with component-wise addition modulo 2. There is only one non-empty 2-cycle, namely the set of all triangles, and no non-empty 2-boundary. Hence $H_2 \cong \mathbb{Z}_2$.

  +   |  0    a    b    a+b
  0   |  0    a    b    a+b
  a   |  a    0    a+b  b
  b   |  b    a+b  0    a
  a+b |  a+b  b    a    0

Figure IV.10: The curves $a$ and $b$ represent the homology classes $a + B_1$ and $b + B_1$, which generate the homology group $H_1$.

An important property of homology groups is that they are the same for triangulations of homeomorphic and of homotopy equivalent spaces. In particular, we get the same homology groups for different triangulations of a topological space. Similarly, the homology groups of (any triangulation of) a union of balls are the same as the homology groups of the dual complex. In other words, the homology groups are properties of the space and not artifacts of the complexes used to represent that space. Proving that this is really the case is beyond the scope of this book.
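The boundary operator with addition modulo 2 is easy to experiment with. The following is a minimal sketch, assuming a simplex is stored as a `frozenset` of vertex labels and a chain as a Python `set` of simplices, so that chain addition is the symmetric difference `^`. The function names are illustrative only. In line with the statement that all 0-chains are 0-cycles, the boundary of a vertex is taken to be empty.

```python
from itertools import combinations

def boundary_of_simplex(simplex):
    """Set of (k-1)-faces of a k-simplex given as a frozenset of vertices."""
    k = len(simplex) - 1
    if k == 0:
        return set()                   # a vertex has empty boundary
    return {frozenset(face) for face in combinations(sorted(simplex), k)}

def boundary(chain):
    """Boundary of a chain: the mod-2 sum of the boundaries of its simplices."""
    result = set()
    for simplex in chain:
        result ^= boundary_of_simplex(simplex)   # symmetric difference = sum mod 2
    return result

# The four triangles bounding a tetrahedron form a 2-cycle:
tetrahedron = frozenset({0, 1, 2, 3})
triangles = boundary({tetrahedron})
boundary_of_boundary = boundary(triangles)       # empty, as the lemma predicts
```

Each edge of the tetrahedron lies in exactly two of its triangles, so the edges cancel in pairs, which is the argument used in the proof of the Fundamental Lemma.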
Betti numbers. The most useful aspects of homology groups are their ranks, which have intuitive interpretations in terms of the connectivity of the space. The concept of a rank applies equally well to chain, cycle, boundary and homology groups. All these groups are idempotent, that is, $x + x = 0$ for every $x$. Given a subset $S$ of such a group $G$, we can form all sums of elements in $S$ and thus generate a subgroup. This operation can also be expressed in the terminology of linear algebra, where the subgroup is known as the linear hull, $\mathrm{lin}\, S$, consisting of all $\sum_i \lambda_i x_i$, with $\lambda_i \in \{0, 1\}$ and $x_i \in S$. This subset is a basis if it is minimal and generates the entire group, $\mathrm{lin}\, S = G$. Even though there is no unique basis, all bases have the same size, and because $G$ is idempotent, that size is the binary logarithm of the number of group elements. By definition, the rank of $G$ is the size of a basis: $\mathrm{rank}\, G = \log_2 |G|$. If the group is the $k$-th homology group of a space, $H_k$, the rank is known as the $k$-th Betti number of that space: $\beta_k = \mathrm{rank}\, H_k$. Since $H_k = Z_k / B_k$ we have

$\beta_k = \mathrm{rank}\, Z_k - \mathrm{rank}\, B_k$.

Revisiting the example above, we see that the Betti numbers of the torus are $\beta_0 = 1$, $\beta_1 = 2$ and $\beta_2 = 1$. The homology groups of dimensions $k \ge 3$ are all trivial and the corresponding Betti numbers are all zero. For the closed disk we have $H_0 \cong \mathbb{Z}_2$ and trivial $H_1$ and $H_2$, and therefore $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. Similarly for the two-dimensional sphere we have $\beta_0 = 1$, $\beta_1 = 0$ and $\beta_2 = 1$. As for the torus, all other Betti numbers vanish.

In general, the 0-th Betti number is the number of connected components. To see this remember that a 0-cycle bounds iff it contains an even number of vertices in each component. Note also that exactly half of the subsets of a non-empty finite set have even cardinality. If there are $p$ components and $n$ vertices then $\mathrm{rank}\, Z_0 = n$ and $\mathrm{rank}\, B_0 = n - p$. It follows that $\beta_0 = p$. Similar to $\beta_0$, the 1-st and 2-nd Betti numbers have intuitive interpretations as the number of independent non-bounding loops and the number of independent non-bounding shells.

Euler characteristic. Consider a simplicial complex $K$ and let $s_k$ be the number of its $k$-simplices. By definition, the Euler characteristic is the alternating sum of these numbers: $\chi = \sum_{k \ge 0} (-1)^k s_k$. We show that $\chi$ is also the alternating sum of Betti numbers. Note that if $f: G \to G'$ is a homomorphism, then the rank of $G$ is equal to the sum of ranks of the kernel and the image. Since $\partial_k: C_k \to C_{k-1}$ is a homomorphism, $Z_k = \ker \partial_k$, and $B_{k-1} = \mathrm{im}\, \partial_k$, we have $\mathrm{rank}\, C_k = \mathrm{rank}\, Z_k + \mathrm{rank}\, B_{k-1}$. Using corresponding lowercase letters for ranks, we rewrite this relation as $c_k = z_k + b_{k-1}$. Earlier we derived $\beta_k = z_k - b_k$. The number of $k$-simplices in the complex is also the rank of the chain group, $s_k = c_k$, hence (with $b_{-1} = 0$)

$\chi = \sum_{k \ge 0} (-1)^k c_k = \sum_{k \ge 0} (-1)^k (z_k + b_{k-1}) = \sum_{k \ge 0} (-1)^k (z_k - b_k) = \sum_{k \ge 0} (-1)^k \beta_k$.

We state this result because it is important and so we can use it for later reference.

EULER-POINCARÉ THEOREM. $\chi = \sum_{k \ge 0} (-1)^k \beta_k$.

This relation can often be used to quickly find the Euler characteristic of a space without constructing a triangulation and counting simplices. For example, the closed disk has one component, no non-bounding loop, no shell, and therefore $\chi = 1 - 0 + 0 = 1$. Similarly, the Euler characteristic of the two-dimensional sphere is $\chi = 1 - 0 + 1 = 2$ and that of the torus is $\chi = 1 - 2 + 1 = 0$. Note that this implies that the disk, the sphere and the torus are pairwise non-homeomorphic. This is hardly surprising but not easy to prove with elementary means. Indeed, two spaces with different Euler characteristics have homology groups that are different in at least one dimension. In this case, the spaces are neither homotopy equivalent nor topologically equivalent.

Bibliographic notes. Homology groups were developed at the end of the nineteenth and the beginning of the twentieth centuries. The French mathematician Henri Poincaré is usually credited with the conception of the idea [4]. He named the ranks of the homology groups after the Italian mathematician Enrico Betti, who introduced a slightly different version of the numbers years earlier. The beginning of the twentieth century witnessed parallel developments of homology groups that differed in the elements they added (simplices, cubes, general cells, ...) and the coefficient groups they used ($\mathbb{Z}$, $\mathbb{Z}_2$, $\mathbb{Q}$, $\mathbb{R}$, ...). Eventually, all this work was unified by axiomatizing the assumptions under which homology groups exist [1]. Today, homology is a general method within algebraic topology. We refer to Giblin [2] for an intuitive introduction to that area and to Munkres [3] and Rotman [5] for more comprehensive sources.

[1] S. EILENBERG AND N. STEENROD. Foundations of Algebraic Topology. Princeton Univ. Press, New Jersey, 1952.
[2] P. J. GIBLIN. Graphs, Surfaces and Homology. Chapman and Hall, London, 1981.
[3] J. R. MUNKRES. Elements of Algebraic Topology. Addison-Wesley, Redwood City, 1984.
[4] H. POINCARÉ. Complément à l'analysis situs. Rendiconti del Circolo Matematico di Palermo 13 (1899), 285–343.
[5] J. J. ROTMAN. An Introduction to Algebraic Topology.
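The relations between ranks used in this section can be checked numerically on small complexes. The sketch below is one possible way to do so and is not the algorithm of the following sections: it computes the rank of each boundary homomorphism by elimination modulo 2, with matrix rows stored as integer bitmasks, and then evaluates the Betti numbers as cycle rank minus boundary rank. The data layout (sorted vertex tuples grouped by dimension) and all names are assumptions for illustration.

```python
from itertools import combinations

def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    rows, rank = list(rows), 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot                     # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def betti_numbers(simplices_by_dim):
    """simplices_by_dim[k] lists the k-simplices as sorted vertex tuples."""
    top = len(simplices_by_dim) - 1
    boundary_rank = [0] * (top + 2)              # rank of the k-th boundary map
    for k in range(1, top + 1):
        index = {f: i for i, f in enumerate(simplices_by_dim[k - 1])}
        rows = [sum(1 << index[f] for f in combinations(s, k))
                for s in simplices_by_dim[k]]
        boundary_rank[k] = rank_gf2(rows)
    # rank of cycles is (number of k-simplices) - rank of boundary map k;
    # rank of boundaries is the rank of boundary map k+1.
    return [len(simplices_by_dim[k]) - boundary_rank[k] - boundary_rank[k + 1]
            for k in range(top + 1)]

def euler_characteristic(simplices_by_dim):
    return sum((-1) ** k * len(s) for k, s in enumerate(simplices_by_dim))

# Boundary of a tetrahedron (a triangulated sphere) and a single triangle (a disk).
SPHERE = [[(v,) for v in range(4)],
          list(combinations(range(4), 2)),
          list(combinations(range(4), 3))]
DISK = [[(0,), (1,), (2,)], [(0, 1), (0, 2), (1, 2)], [(0, 1, 2)]]
```

For the sphere this gives Betti numbers 1, 0, 1 and Euler characteristic 4 - 6 + 4 = 2, and for the disk 1, 0, 0 and 3 - 3 + 1 = 1, as the Euler-Poincaré Theorem requires.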

IV.3 Incremental Algorithm

The Betti numbers of a simplicial complex can be computed incrementally, by adding one simplex at a time. In this section, we describe the details of this algorithm, which is particularly well-suited for filtrations.

Adding a simplex. We analyze what happens to the Betti numbers when we add a simplex $\sigma$ to a complex $K$. Let $K' = K \cup \{\sigma\}$ and assume that all proper faces of $\sigma$ belong to $K$, so $K'$ is also a complex. By observing how $\sigma$ fits into $K'$, we can determine the Betti numbers $\beta'_k$ of $K'$ from the Betti numbers $\beta_k$ of $K$. In the case analysis, we mention only the Betti numbers that change.

Case $\sigma$ is a vertex. Being a vertex, $\sigma$ cannot connect to $K$ and thus forms a component by itself. Therefore, $\beta'_0 = \beta_0 + 1$.

Case $\sigma$ is an edge. There are two sub-cases depending on whether the endpoints of $\sigma$ belong to the same component or to two different components. Both cases are illustrated in Figure IV.11. In the first case, we have $\beta'_1 = \beta_1 + 1$, and in the second case $\beta'_0 = \beta_0 - 1$.

Figure IV.11: The edge $uv$ closes a loop on the left and connects two components on the right.

Case $\sigma$ is a triangle. Again we have two sub-cases, both illustrated in Figure IV.12. If $\sigma$ completes a 2-cycle then $\beta'_2 = \beta_2 + 1$. Otherwise, $\sigma$ closes a tunnel and we have $\beta'_1 = \beta_1 - 1$.

Figure IV.12: To the left, the triangle completes a surface, while to the right, it just closes a tunnel formed by the surface holes.

Case $\sigma$ is a tetrahedron. Assuming $K'$ is a complex in $\mathbb{R}^3$, it cannot have any 3-cycle. Adding $\sigma$ can therefore only turn a non-bounding 2-cycle (its boundary) into a 2-boundary. Hence, $\beta'_2 = \beta_2 - 1$.

Observe that the four cases follow one and the same rule: if $\sigma$ belongs to a non-bounding cycle in $K'$ then we increment the Betti number of the dimension of $\sigma$ and, otherwise, we decrement the Betti number of dimension one less than that of $\sigma$. This is justified by the equation $c_k = z_k + b_{k-1}$ developed in Section IV.2: adding a $k$-simplex always increments the rank of the $k$-th chain group, and it does this by either incrementing the rank of the $k$-th cycle group or that of the $(k-1)$-st boundary group.

Algorithm. To compute the Betti numbers of a complex, we form a filtration that ends with that complex:

$\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = K$.

All $K_j$ are complexes, and it is convenient to assume that any two contiguous complexes differ by only one simplex: $K_j - K_{j-1} = \{\sigma_j\}$. For example, we may sort the simplices in non-decreasing order of dimension and take all prefixes of that sequence. Alternatively, we may use the filtration of a Delaunay triangulation introduced in Section II.3. In the latter case, the filtration contains all alpha complexes and we get the Betti numbers of all of them in one sweep. The algorithm is but a simple scan along the filtration.

    integer$^3$ BETTI:
      $\beta_0 = \beta_1 = \beta_2 = 0$;
      for $j = 1$ to $m$ do
        $k = \dim \sigma_j$;
        if $\sigma_j$ belongs to a $k$-cycle in $K_j$ then $\beta_k$++
        else $\beta_{k-1}$--
        endif
      endfor;
      return $(\beta_0, \beta_1, \beta_2)$.

The only difficult part of the algorithm is deciding whether or not $\sigma_j$ belongs to a $k$-cycle. We study this problem after illustrating the algorithm for a small example.

Betti numbers of the dunce cap. The dunce cap is best created from a triangular piece of soft cloth. As illustrated in Figure IV.13, all three sides are equally long and are glued to each other with matching orientations. To run our algorithm, we need a triangulation of the dunce cap. It is not difficult to construct one, but we have to avoid pitfalls such as creating edges that share more than one endpoint and triangles that share more than one edge. A valid triangulation is shown in Figure IV.14. When we run our algorithm, we first add all vertices, then all edges,
and finally all triangles. After adding the thirteen vertices, we have $\beta_0 = 13$ and $\beta_1 = \beta_2 = 0$. The evolution of Betti numbers while adding the edges in lexicographic order is shown in Table IV.1. There are 27 triangles in the triangulation, each closing a tunnel and thus decrementing $\beta_1$. Indeed, no collection of triangles has zero boundary, which can be proved by observing that three edges belong to three triangles each and all other edges belong to two triangles each. The final result is therefore $\beta_0 = 1$ and $\beta_1 = \beta_2 = 0$. Indeed, the dunce cap is connected, all its closed curves bound, and the surface formed by the triangles does not enclose any volume in $\mathbb{R}^3$.

Figure IV.13: In the first step, we glue two sides of the triangle, thus forming a cone with a seam. In the second step, we glue the seam along the rim of the cone (not shown).

Figure IV.14: A triangulation of the dunce cap. (The thirteen vertices are labeled 1 to 9 and A to D.)

    edge       12  13  16  17  19  1A  1C  1D  23  25
    $\beta_0$  12  11  10   9   8   7   6   5   5   4
    $\beta_1$   0   0   0   0   0   0   0   0   1   1

    edge       28  29  2A  2B  2D  35  36  37  38  3B
    $\beta_0$   3   3   3   2   2   2   2   2   2   2
    $\beta_1$   1   2   3   3   4   5   6   7   8   9

    edge       3C  45  46  47  48  49  4A  4B  4C  4D
    $\beta_0$   2   1   1   1   1   1   1   1   1   1
    $\beta_1$  10  10  11  12  13  14  15  16  17  18

    edge       56  5D  67  78  89  9A  AB  BC  CD
    $\beta_0$   1   1   1   1   1   1   1   1   1
    $\beta_1$  19  20  21  22  23  24  25  26  27

Table IV.1: Evolution of $\beta_0$ and $\beta_1$ while adding the edges of the triangulation in Figure IV.14.

Classifying vertices and edges. We now return to the problem of deciding whether the addition of a simplex increases the rank of a cycle group or that of a boundary group. In the former case, we say the simplex creates, and in the latter case it destroys. All vertices create, but edges can create or destroy. For example, the edge $uv$ in Figure IV.11 creates on the left and destroys on the right. To distinguish between the two cases, we maintain the components of the complex throughout the filtration using a union-find data structure, which represents a system of pairwise disjoint sets: the elements are the vertices and the sets are components of the complex at any moment in time. The data structure supports three types of operations:

    FIND($u$): return the set that contains vertex $u$.
    UNION($A$, $B$): substitute $A \cup B$ for the sets $A$ and $B$ in the system.
    ADD($u$): add $\{u\}$ as a new singleton set to the system.

The algorithm scans the filtration from left to right and classifies each vertex and each edge as either creating or destroying:

    for $j = 1$ to $m$ do
      case $\sigma_j$ is a vertex $u$:
        $\sigma_j$ creates; ADD($u$);
      case $\sigma_j$ is an edge $uv$:
        $A$ = FIND($u$); $B$ = FIND($v$);
        if $A = B$ then $\sigma_j$ creates
        else $\sigma_j$ destroys; UNION($A$, $B$)
        endif
    endfor.

Standard implementations of the union-find data structure take barely more than constant time per operation. To be more precise, let $A(n)$ be the extremely fast growing Ackermann function. Its inverse, $\alpha(n)$, is extremely slow growing. To get a faint idea of how slow the inverse grows, we note that $\alpha(n)$ cannot be bounded from above by any constant, but it stays below a small constant unless $n$ is larger than the estimated number of electrons in the universe. Any sequence of $n$ operations takes time at most proportional to $n \alpha(n)$. For all practical purposes, this means that each operation takes only constant time.

Classifying triangles and tetrahedra. In three-dimensional Euclidean space, every tetrahedron destroys but triangles can destroy or create. Deciding whether or not a triangle belongs to a cycle is not quite as straightforward
64 IV C ONNECTIVITY
as it is for an edge. However, with an extra assumption tetrahedra, but this is exactly what compactification does
on the filtration, we can use the dual graph of the com- for us when it adds tetrahedra outside the boundary tri-
plement to classify triangles and tetrahedra the same way angles of the Delaunay triangulation. The running time
as we classified edges and vertices. The most convenient for classifying all triangles and tetrahedra is again propor-

tional to .

version of this assumption is that the last complex in the
filtration,

, is a triangulation of . Think of as the

one-point compactification of . Given a Delaunay tri-
Summary. The entire algorithm consists of three passes
angulation in , we can construct such a triangulation by
adding a dummy vertex and connecting it to all bound- over the filtration:
ary simplices of the Delaunay triangulation.
1. a forward pass to classify all vertices and edges,

In and also in , every closed surface bounds a vol- 2. a backward pass to classify all triangles and tetrahe-
ume. In other words, a triangle completes a 2-cycle dra,
iff it decomposes a component of the complement into
two. We keep track of the connectivity of the complement 3. a forward pass to compute the Betti numbers.
through its dual graph, whose nodes are the tetrahedra and
Figure IV.16 illustrates the result of the algorithm. In the
whose arcs are the triangles. Figure IV.15 illustrates this
first two passes, we maintain a union-find

data structure,
construction in two dimensions. Adding a triangle to the
which takes time proportional to . The third
pass does only a constant amount of work per step, namely
incrementing or decrementing a counter. The
total running

time is therefore at most proportional to .
Figure IV.15: A subcomplex of the Delaunay triangulation and

the dual graph of the complement. The region outside the Delau-
nay triangulation is represented by a single node.
complex effectively removes an arc from the dual graph

of the complement. Deciding whether removing an arc
splits a component is more difficult than deciding whether
adding an arc connects two components. We therefore
scan the filtration backward, from right to left:
  for $i = m$ downto $1$ do
    case $\sigma_i$ is a tetrahedron:
      $\sigma_i$ destroys, unless $i = m$, in which case it creates;
      ADD($\sigma_i$);
    case $\sigma_i$ is a triangle:
      let $\tau$ and $\upsilon$ be the tetrahedra that share $\sigma_i$;
      $a$ = FIND($\tau$); $b$ = FIND($\upsilon$);
      if $a = b$ then $\sigma_i$ destroys
      else $\sigma_i$ creates; UNION($a, b$)
      endif
  endfor.

The algorithm requires that each triangle is shared by two

Figure IV.16: The evolution of the Betti number $\beta_1$ (the number of tunnels) in the filtration of gramicidin, which is shown in Figures II.3 and II.15.

Bibliographic notes. The incremental algorithm for computing Betti numbers described in this section is taken from [2]. It exploits the fact that the connectivity of the complex determines the connectivity of the complement. This relation is a manifestation of Alexander duality, which is studied in algebraic topology [3, Chapter 3]. This algorithm has been implemented as part of the Alpha Shape software, which computes the Betti numbers of typically thousands of complexes in the filtration of a protein structure in less than a second. The key to achieving this performance is a fast implementation of the union-find data structure, namely one with running time proportional to $n \alpha(n)$ for $n$ operations. The details of such an implementation can be found in most algorithm texts, including [1, Chapter 22]. A proof that the running time cannot be improved from $n \alpha(n)$ to $n$ has been given by Tarjan [4].
[1] T. H. Cormen, C. E. Leiserson and R. L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, Massachusetts, 1990.
[2] C. J. A. Delfinado and H. Edelsbrunner. An incremental algorithm for Betti numbers of simplicial complexes on the 3-sphere. Comput. Aided Geom. Design 12 (1995), 771–784.
[3] A. Hatcher. Algebraic Topology. Cambridge Univ. Press, England, 2002.
[4] R. E. Tarjan. A class of algorithms which require nonlinear time to maintain disjoint sets. J. Comput. System Sci. 18 (1979), 110–127.
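The backward scan of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Alpha Shape implementation: the names `UnionFind` and `classify_backward` are hypothetical, simplices are represented as tuples of vertex ids, and the input is assumed to be a filtration of a triangulation of the 3-sphere, so that every triangle is shared by exactly two tetrahedra and the last simplex is a tetrahedron.

```python
from itertools import combinations


class UnionFind:
    """Disjoint sets with path compression and union by rank (hypothetical helper)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def add(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        # a and b are assumed to be distinct roots
        if self.rank[a] < self.rank[b]:
            a, b = b, a
        self.parent[b] = a
        if self.rank[a] == self.rank[b]:
            self.rank[a] += 1


def classify_backward(filtration):
    """Label each triangle and tetrahedron as 'creates' or 'destroys'.

    filtration: list of simplices (tuples of vertex ids), faces before cofaces.
    """
    simplices = [tuple(sorted(s)) for s in filtration]
    cofaces = {}                                # triangle -> tetrahedra sharing it
    for s in simplices:
        if len(s) == 4:
            for tri in combinations(s, 3):
                cofaces.setdefault(tri, []).append(s)

    uf = UnionFind()
    label = {}
    last = len(simplices) - 1
    for i in range(last, -1, -1):               # scan from right to left
        s = simplices[i]
        if len(s) == 4:                         # new node of the dual graph
            label[s] = "creates" if i == last else "destroys"
            uf.add(s)
        elif len(s) == 3:                       # new arc of the dual graph
            t1, t2 = cofaces[s]
            a, b = uf.find(t1), uf.find(t2)
            if a == b:
                label[s] = "destroys"
            else:
                label[s] = "creates"
                uf.union(a, b)
    return label


# Toy input: the boundary of a 4-simplex, which triangulates the 3-sphere.
sphere3 = [s for k in (1, 2, 3, 4) for s in combinations(range(5), k)]
labels = classify_backward(sphere3)
```

In the forward direction, a triangle labeled "creates" gives birth to a void and one labeled "destroys" closes a tunnel. Vertices and edges are left to the forward passes and are ignored here.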
IV.4 Matrix Algorithm

In this section, we develop the linear algebra view of homology and formulate a matrix algorithm for computing Betti numbers. After explaining the algorithm for addition modulo two, we extend it to integer addition.

Incidence matrices. Let $K$ be a simplicial complex with $k$-simplices $\sigma_1, \sigma_2, \ldots, \sigma_n$ and $(k-1)$-simplices $\tau_1, \tau_2, \ldots, \tau_m$. The $k$-th incidence matrix is

$$A_k = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1m} \\ a_{21} & a_{22} & \ldots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nm} \end{pmatrix},$$

where $a_{ij} = 1$ iff $\tau_j$ is a face of $\sigma_i$. Using this notation, we can write the $k$-th boundary homomorphism in matrix form:

$$\begin{pmatrix} \partial_k \sigma_1 \\ \partial_k \sigma_2 \\ \vdots \\ \partial_k \sigma_n \end{pmatrix} = A_k \begin{pmatrix} \tau_1 \\ \tau_2 \\ \vdots \\ \tau_m \end{pmatrix}.$$

Recall that the $\sigma_i$ form a basis of the $k$-th chain group, $C_k$, and similarly the $\tau_j$ form a basis of $C_{k-1}$. The above formula thus expresses the boundary of every basis element of $C_k$ as a sum of basis elements of $C_{k-1}$. To make this interpretation of the incidence matrix useful for computing Betti numbers, we need to consider more general bases. These can be generated by performing elementary row and column operations:

  exchange row $r$ with row $i$;
  add row $i$ to row $r$;
  exchange column $s$ with column $j$;
  add column $j$ to column $s$.

Exchanging two rows or columns is equivalent to re-indexing the $\sigma_i$ or $\tau_j$. Write $g_i$ for the current basis elements of $C_k$, which correspond to the rows, and $h_j$ for the current basis elements of $C_{k-1}$, which correspond to the columns. As illustrated in Figure IV.17, adding row $i$ to row $r$ has the effect of replacing $g_r$ by $g_r + g_i$. Adding column $j$ to column $s$ has the effect of replacing $h_j$ by $h_j - h_s$. (Since we deal with idempotent groups, subtraction is the same as addition.) Note that the effect is not symmetric: the basis of $C_k$ changes at the modified row, while the basis of $C_{k-1}$ changes at the modifying column.

Figure IV.17: The effect of elementary row and column operations on the bases of $C_k$ and $C_{k-1}$.

Normal form algorithm. After a few elementary row and column operations, $A_k$ is no longer the $k$-th incidence matrix, but it still describes a correspondence between bases of $C_k$ and $C_{k-1}$. The matrix is in normal form if its non-zero entries are lined up along an initial segment of the main diagonal, as illustrated in Figure IV.18. We can use Gaussian elimination to transform the incidence matrix into normal form.

  for $i = 1$ to $\min\{n, m\}$ do
    if NONZERO($i$) then
      forall rows $r > i$ do
        if $a_{ri} \neq 0$ then row $r$ = row $r$ + row $i$ endif
      endfor;
      forall columns $s > i$ do
        if $a_{is} \neq 0$ then col $s$ = col $s$ + col $i$ endif
      endfor
    endif
  endfor.

The algorithm uses a boolean function NONZERO that makes sure that during the $i$-th iteration the $i$-th diagonal entry, $a_{ii}$, is non-zero. It does this by exchanging rows and columns. The function fails to make $a_{ii}$ non-zero iff all entries in the remaining sub-matrix are zero.

  boolean NONZERO :
    while and do
      ; ; -- assume w.l.o.g. that
      col
      if col then col with row
      else find ; row
      endif
    endwhile;
    return .

We use the phrase "assume without loss of generality" as a short-form for expressing that there is another case, namely , that can be handled symmetrically. The algorithm consists of three nested loops. Letting $N = \max\{n, m\}$, the running time is therefore at most proportional to $N^3$.

Deriving the Betti numbers. Suppose we have transformed all incidence matrices of $K$ into normal form. As illustrated in Figure IV.18, the $k$-th matrix has $c_k = \mathrm{rank}\, C_k$ rows and $c_{k-1} = \mathrm{rank}\, C_{k-1}$ columns. The zero-rows correspond to $k$-cycles, of which we have $z_k = \mathrm{rank}\, Z_k$ many. It follows that the number of non-zero entries along the main diagonal is $b_{k-1} = c_k - z_k$, the rank of the $(k-1)$-st boundary group. The $k$-th Betti number is the rank of the $k$-th cycle group minus the rank of the $k$-th boundary group: $\beta_k = z_k - b_k$. We can thus derive the Betti numbers from the sizes and numbers of non-zero entries in the normal form matrices.

Figure IV.18: The normal form of the $k$-th incidence matrix. The matrix has $c_k$ rows and $c_{k-1}$ columns; the first $b_{k-1}$ diagonal entries are one and the remaining $z_k$ rows are zero.

We note that the ranks of the incidence matrices suffice for computing the Betti numbers and it is not necessary to go all the way to normal form. Either way, the running time of the algorithm is cubic in the number of simplices in the complex.

Integer coefficients. The matrix algorithm can be extended to coefficients in $\mathbb{Z}$ instead of $\mathbb{Z}_2$. Before discussing the necessary modifications, we talk about what this means in terms of adding simplices and chains. We start at the beginning.

An ordered $k$-simplex is an ordering of the vertices of a $k$-simplex, and we write $\sigma = [u_0, u_1, \ldots, u_k]$. Two ordered simplices have the same orientation if their orderings differ by an even number of transpositions. Each simplex has two orientations, except if it is a vertex, in which case it has only one. To set the stage, we give each simplex in $K$ an arbitrary but fixed orientation, and for a given oriented simplex $\sigma$, we write $-\sigma$ for the other orientation of the otherwise same simplex. A $k$-chain is a function from the $k$-simplices to the integers. It is convenient to write this function as a formal polynomial:

$$c = \sum_i a_i \sigma_i,$$

where $a_i$ is the function value of $\sigma_i$. We add two $k$-chains componentwise, by adding the coefficients of like simplices:

$$\sum_i a_i \sigma_i + \sum_i b_i \sigma_i = \sum_i (a_i + b_i) \sigma_i.$$

By definition, the boundary of $\sigma = [u_0, u_1, \ldots, u_k]$ is the alternating sum of ordered $(k-1)$-simplices obtained by dropping one vertex at a time:

$$\partial_k \sigma = \sum_{i=0}^{k} (-1)^i [u_0, \ldots, \hat{u}_i, \ldots, u_k],$$

where the hat marks the deleted vertex. We can check that the boundary is independent of the ordering, as long as it belongs to the same orientation, and that it is the negative boundary for an ordering of the opposite orientation: $\partial_k (-\sigma) = -\partial_k \sigma$. Similarly, we can check that the Fundamental Lemma of Homology still holds: $\partial_{k-1} \circ \partial_k = 0$. As before, we define the group of $k$-chains, $C_k$, the group of $k$-cycles, $Z_k$, and the group of $k$-boundaries, $B_k$. The $k$-th homology group is again $H_k = Z_k / B_k$, and the $k$-th Betti number is the rank of that homology group: $\beta_k = \mathrm{rank}\, H_k$.

Torsion. A curious new phenomenon that arises with the use of integer addition is algebraic torsion. It does not occur for spaces that can be embedded in $\mathbb{R}^3$, so it is not part of people's immediate experience. Maybe the simplest topological space whose homology groups have torsion is the Klein bottle. It can be constructed from a rectangular

Figure IV.19: A triangulated rectangular piece of paper glued to form a Klein bottle. The boundary vertices are labeled 1, 4, 5, 1 along the top and the bottom side, 2, 3 from top to bottom along the left side, and 3, 2 from top to bottom along the right side.

piece of paper by gluing opposite sides as shown in Figure

IV.19. Since it has torsion, we know that the Klein bottle cannot be embedded in $\mathbb{R}^3$, and when we draw it, we have to allow for a self-intersection. The 1-cycle marked around the neck of the bottle does not bound, but twice that 1-cycle bounds. This is what causes torsion. To describe the phenomenon more generally, we need the fact that every finitely generated abelian group is isomorphic to a direct sum (Cartesian product) of copies of $\mathbb{Z}$ and of cyclic groups:

$$G \simeq \mathbb{Z}^{\beta} \oplus \mathbb{Z}_{t_1} \oplus \mathbb{Z}_{t_2} \oplus \ldots \oplus \mathbb{Z}_{t_j}.$$

Furthermore, we may require that all $t_i$ are larger than one and that $t_i$ divides $t_{i+1}$ for each $i$. This extra condition fixes $\beta$ and the indices $t_i$. The abelian group is thus the direct sum of a free subgroup, namely $\mathbb{Z}^{\beta}$, and the rest, namely $\mathbb{Z}_{t_1} \oplus \ldots \oplus \mathbb{Z}_{t_j}$, which is referred to as its torsion subgroup. The $t_i$ are the torsion coefficients. The rank of the group is the number of copies of $\mathbb{Z}$, which is $\beta$. For the Klein bottle, we have $H_0 \simeq \mathbb{Z}_2$, $H_1 \simeq \mathbb{Z}_2 \oplus \mathbb{Z}_2$ and $H_2 \simeq \mathbb{Z}_2$ for addition modulo 2 and $H_0 \simeq \mathbb{Z}$, $H_1 \simeq \mathbb{Z} \oplus \mathbb{Z}_2$ and $H_2 \simeq 0$ for integer addition. We thus get different Betti numbers for addition modulo 2 and for integer addition, but their alternating sums are both equal to the Euler characteristic: $\chi = 1 - 2 + 1 = 1 - 1 + 0 = 0$. Indeed, the Euler-Poincaré Theorem is true independent of the type of coefficients we choose to define homology groups and Betti numbers.

Algorithm revisited. The normal form of a bases transition matrix is the same as before, except that we now allow entries in the main diagonal that are neither zero nor one. Specifically, the initial sequence of ones is followed by integers $t_1, t_2, \ldots, t_j$, all larger than one, such that $t_i$ divides $t_{i+1}$, for each $i$. We modify the above algorithm to transform the incidence matrix into normal form. First we extend the elementary row and column operations by allowing the addition of an integer multiple of one row or column to another and the multiplication of entire rows or columns by $-1$. A more substantial modification is needed within the function NONZERO, which now attempts to turn the next diagonal entry, $a_{ii}$, into the smallest positive entry achievable by row and column operations. Unless the entire remaining sub-matrix is zero, this attempt will be successful and $a_{ii}$ will divide every entry in the sub-matrix. To see this property, assume there is an entry $a_{rs}$, with $r, s > i$, that is not an integer multiple of $a_{ii}$:

$$\begin{pmatrix} a_{ii} & \ldots & a_{is} & \ldots \\ \vdots & & \vdots & \\ a_{ri} & \ldots & a_{rs} & \ldots \\ \vdots & & \vdots & \end{pmatrix}$$

If $a_{ii}$ does not divide $a_{is}$, we get a positive integer smaller than $a_{ii}$ in a single column operation. Symmetrically, if $a_{ii}$ does not divide $a_{ri}$, we get such a positive integer in a single row operation. Otherwise, we may assume that $a_{ii}$ divides both $a_{is}$ and $a_{ri}$, and we can make $a_{ri}$ zero with a row operation. By adding row $r$ to row $i$ we keep $a_{ii}$ unchanged and we change $a_{is}$ to $a_{is} + a_{rs}$, which is not an integer multiple of $a_{ii}$. Now we get a positive integer smaller than $a_{ii}$ in a single column operation, as before. Since $a_{ii}$ divides every entry in the remaining sub-matrix, it will also divide the future non-zero diagonal entries. Hence, the algorithm generates the torsion coefficients with the required properties.

The running time of the algorithm is no longer guaranteed to be at most cubic in the number of simplices. Indeed, the sequence of operations is sensitive to the size of the integers that arise, and it is not even clear whether or not it is polynomial in the input size. As for $\mathbb{Z}_2$ coefficients, we can determine the homology groups directly from the normal forms of all incidence matrices. We get the rank of the $k$-th homology group from the $k$-th and the $(k+1)$-st normal form matrices: $\beta_k = z_k - b_k$. We get the torsion coefficients from the $(k+1)$-st normal form matrix: they are the diagonal entries that exceed one.

Bibliographic notes. The matrix algorithm presented in this section is taken from [2, Chapter 1]. The normal form it uses is sometimes referred to as the Smith normal form [3], and similarly, the algorithm is sometimes called the Smith normal form algorithm. For integer coefficients, it is unclear whether or not its running time is polynomial in the input size. However, it is possible to modify the algorithm to guarantee polynomial running time [1, 4]. The Betti numbers obtained for $\mathbb{Z}_2$ and $\mathbb{Z}$ (or other coefficient groups) are not necessarily the same, but their differences are predictable and described by the Universal Coefficient Theorem of Homology [2, Chapter 7].

[1] R. Kannan and A. Bachem. Polynomial algorithms for computing the Smith and Hermite normal forms of an integer matrix. SIAM J. Comput. 8 (1979), 499–507.
[2] J. R. Munkres. Elements of Algebraic Topology. Addison-Wesley, Redwood City, California, 1984.
[3] H. J. Smith. On systems of indeterminate equations and congruences. Philos. Trans. 151 (1861), 293–326.
[4] A. Storjohann. Near optimal algorithm for computing Smith normal forms of integer matrices. In "Proc. Internat. Sympos. Symbol. Algebraic Comput., 1997", 267–274.
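The matrix algorithm for addition modulo two can be sketched as follows. This is a minimal illustration rather than the implementation referred to in the text: the function names are hypothetical, simplices are tuples of vertex ids, rows correspond to $k$-simplices and columns to $(k-1)$-simplices, and the helper `nonzero` uses a plain search of the remaining sub-matrix, which is an assumption about how the pivot is found.

```python
from itertools import combinations


def incidence_matrix(rows, cols):
    """Entry is 1 iff the column simplex is a face of the row simplex."""
    return [[1 if set(t) <= set(s) else 0 for t in cols] for s in rows]


def nonzero(A, i):
    """Move a non-zero entry of the remaining sub-matrix to position (i, i)."""
    for r in range(i, len(A)):
        for s in range(i, len(A[0])):
            if A[r][s]:
                A[i], A[r] = A[r], A[i]              # exchange rows i and r
                for row in A:                        # exchange columns i and s
                    row[i], row[s] = row[s], row[i]
                return True
    return False                                     # remaining sub-matrix is zero


def normal_form_rank(A):
    """Reduce A modulo 2 in place; return the number of non-zero diagonal entries."""
    if not A or not A[0]:
        return 0
    rank = 0
    for i in range(min(len(A), len(A[0]))):
        if not nonzero(A, i):
            break
        rank += 1
        for r in range(i + 1, len(A)):               # clear column i below the diagonal
            if A[r][i]:
                A[r] = [(x + y) % 2 for x, y in zip(A[r], A[i])]
        for s in range(i + 1, len(A[0])):            # clear row i right of the diagonal
            if A[i][s]:
                for row in A:
                    row[s] = (row[s] + row[i]) % 2
    return rank


def betti_numbers(simplices, max_dim):
    """Betti numbers modulo 2: beta_k = z_k - b_k = c_k - rank_k - rank_{k+1}."""
    by_dim = [[s for s in simplices if len(s) == k + 1] for k in range(max_dim + 1)]
    rank = [0] * (max_dim + 2)                       # rank[k] = b_{k-1}, rank[0] = 0
    for k in range(1, max_dim + 1):
        rank[k] = normal_form_rank(incidence_matrix(by_dim[k], by_dim[k - 1]))
    return [len(by_dim[k]) - rank[k] - rank[k + 1] for k in range(max_dim + 1)]


# Boundary of a tetrahedron, a triangulated sphere.
sphere = [s for k in (1, 2, 3) for s in combinations(range(4), k)]
print(betti_numbers(sphere, 2))
```

As noted in the text, only the ranks of the incidence matrices are needed, so the sketch returns the count of non-zero diagonal entries and discards the reduced matrix.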

Exercises

1. Equivalence classes. Consider the following topological spaces: a circle, a trefoil knot, a Möbius strip, a sphere with north-pole and south-pole removed, and a plane with origin removed.
  (i) Partition the collection into classes of same topological type.
  (ii) Partition the collection into classes of same homotopy type.

2. Amino acids. Take the graphs drawn in Figures I.8 and I.9 as definitions of the amino acids as (one-dimensional) topological spaces. Here an atom is a vertex and a bond is an edge, no matter whether or not it has (partial) double bond character.
  (i) Are there any two amino acids with isomorphic graphs? If yes, which ones?
  (ii) Calculate the Betti numbers and Euler characteristics of the graphs.
  (iii) Partition the collection of graphs into classes of the same homotopy type.

3. Joins and simplices. A tetrahedron can be defined as the join of two skew line segments in space. The halfway plane is parallel to both line segments and lies exactly halfway between them. Since the line segments are skew, the halfway plane separates the two line segments.
  (i) Show that the halfway plane intersects the tetrahedron in a parallelogram.
  (ii) Decomposing the line segments into $m$ and $n$ pieces implies a decomposition of the tetrahedron into $mn$ joins, which are smaller tetrahedra. Draw the decomposition and highlight the intersection with the halfway plane.

4. Stars and links. Let $K$ be the dual complex of a finite collection of balls in $\mathbb{R}^3$. Define the star of a vertex $u$ as the collection of simplices that contain $u$, and the link as the collection of faces of simplices in the star that do not belong to the star:
  $$\mathrm{St}\, u = \{\sigma \in K \mid u \in \sigma\}, \qquad \mathrm{Lk}\, u = \{\tau \le \sigma \mid \sigma \in \mathrm{St}\, u,\ \tau \notin \mathrm{St}\, u\}.$$
  (i) Show that $\mathrm{Lk}\, u$ is a complex, that is, every face of a simplex in the link also belongs to the link.
  (ii) Assume $u$ is the center of the ball $b$. The sphere bounding $b$ intersects all other balls in caps. Show that $\mathrm{Lk}\, u$ is isomorphic to the dual complex of that collection of caps.

5. Torus and projective plane. Take a rectangular piece of paper and orient the left and right sides from top to bottom and the top and bottom sides from left to right. You get a torus if you glue the left side to the right side and the top side to the bottom side, each time with matching orientations. You get a projective plane if you glue again the left to the right and the top to the bottom sides but now with opposing orientations.
  (i) Triangulate the rectangle such that you get a valid triangulation for both ways of gluing its sides.
  (ii) Compute the Betti numbers of the torus and the projective plane by running either the incremental or the matrix algorithm (by hand) on your triangulations.

6. Simple graphs. A simple graph is a simplicial complex that consists of vertices and edges but has no triangles or higher-dimensional simplices. Let $n$ be the number of vertices and $m$ the number of edges. Use the language of homology groups to re-confirm the following formulas, which are well-known for simple graphs:
  (i) $m = n - 1$ if the graph is a tree.
  (ii) $m \ge n - 1$ if the graph is connected.
  (iii) in general.

7. Protein structure. Download a protein structure from the pdb database and use the Alpha Shape software to compute the Betti numbers of its van der Waals and its solvent accessible diagrams.
Chapter V
Shape Features

The topological analysis of spaces, as discussed in Chapter IV, is an important first step, but by itself is insufficient to appropriately characterize the shape of protein structures. To decide what is appropriate, we need to have a purpose. The goal we have in mind is understanding how proteins interact with each other and with other molecules. There is overwhelming evidence that interesting events in such interactions happen preferably in cavities, which are partially protected regions in the protein or molecular assembly, and that local shape complementarity plays a significant role in making such events happen. It appears that organic life is based on computations performed by dynamically matching the (changing) pieces of a three-dimensional puzzle. A statement like this needs to be accompanied by a series of disclaimers: not every interaction is based on shape complementarity; interactions that are based on shape complementarity are not entirely so; and the relevant shape complementarity is local and imperfect. In other words, the situation is hopelessly complicated.

Our goal in this chapter is to introduce mathematical and computational methods that allow us to start talking about the real problem in more precise terms. We do this by introducing three essentially new concepts. In Section V.1, we make an attempt to give a precise meaning to cavities in proteins. The main idea here is to combine the topological concept of a hole with a minimum amount of geometric information, and this information is the evolution of the shape under growth. In Section V.2, we return to homology groups and introduce the concept of topological persistence. It is a measure of how important a topological feature is during the evolution. We see this as a tool to cope with imperfections as it permits us to distinguish topological features from topological noise. In Section V.3, we make an attempt to give a precise meaning to interfaces between interacting molecules. We define an interface as a two-dimensional sheet separating the molecules. While this idea seems simple enough, the details are tricky and require that we use what we learned about pockets and topological persistence. Finally, in Section V.4, we illustrate the concepts using the Alpha Shape software and extensions.

V.1 Pockets
V.2 Topological Persistence
V.3 Molecular Interfaces
V.4 Software for Shape Features
Exercises
V.1 Pockets

In this section, we formalize the idea of a cavity in a protein by introducing the concept of a pocket in a space-filling diagram.

Voids. The simplest type of pocket is a void, which we define as a bounded connected component of the complement. Suppose, for example, that $B$ is a finite collection of closed balls in $\mathbb{R}^3$ and $F = \bigcup B$ is the space-filling representation of a molecule. Since $B$ is finite, the balls cannot cover the entire space, which implies that the complement, $\mathbb{R}^3 - F$, consists of one or more connected components. Exactly one component is unbounded (infinitely large), and all other components are voids. See Figure V.1 for an illustration of the definition in two dimensions.

Figure V.1: The union of disks has a single (shaded) void. The corresponding void in the dual complex consists of five triangles.

Recall that in Chapter II, we described a deformation retraction from the space-filling diagram, $F$, to the dual complex, $K$. The plain existence of that retraction implies that for each void in $F$ we have a void in $K$ that contains the void in $F$. Indeed, we can reverse the deformation retraction to show that the two voids have the same homotopy type. Since the dual complex is a subcomplex of the Delaunay triangulation, $D$, we may think of each void in $K$ as a collection of tetrahedra, $V \subseteq D - K$. The boundary $\partial V$ is a collection of triangles in $K$. This collection bounds in $D$ but not in $K$. It follows that $\partial V$ represents a homology class in the second homology group of $K$. Indeed, the boundaries of the voids form a basis of that homology group. Hence, $\beta_2$ is the number of voids in $K$, which is the same as the number of voids in $F$.

Definition of pockets. A pocket generalizes the concept of a void by relaxing the requirement that it be disconnected from infinity. All we require is that a pocket be wider on the inside than at possible entrances from the outside. To make this idea concrete, we grow the space-filling diagram and observe how it changes: the relatively narrow entrances close before the inside disappears. In other words, a pocket is a maximal portion of space outside the space-filling diagram that turns into a void before it is subsumed by the growing diagram. To formalize this intuition, we need to settle on a growth model. It is convenient to use the one that gave rise to the sequence of alpha complexes, but we should keep in mind that this choice does affect what we do and do not call a pocket. According to this model, the center of the ball $b_i = (z_i, r_i)$ in $B$ remains fixed and the radius at time $\alpha^2$ is equal to the square root of $r_i^2 + \alpha^2$. We may think of the growth as pushing the points on the boundary of the space-filling diagram outwards, in the direction normal to the surface. Figure V.2 illustrates this view in two dimensions.

Figure V.2: The growing disks push the points on the boundary outwards, in normal direction. Following the vectors, the points in the shaded region have paths that end at Voronoi vertices.

In the interior of the Voronoi cells, the vector field is defined by the sweeping spheres. We extend it to the rest of space by using the circles that sweep out the Voronoi polygons and the intervals that sweep out the Voronoi edges. Starting at a point outside the space-filling diagram, we follow vectors and thus form a path that may or may not go to infinity. We define a pocket as a connected component of the set of points $x \in \mathbb{R}^3 - F$ whose paths do not go to infinity. The points that flow to infinity form a single component, which we refer to as the outside. Each pocket is open where it borders the space-filling diagram and closed where it borders the outside. The latter set of points may formally be defined as the intersection of the pocket with the closure of the outside. Its connected components are open two-dimensional sets, which we refer to as the mouths of the pocket. Note that voids are pockets without mouths.
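The definition of a void as a bounded component of the complement can be illustrated on a discretized two-dimensional picture. The sketch below is not part of the method described here: the grid representation and the name `count_voids` are assumptions, cells with value 1 stand for the space-filling diagram, and the grid is assumed to be padded so that every empty cell on its border belongs to the unbounded component.

```python
from collections import deque


def count_voids(grid):
    """Count bounded connected components of the empty cells (value 0)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r0, c0):
        """Visit one component of empty cells; report whether it reaches the border."""
        queue = deque([(r0, c0)])
        seen[r0][c0] = True
        touches_border = False
        while queue:
            r, c = queue.popleft()
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and grid[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return touches_border

    voids = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 and not seen[r][c] and not flood(r, c):
                voids += 1
    return voids


closed_ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
open_ring = [row[:] for row in closed_ring]
open_ring[2][3] = 0          # an entrance connects the inside to the outside
print(count_voids(closed_ring), count_voids(open_ring))
```

The second grid shows why voids alone are too restrictive: once the ring has an entrance, the enclosed region is no longer a void, although it may still qualify as a pocket under the growth model.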
Evolution of dual complex. Similar to voids, we may associate a pocket of the space-filling diagram with a pocket of the dual complex. The latter is defined combinatorially, again by observing how the space-filling diagram changes as it grows. The dual complex changes only at discrete moments, namely when the space-filling diagram encounters a new vertex, edge, polygon or cell of the Voronoi diagram. There are ten cases distinguished by the dimension of the dual Delaunay simplex, $\sigma$, and the relative position of its orthocenter, $z$. We recall that $z$ is the point at which the affine hull of $\sigma$ intersects the affine hull of its dual in the Voronoi diagram.

Case $M_0$: $\sigma = z_i$ is a vertex and the orthocenter lies in the interior of the corresponding Voronoi cell, $V_i$. This cell is encountered at time $\alpha^2 = -r_i^2$, which is the moment when the $i$-th ball changes from imaginary to real radius.

Case 1: $\sigma$ is an edge and $z$ lies in the interior of the corresponding Voronoi polygon. There are two generic sub-cases, both illustrated in Figure V.3.

  Case $M_1$: $z \in \sigma$. The two balls approach the Voronoi polygon from both sides, eventually touching it at $z$.
  Case $C_1$: $z \notin \sigma$. The two balls approach the polygon from the same side. At the moment they touch, the smaller ball breaks through the outer sphere and starts sweeping out the Voronoi cell on the other side of the polygon.

Figure V.3: The vertical lines are side views of polygons in space. The solid dot marks the orthocenter of the Delaunay edge. On the left (Case $M_1$), this edge intersects its dual Voronoi polygon, while on the right (Case $C_1$), it lies on one side of the polygon.

Case 2: $\sigma$ is a triangle and $z$ lies in the interior of the corresponding Voronoi edge. There are three generic sub-cases, all illustrated in Figure V.4.

  Case $M_2$: $z \in \sigma$. The three balls completely surround the Voronoi edge before they touch at $z$.
  Case $C_2$: $z \notin \sigma$. Here we have two sub-cases depending on whether $z$ sees one or two edges from the outside. In the first case, the three balls touch the Voronoi edge at the same moment they encounter the Voronoi polygon dual to the visible edge. In the second case, the balls touch the edge at the same moment they encounter the two polygons and one cell dual to the two visible edges and the vertex they share.

Figure V.4: The thin solid lines represent polygons that meet along a common edge in space. That edge appears as a solid dot, which marks the orthocenter of the triangle. From left to right, the orthocenter lies inside the triangle (Case $M_2$), lies outside and sees one edge (Case $C_2$), lies outside and sees two edges and their shared vertex (Case $C_2$).

Case 3: $\sigma$ is a tetrahedron. Its orthocenter is necessarily the corresponding Voronoi vertex.

  Case $M_3$: $z \in \sigma$. The four balls completely surround the Voronoi vertex before they reach it.
  Case $C_3$: $z \notin \sigma$. Here we have three sub-cases depending on whether $z$ sees one, two or three triangles from the outside. The four balls touch the Voronoi vertex at the same moment they touch the Voronoi edges, polygons and cells that correspond to the triangles, edges and vertices visible from $z$.

In Case $C_1$ and in the last sub-case each of Cases $C_2$ and $C_3$, $z$ sees a vertex of $\sigma$ from the outside. Assuming $z$ lies outside the space-filling diagram, this is only possible if the ball centered at that vertex is contained inside the union of the balls centered at the other vertices of $\sigma$. This is unlikely to happen for molecular data and usually indicates a measurement or modeling mistake.

Metamorphoses and collapses. In four of the ten cases, only one simplex is added to the dual complex, namely in

Cases $M_0$, $M_1$, $M_2$ and $M_3$. Consistent with the discussion in Chapter III, we call these operations metamorphoses, since they change the homotopy type. We will see shortly that the remaining six cases do not affect the homotopy type. They can be understood as inverses of the six types of collapses illustrated in Figure V.5.

Figure V.5: From left to right, top to bottom: collapsing a tetrahedron from a triangle, an edge and a vertex (23-collapse, 13-collapse, 03-collapse), collapsing a triangle from an edge and a vertex (12-collapse, 02-collapse), and collapsing an edge from a vertex (01-collapse). In each case, the collapse removes the tetrahedron, the transparent triangles, the dashed edges, and the dotted vertices, if any.

Recall that a principal simplex is not a face of any other simplex in the complex. A proper face $\tau$ of a principal simplex $\sigma$ is free if all simplices that contain $\tau$ are faces of $\sigma$. Such a pair $(\tau, \sigma)$ defines a collapse, which is the operation that removes all simplices between and including $\tau$ and $\sigma$. Formally, the complex obtained from $K$ by collapsing the pair $(\tau, \sigma)$ is $K - \{\upsilon \mid \tau \le \upsilon \le \sigma\}$. It is convenient to specify the type using the dimensions $k = \dim \tau$ and $\ell = \dim \sigma$ and to talk about $(k, \ell)$-collapses, for $0 \le k < \ell \le 3$. With this notation, the changes in the dual complex described in Case $C_\ell$ are caused by inverses of $(k, \ell)$-collapses, for $0 \le k < \ell$.

Each collapse can be realized as a deformation retraction that pushes a portion of $\sigma$'s boundary through $\sigma$ toward the remaining portion of the boundary. In the process, the retraction removes $\sigma$ and all faces of $\sigma$ that contain $\tau$. Being a deformation retraction, the operation does not affect the homotopy type of the complex, and neither does its inverse.

Partial order. Using the classification into ten different operations, we may introduce a partial order on the Delaunay simplices, which we think of as a discretization of the flow along normal vectors. We are only interested in tetrahedra. As noted in Case $C_3$, if the orthocenter $z$ of a Delaunay tetrahedron $\sigma$ lies outside $\sigma$ then it sees either one, two or three of the triangles. For each triangle visible from $z$, we define $\sigma \prec \tau$, where $\tau$ is the tetrahedron on the other side of the shared triangle. To cover the case in which the triangle lies on the boundary of the Delaunay triangulation, we introduce a dummy tetrahedron, $\omega$, that represents the space outside the triangulation. By definition, its orthocenter is at infinity, so $\omega$ can only be a successor but not a predecessor of other tetrahedra. This is what we call a sink of the relation. The other sinks are the tetrahedra that contain their orthocenters; they define metamorphoses in the evolution of the dual complex.

Note that $\sigma \prec \tau$ implies that the square radius of the orthosphere of $\sigma$ is less than that of the orthosphere of $\tau$. If $\tau = \omega$, this is true because the orthoradius of $\omega$ is infinity, by definition. If $\sigma$ and $\tau$ are both (finite) Delaunay tetrahedra, this is true because their orthocenters are Voronoi vertices that lie on the same side of the plane separating $\sigma$ and $\tau$. As illustrated in Figure V.6, the two orthospheres intersect in a circle that lies in the separating plane and the orthocenter of $\tau$ is further from that plane than the orthocenter of $\sigma$. This implies that the square radius increases along every chain of the relation. Hence, $\prec$ is acyclic and its transitive closure is a partial order.

Figure V.6: Think of the triangles as projections of tetrahedra and the circles as projections of spheres. The centers of both (dotted) orthospheres lie on the right of the separating plane.

Pockets of dual complex. We are now ready to define and compute the pockets of the dual complex using the partial order over the tetrahedra. The ancestor set of a tetrahedron $\tau$ contains $\tau$, its predecessors, the predecessors of the predecessors, and so on:

$$A(\tau) = \{\tau\} \cup \bigcup_{\sigma \prec \tau} A(\sigma).$$

We have seen that a tetrahedron can have more than one successor. It is also possible that it belongs to more than one ancestor set, although this is not the common case. The pockets in the dual complex are defined by the tetrahedra that neither belong to the dual complex nor to the ancestor set of $\omega$. Note that this is more conservative than collecting all tetrahedra outside $K$ that belong to ancestor sets of finite sinks. We compute the pockets in two steps:

Step 1. Collect the tetrahedra in $D - K - A(\omega)$.
Step 2. Partition this collection into components.

To collect the tetrahedra, we assume the Delaunay simplices are given in a list ordered by birth-time. As illustrated in Figure V.7, the relation over the tetrahedra is acyclic and goes monotonically from left to right. We mark the tetrahedra in the dual complex, which form a prefix of the sub-list of tetrahedra. Next, we mark the tetrahedra in the ancestor set of $\omega$ by searching backward from $\omega$ along the pairs of the relation. To complete Step 1, we now collect all unmarked tetrahedra in a single scan through the list. See Figure V.8 for a two-dimensional illustration. The resulting collection contains the tetrahedra of all pockets. Call two tetrahedra in this collection adjacent if they share a triangle that is not in the dual complex. Based on this adjacency information, we can compute the connected components using standard graph algorithms, such as depth-first search or union-find.

Figure V.7: Ordered list of simplices with relation over the tetrahedra indicated by arrows. The simplices of $K$ form a prefix of the list and the dummy tetrahedron $\omega$ comes last.

Figure V.8: The eight disks form one pocket, which connects to the outside along one mouth. The corresponding pocket in the dual complex consists of four triangles and a single mouth edge.

Computing mouths is similar to computing pockets, only one dimension lower.

Step 1. Collect the boundary triangles not in $K$.
Step 2. Partition this collection into components.

We may do the computation for individual pockets or for all pockets at once. In Step 1, we collect the triangles in $D - K$ that belong to exactly one pocket tetrahedron. In Step 2, we call two triangles adjacent if they share an edge that does not belong to $K$. Finally, we use the same standard graph algorithms to compute components.

Bibliographic notes. The importance of cavities in drug design and discovery has been known for a while [4]. The formalization as pockets introduced in this section has been described in [3] and implemented as part of the Alpha Shapes software. The definition of a pocket is not purely topological and requires a crucial geometric component, namely the growth model of the input balls. This growth model forms the basis of the partial order over the Delaunay tetrahedra. An extension to include simplices of all dimensions has been used for reconstructing the surface of scanned point sets [2] and might have further applications in the analysis of protein shape.

In everyday language we barely make any difference between pockets and other holes, such as the ones counted by the Betti numbers. This has also been noticed by the philosophers Casati and Varzi [1], who introduce a concept they call a hollow which is similar at least in spirit to our formal notion of a normal pocket.

[1] R. Casati and A. C. Varzi. Holes and Other Superficialities. MIT Press, Cambridge, Massachusetts, 1994.
[2] H. Edelsbrunner. Surface reconstruction by wrapping finite sets in space. Discrete and Computational Geometry: The Goodman-Pollack Festschrift, eds. B. Aronov, S. Basu, J. Pach and M. Sharir, Springer-Verlag, Berlin, to appear.
[3] H. Edelsbrunner, M. A. Facello and J. Liang. On the definition and the construction of pockets in macromolecules. Discrete Appl. Math. 88 (1998), 83–102.
[4] I. D. Kuntz. Structure-based strategies for drug design and discovery. Science 257 (1992), 1078–1082.
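The two steps for computing pockets in the dual complex can be sketched as follows. This is an illustration under assumptions, not the Alpha Shapes code: the function name `compute_pockets` and the input encoding are hypothetical. The relation is given as pairs (sigma, tau) meaning that tau is a successor of sigma, the dummy tetrahedron is the string "omega", and each shared triangle is given as a triple (sigma, tau, in_complex) that records whether the triangle belongs to the dual complex.

```python
def compute_pockets(tetrahedra, dual_complex, relation, shared_triangles, omega="omega"):
    """Return the pockets of the dual complex as lists of tetrahedra."""
    # predecessors[tau] lists every sigma that has tau as a successor
    predecessors = {}
    for sigma, tau in relation:
        predecessors.setdefault(tau, []).append(sigma)

    # Step 1: mark the dual complex and the ancestor set of omega,
    # found by searching backward from omega along the relation.
    ancestors = {omega}
    stack = [omega]
    while stack:
        tau = stack.pop()
        for sigma in predecessors.get(tau, []):
            if sigma not in ancestors:
                ancestors.add(sigma)
                stack.append(sigma)
    marked = set(dual_complex) | ancestors
    collection = [t for t in tetrahedra if t not in marked]

    # Step 2: two collected tetrahedra are adjacent if they share
    # a triangle that is not in the dual complex.
    adjacent = {t: [] for t in collection}
    for sigma, tau, in_complex in shared_triangles:
        if not in_complex and sigma in adjacent and tau in adjacent:
            adjacent[sigma].append(tau)
            adjacent[tau].append(sigma)

    pockets, seen = [], set()
    for t in collection:                      # depth-first search per component
        if t in seen:
            continue
        seen.add(t)
        component, stack = [], [t]
        while stack:
            u = stack.pop()
            component.append(u)
            for v in adjacent[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        pockets.append(sorted(component))
    return pockets


# Abstract toy input: "a" is in the dual complex, "d" and "e" flow to omega,
# "c" and "g" are finite sinks.
tets = ["a", "b", "c", "d", "e", "f", "g"]
flow = [("b", "c"), ("e", "d"), ("d", "omega"), ("f", "g")]
triangles = [("a", "b", True), ("b", "c", False), ("c", "d", False), ("f", "g", False)]
print(compute_pockets(tets, {"a"}, flow, triangles))
```

A tetrahedron that belongs both to the ancestor set of omega and to the ancestor set of a finite sink is excluded, which matches the conservative definition given above. Mouths could be computed by the same pattern one dimension lower.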
V.2 Topological Persistence

In this section, we measure the life-time or persistence of a topological feature in an evolving topological space. The measure can be used to distinguish between pockets with relatively wide and narrow entrances, and it is essential in the definition of molecular interfaces discussed in the next section.

The intuition. A prime example of an evolving topological space is a space-filling diagram that grows in the way discussed in the preceding section. As before, we write $\emptyset = K_0 \subset K_1 \subset \ldots \subset K_m = D$ for the corresponding filtration. The $K_j$ are the complexes that arise during the evolution and, in the generic case, any two contiguous complexes differ either by a metamorphosis or an anti-collapse. Each anti-collapse may be viewed as a sequence of metamorphoses in which the later simplices destroy the topological features created by the earlier simplices. For example, a 23-anti-collapse consists of a triangle creating a void and a tetrahedron filling the same. The life-time of this void is zero because the triangle and the tetrahedron are added at the same moment. We will see that even if a triangle and a tetrahedron are added at different moments, it is possible to decide in an unambiguous manner whether or not the tetrahedron destroys what the triangle created. If it does, then we are talking about a void with positive life-time, and we may interpret that life-time as a measure of significance of the void. We may also interpret it as a shape measure of the corresponding pocket.

Figure V.9: The region grows from two vertices, the two components merge twice, and the second merge creates a void that eventually disappears.

The idea of creation and destruction is the same as in Section IV.3 and depends on the effect on the Betti numbers: a $k$-simplex creates if its addition increases $\beta_k$ and it destroys if its addition decreases $\beta_{k-1}$. Consider the evolving two-dimensional space illustrated in Figure V.9 as an example. There are three events at which homology classes are created, namely when the two components get born at the points labeled $M_0$ and when the components merge the second time at the upper point labeled $M_1$. The labels indicate the types of metamorphoses that correspond to the topological changes. When the components merge the first time, a component gets destroyed, and when the hole gets filled, a 1-cycle gets destroyed. It should be clear that $M_2$ destroys what the upper $M_1$ created, and that the lower $M_1$ destroys what the right $M_0$ created. Nobody destroys the component created by the left $M_0$.

Incremental algorithm revisited. We will formalize the idea of pairing creations with destructions by revisiting the incremental algorithm for Betti numbers presented in Section IV.3. We study the algorithm in terms of matrices of boundary homomorphisms. Recall that a single step in that algorithm computes the Betti numbers of a complex $K_j$ from the Betti numbers of $K_{j-1} = K_j - \{\sigma_j\}$. Let the dimension of $\sigma_j$ be $k$. The only matrices affected by adding $\sigma_j$ to the complex are the ones of $\partial_k$ and of $\partial_{k+1}$, which are displayed in Figure V.10. The new column of the matrix of $\partial_{k+1}$ is zero because $\sigma_j$ is not a face of any $(k+1)$-simplex in $K_j$. Hence $b_k$, the rank of the $k$-th boundary group, is the same for $K_j$ as it is for $K_{j-1}$. On the other hand, $b_{k-1}$ may remain the same or it may increase.

Figure V.10: The addition of $\sigma_j$ to the complex appends a column to the matrix of $\partial_{k+1}$ (rows $C_{k+1}$, columns $C_k$) and a row to the matrix of $\partial_k$ (rows $C_k$, columns $C_{k-1}$).

Case $\sigma_j$ creates. Then $\sigma_j$ belongs to a $k$-cycle, which implies that its row in the matrix of $\partial_k$ can be zeroed out. We can thus write the Betti numbers of $K_j$ in terms of the ranks of various groups defined for $K_{j-1}$ as follows:

$$\beta_{k-1}(K_j) = z_{k-1} - b_{k-1}, \qquad \beta_k(K_j) = (z_k + 1) - b_k,$$

where $z_k$ and $b_k$ denote the ranks of the $k$-th cycle and boundary groups of $K_{j-1}$.

In words, the $(k-1)$-st Betti number remains unchanged and the $k$-th Betti number increases by one.

Case $\sigma_j$ destroys. Then $\sigma_j$ does not belong to a $k$-cycle. Its row in the matrix of $\partial_k$ can therefore not be zeroed out and we get a new non-zero entry in the normal form of that matrix. Hence,

$$\beta_{k-1}(K_j) = z_{k-1} - (b_{k-1} + 1), \qquad \beta_k(K_j) = z_k - b_k,$$

where the ranks $z$ and $b$ of the cycle and boundary groups are those of $K_{j-1}$. In words, the $(k-1)$-st Betti number decreases by one and the $k$-th Betti number remains unchanged.

The case analysis confirms that the incremental algorithm as described in Section IV.3 computes the Betti numbers correctly.

Recognizing creations. Besides re-proving the correctness of the incremental algorithm, the above analysis points the way to an alternative procedure for distinguishing creating from destroying simplices. Instead of a union-find data structure, we use elementary row operations, which are slower but more general. Since we only use row operations, columns in the matrix of $\partial_k$ correspond to individual $(k-1)$-simplices and rows represent $(k-1)$-cycles. When we add $\sigma_j$, we attempt to zero out its row from right to left. To describe how this is done, we call the column of the rightmost non-zero entry in a row its last column, and we assume a function LASTCOL that returns the index of the last column; it returns zero if that last column does not exist. Clearly, each row has at most one last column. Conversely, we maintain inductively that each column is last for at most one row. For example, this property is satisfied by the matrix in Figure V.11 before the shaded last row is added. After that addition, we use row operations to reinstate the property before adding the next row. To explain the algorithm, we let $j$ be the index of the row that corresponds to the new simplex $\sigma_j$. Given a column $i$, we also assume a function ROW that returns the index of the row, among the first $j-1$ rows, for which $i$ is the last column. It returns zero if the row is not defined.

Figure V.11: The shaded rightmost non-zero entries identify last columns of rows.

  boolean DOESCREATE (int $j$):
    while $(i = \mathrm{LASTCOL}(j)) \neq 0$ do
      if $(l = \mathrm{ROW}(i)) \neq 0$
        then row $j$ = row $j$ + row $l$
        else return FALSE
      endif
    endwhile;
    return TRUE.

After running Function DOESCREATE for the $j$-th row, that row is either zero, in which case the corresponding simplex $\sigma_j$ creates, or it has a unique last column, in which case $\sigma_j$ destroys.

Persistent homology. We argue below that Function DOESCREATE computes more than just Betti numbers: it also determines how long a homological feature lasts along the filtration. To make this precise, we return to the situation in which the filtration represents meaningful information, such as scale in the case of alpha shapes. In general, we define persistence so it depends on the time when simplices are added to the complex in the filtration, but to simplify matters here, we re-define time equal to the index. In other words, we say $\sigma_j$ is added at time $j$. Keeping this convention in mind, we now define the $p$-persistent $k$-th homology group of $K_j$ as the cycle group divided by the boundary group at $p$ positions later in the filtration:

$$H_k^{j,p} = Z_k^j / (B_k^{j+p} \cap Z_k^j).$$

Taking the intersection of the boundary group with the cycle group is necessary for technical reasons to define the quotient group. Figure V.12 illustrates the difference between the $p$-persistent homology group and the usual or 0-persistent homology group. The $p$-persistent $k$-th Betti number is the rank of the $p$-persistent $k$-th homology group: $\beta_k^{j,p} = \mathrm{rank}\, H_k^{j,p}$.

Figure V.12: The cycle group $Z_j$ and its decompositions into solid $p$-persistent homology classes (modulo $B_{j+p}$) and dotted 0-persistent homology classes (modulo $B_j$).

Interval property of persistence. We develop an intuitive picture of persistence using the distinction between creating and destroying simplices. Note that the number of creating $k$-simplices until position $j$ in the filtration is the rank of the cycle group: $z_k^j = \mathrm{rank}\, Z_k^j$. Similarly, the number of destroying $(k+1)$-simplices is the rank of the boundary group: $b_k^j = \mathrm{rank}\, B_k^j$. The Betti number is the surplus of creating versus destroying simplices: $\beta_k^j = z_k^j - b_k^j$. Because Betti numbers are non-negative, the creating $k$-simplices and destroying $(k+1)$-simplices are arranged like opening and closing parentheses in an expression, except that some closing parentheses may be missing at the end. In particular, every prefix contains at least as many creating $k$-simplices as destroying $(k+1)$-simplices. We can therefore pair them up and form vertex disjoint intervals, each starting at the position of a creating $k$-simplex and ending at the position of a destroying $(k+1)$-simplex (or extending to infinity if there are no destroying simplices left). We use intervals that are closed to the left and open to the right. The Betti number at position $j$ is then the number of intervals that contain $j$. Any arbitrary pairing creating vertex disjoint intervals has this property for Betti numbers. (Can you prove that?) In contrast, there is exactly one pairing that has the following stronger property for persistent Betti numbers:

INTERVAL PROPERTY. The $p$-persistent $k$-th Betti number at position $j$ is the number of intervals that simultaneously contain $j$ and $j+p$.

We illustrate this property by drawing a right-angled isosceles triangle below every interval, as shown in Figure V.13. Each triangle is closed along the top and left edges but open along the hypotenuse. The $p$-persistent $k$-th Betti number of $K_j$ is represented by the point $(j, p)$ in the index-persistence plane. According to the Interval Property, it is the number of right-angled isosceles triangles that contain this point.

Figure V.13: Each right-angled isosceles triangle in the index-persistence plane represents a non-bounding cycle that persists over the complexes covered by its interval.

Pairing. The pairing of simplices to obtain intervals satisfying the Interval Property is done using Function DOESCREATE explained above. Specifically, each destroying $(k+1)$-simplex corresponds to a non-zero row in the matrix of $\partial_{k+1}$ and is paired with the $k$-simplex that corresponds to the last column in that row. Note that this $k$-simplex indeed creates, as is witnessed by the cycle represented by the row. The persistence of a pair $(\sigma_i, \sigma_j)$ is the time-lag between the additions of the two simplices to the complex in the filtration. In the assumed simplified case in which $\sigma_j$ is added at time $j$, the persistence is the difference between indices: $j - i$. This is the convention we used to generate Figure IV.16, which shows the persistent first Betti numbers of the space-filling diagram modeling the gramicidin protein.

Figure V.14: Graph of the persistent first Betti number, the number of tunnels, in log-scale for gramicidin. The index in the filtration varies from left to right and the persistence from back to front. Observe the large triangular plateau, which corresponds to the dominant tunnel that passes through gramicidin.

The running time of the pairing algorithm is roughly the same as that of the normal form algorithm described in Section IV.4, namely cubic in the number of simplices, $m$, which is at most some constant times $n^2$ for $n$ balls. Indeed, Function DOESCREATE spends fewer than $m$ row operations per simplex, each taking time at most proportional to $m$.
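The pairing procedure of this section can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions, not the implementation referred to in the text: simplices are given as tuples of vertex names in filtration order, rows are stored sparsely as sets of column positions so that adding two rows modulo 2 is a symmetric difference, positions are 0-based, and the function name `pair_simplices` is hypothetical. All dimensions are handled in one pass by indexing columns with global filtration positions.

```python
from itertools import combinations


def pair_simplices(filtration):
    """Pair creating with destroying simplices by row reduction modulo 2.

    `filtration` lists the simplices (tuples of vertices) in the order in
    which they are added; every face must precede the simplices it belongs to.
    """
    position = {frozenset(s): j for j, s in enumerate(filtration)}
    reduced = {}      # last column -> reduced row that has this last column
    pairs = []        # (position of creator, position of destroyer)
    creators = set()  # creating simplices that have not been paired

    for j, simplex in enumerate(filtration):
        s = frozenset(simplex)
        row = set()
        if len(s) > 1:
            # The row of a simplex has a non-zero entry for each of its facets.
            row = {position[frozenset(f)] for f in combinations(s, len(s) - 1)}
        while row:
            last = max(row)               # plays the role of LASTCOL
            if last in reduced:           # an earlier row has the same last column
                row ^= reduced[last]      # add that row modulo 2
            else:
                reduced[last] = row       # the row stays non-zero: simplex j destroys
                pairs.append((last, j))
                creators.discard(last)
                break
        else:
            creators.add(j)               # the row was zeroed out: simplex j creates
    return pairs, sorted(creators)


# A triangle built vertex by vertex, edge by edge, and finally filled.
filtration = [("a",), ("b",), ("c",),
              ("a", "b"), ("b", "c"), ("a", "c"),
              ("a", "b", "c")]
pairs, unpaired = pair_simplices(filtration)
print(pairs)     # creator/destroyer positions
print(unpaired)  # creators that are never destroyed
```

In this example the third edge closes a 1-cycle that is later destroyed by the triangle, and the first vertex creates the one component that persists to the end. The persistence of each pair is the difference of its two positions.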
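The Interval Property translates directly into a counting rule. The following Python sketch assumes that the intervals of one dimension are already available as pairs of positions, half-open to the right, with `math.inf` for cycles that are never destroyed; the function name `persistent_betti` and the sample intervals are hypothetical.

```python
import math


def persistent_betti(intervals, j, p=0):
    """Count the intervals [start, end) that contain both j and j + p."""
    return sum(1 for start, end in intervals if start <= j and j + p < end)


# Intervals of dimension 0 for a hypothetical filtration: vertices are added
# at positions 0, 1, 2 and the edges added at positions 3 and 4 merge components.
intervals_0 = [(0, math.inf), (1, 3), (2, 4)]

print(persistent_betti(intervals_0, 2))       # usual Betti number at position 2
print(persistent_betti(intervals_0, 2, p=1))  # classes that survive one more step
print(persistent_betti(intervals_0, 2, p=2))  # classes that survive two more steps
```

With $p = 0$ the function returns the usual Betti number, and increasing $p$ can only lower the count, which is the noise removal visible when moving from back to front in the persistence plots.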
Bibliographic notes. The material for this section is taken from [1], where we find the definition of persistent Betti numbers, the algorithm and its correctness proof. The algorithm has been implemented and experimental results suggest it is considerably faster than the obvious cubic time bound. We should note, however, that the implementation in [1] differs in two possibly significant aspects from the algorithm described in this section. First, the implementation uses a union-find data structure to classify simplices as creating or destroying, and second, it uses a sparse matrix representation that permits row operations in time proportional to the number of non-zero entries. Persistent Betti numbers have been defined independently by Robins [3], who uses them to study the fractal nature of two-dimensional point patterns. Persistent homology groups are embedded in spectral sequences, which are special tables of related homology groups [2]. It might be interesting to explore the other groups in that table and to find meaningful interpretations in the context of alpha complexes.

[1] H. EDELSBRUNNER, D. LETSCHER AND A. ZOMORODIAN. Topological persistence and simplification. Discrete Comput. Geom. 28 (2002), 511–533.
[2] J. MCCLEARY. A User’s Guide to Spectral Sequences. Second edition, Cambridge Univ. Press, England, 2001.
[3] V. ROBINS. Toward computing homology from finite approximations. Topology Proceedings 24 (1999), 503–532.
V.3 Molecular Interfaces

The interface between two or more interacting molecules is the location of that interaction. In this section, we present a proposal for a surface or complex of surfaces that geometrically represents that interface. One of its applications is to display functions defined over the interface.

Interfaces without boundary. Our definition of a molecular interface is a formalization of two intuitions, namely that the best separation of two or more molecules is part of the Voronoi diagram and that the interesting portion of that separation is protected by a relatively tight seal. We will come back to the second intuition later and formalize the first intuition now.

Figure V.15: The solid bi-chromatic edges form the interface of the two collections of disks. The dotted mono-chromatic edges show the rest of the Voronoi diagram.

Consider an assembly of $k \geq 2$ molecules, each represented by a collection of balls $B_i$ in $\mathbb{R}^3$, and let $B = \bigcup_i B_i$ be the collection of all balls. Recall that the Voronoi diagram of $B$ consists of a polyhedral cell for each ball and of the polygons, edges and vertices shared by the cells. We use colors to keep track of the correspondence between balls and molecules. Specifically, if a ball belongs to $B_i$ then we say the ball and its Voronoi cell have the color $i$. The polygons, edges and vertices get their colors from the cells they belong to. While all cells are mono-chromatic, a polygon can be mono-chromatic or bi-chromatic depending on whether the two cells that share the polygon have the same or different colors. The interface between the $B_i$ is the subcomplex of the Voronoi diagram consisting of all bi-chromatic polygons and their edges and vertices. Figure V.15 illustrates the definition by showing the interface of two collections of disks in the plane.

Local structure. In the generic case, every edge belongs to three and every vertex to four Voronoi cells. This implies that for $k = 2$ colors, the interface has a particularly simple local geometric structure. An interface edge belongs to two cells of one and to one cell of the other color, and exactly two of the three polygons sharing the edge are bi-chromatic and thus belong to the interface. There are two types of interface vertices: those that belong to three cells of one and one cell of the other color and those that belong to two cells of each color. As illustrated in Figure V.16, the local neighborhood of both types of vertices is a topological disk. We conclude that in the generic case the interface for $k = 2$ colors is a 2-manifold, which is a topological space in which every point has an open neighborhood homeomorphic to $\mathbb{R}^2$. By construction, that 2-manifold is orientable, with the cells of one color on one side and the cells of the other color on the other side.

Figure V.16: The shaded polygons and their edges belong to the interface. On the left, we have three cells of one and one cell of the other color. On the right, we have two cells of each color.

For $k \geq 3$ colors, the local structure of the interface can be more complicated because we may have tri-chromatic edges and tri- and four-chromatic vertices. For any two colors, we get a 2-manifold, but now these 2-manifolds meet along curves formed by tri-chromatic edges. In other words, the interface is a two-dimensional complex of sheets, curves and vertices. Every sheet is a maximal component consisting of bi-chromatic polygons, edges and vertices of a given color pair. Similarly, every curve is a maximal component consisting of tri-chromatic edges and vertices of a given color triplet. Finally, every interface vertex is a four-chromatic vertex in the Voronoi diagram. Together, the sheets, curves and vertices form a complex in the sense that the boundary of every sheet consists of finitely many pairwise disjoint curves and vertices, and the boundary of every curve consists of finitely many interface vertices.
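The coloring rule is easiest to apply in the dual picture, where each Voronoi polygon corresponds to a Delaunay edge. The Python sketch below assumes that the Delaunay tetrahedra are already available as tuples of ball indices and that `color` maps each ball to its molecule; both function names and the tiny input are hypothetical, and no triangulation is computed here.

```python
from itertools import combinations


def chromatic_simplices(tetrahedra, color):
    """Map every simplex of the given tetrahedra to its set of colors."""
    colors = {}
    for tet in tetrahedra:
        for size in (1, 2, 3, 4):
            for simplex in combinations(sorted(tet), size):
                colors[simplex] = frozenset(color[v] for v in simplex)
    return colors


def interface_edges(tetrahedra, color):
    """Bi-chromatic Delaunay edges, each dual to a bi-chromatic Voronoi polygon."""
    colors = chromatic_simplices(tetrahedra, color)
    return sorted(s for s, c in colors.items() if len(s) == 2 and len(c) == 2)


# Two tetrahedra sharing the triangle (1, 2, 3); balls 0 and 1 belong to one
# molecule and balls 2, 3 and 4 to the other.
tetrahedra = [(0, 1, 2, 3), (1, 2, 3, 4)]
color = {0: "red", 1: "red", 2: "blue", 3: "blue", 4: "blue"}
edges = interface_edges(tetrahedra, color)
print(edges)
```

The same color sets identify tri-chromatic triangles and four-chromatic tetrahedra, which are dual to the curves and vertices of the interface when more than two molecules are present.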
Retraction. As defined above, the interface may go to infinity, which is sometimes a disadvantage. Our goal here is to shrink the interface back to where the molecules are sufficiently close to interact. It seems natural to do this with a distance threshold, but this would most certainly lead to the deletion of interior portions and produce fractured surfaces. We therefore shrink from outside in and use relative rather than absolute distance measurements to decide where to stop the process. In the first step, we retract the interface back to the multi-chromatic dual of the dual complex and its pockets. In the second step, we use topological persistence to shrink the interface even further. We will return to the second step later.

To describe the shrinking process, we consider the Delaunay triangulation $D$ of the collection of balls $B$. We have mono-chromatic vertices and mono- as well as multi-chromatic edges, triangles and tetrahedra. The interface as defined above is dual to the subset of multi-chromatic simplices in $D$. Note that the first step of the shrinking process is equivalent to removing all tetrahedra outside the dual complex that belong to the ancestor set of the dummy tetrahedron, which represents the space outside the Delaunay triangulation. We use 23-collapses to remove these tetrahedra. We simplify the algorithm by ignoring principal triangles, edges and vertices; in other words, we delete principal triangles, edges and vertices as soon as they arise. Let $K$ denote the dual complex.

  void COLLAPSE ($\sigma$, $\tau$):
    if $\tau \notin K$ and $(\sigma, \tau)$ is collapsible then
      forall faces $\upsilon$ of $\tau$ with $\sigma \leq \upsilon$ do delete $\upsilon$ endfor
    endif.

In this context, we consider $(\sigma, \tau)$ collapsible if the pair is part of an anti-collapse in the construction of the filtration and the collapse of $\sigma$ and $\tau$ renders the other simplices in this anti-collapse principal. This is equivalent to saying that the effect of the 23-collapse is the inverse of that anti-collapse. We define a retraction as a maximal sequence of collapses. In other words, we collapse as long as we can. In the implementation of this operation, we maintain a stack of candidate pairs. Initially, this stack contains all boundary triangles of the Delaunay triangulation together with their incident tetrahedra. During the process, we take pairs from the stack and add new pairs whenever we create new boundary triangles by collapsing.

  Complex RETRACT:
    while the stack is non-empty do
      $(\sigma, \tau)$ = POP; COLLAPSE ($\sigma$, $\tau$)
    endwhile.

We may think of a retraction as successively removing sinks from an acyclic directed graph. It follows that the result of the operation is independent of the sequence in which the collapses are performed.

Clipping. The result of the retraction is the collection of tetrahedra in the dual complex together with the tetrahedra in the pockets. We further remove all mono-chromatic tetrahedra and let $T$ denote the remaining collection of multi-chromatic tetrahedra. The interface is now obtained as the dual of $T$. More specifically, for each bi-chromatic edge of the tetrahedra in $T$, we add the dual polygon to the interface. There are, however, complications because such a bi-chromatic edge may either be completely or only partially surrounded by tetrahedra in $T$. In the latter case, we clip the polygon before adding it to the interface. Figure V.17 illustrates this idea in two dimensions, but we should keep in mind that the situation in three dimensions is more complicated. A partially surrounded bi-chromatic edge corresponds to a polygon with two types of vertices: those dual to tetrahedra in $T$ and the others. We clip the polygon by cutting each edge connecting vertices of different types with the plane of the corresponding boundary triangle. If that plane does not intersect the dual Voronoi edge, which happens in rare cases, we clip at the endpoint that is closer to the plane. Finally, we connect the cut points in contiguous pairs and retain the portions of the polygon with vertices of the first type.

Figure V.17: The triangles drawn with solid edges are the bi-chromatic triangles constructed by the retraction algorithm. The boldface interface is dual to and clipped at the boundary of this collection.
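The stack-driven retraction can be sketched as follows. This Python sketch makes several simplifying assumptions that are not part of the text: it tracks tetrahedra only and leaves the deletion of principal triangles, edges and vertices implicit, it takes the collapsibility test as a caller-supplied predicate `collapsible` because that test depends on the filtration, and all names are hypothetical.

```python
from itertools import combinations


def retract(tetrahedra, dual_complex, collapsible):
    """Collapse tetrahedra from the outside in; return the surviving names.

    tetrahedra   : dict mapping a name to its four vertices
    dual_complex : set of names of tetrahedra that must not be removed
    collapsible  : hypothetical predicate (triangle, name) -> bool
    """
    incident = {}  # triangle -> names of the tetrahedra sharing it
    for name, vertices in tetrahedra.items():
        for triangle in combinations(sorted(vertices), 3):
            incident.setdefault(triangle, []).append(name)

    alive = set(tetrahedra)
    # Initially the stack holds the boundary triangles with their one tetrahedron.
    stack = [(tri, names[0]) for tri, names in incident.items() if len(names) == 1]
    while stack:
        triangle, name = stack.pop()
        if name in alive and name not in dual_complex and collapsible(triangle, name):
            alive.discard(name)
            # The other triangles of the removed tetrahedron become boundary
            # triangles of their surviving neighbors and are pushed as candidates.
            for other in combinations(sorted(tetrahedra[name]), 3):
                for neighbor in incident[other]:
                    if neighbor in alive:
                        stack.append((other, neighbor))
    return alive


# A chain of three tetrahedra; only t0 belongs to the dual complex.
tetrahedra = {"t0": (0, 1, 2, 3), "t1": (1, 2, 3, 4), "t2": (2, 3, 4, 5)}
everything = sorted(retract(tetrahedra, {"t0"}, lambda triangle, name: True))
nothing = sorted(retract(tetrahedra, {"t0"}, lambda triangle, name: False))
print(everything)
print(nothing)
```

With a predicate that always accepts, the retraction peels the chain down to the protected tetrahedron; with one that always rejects, nothing is removed. A faithful predicate would have to consult the anti-collapses recorded while constructing the filtration.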
Further retraction. We now take the shrinking process beyond the retraction from the dummy tetrahedron. Recall that the topological persistence algorithm of Section V.2 generates simplex pairs $(\sigma, \tau)$ with the property that $\tau$ destroys what $\sigma$ created. The dimension of $\tau$ is one larger than that of $\sigma$, but we are only interested in the case in which $\sigma$ is a triangle and $\tau$ is a tetrahedron. We think of the operation that removes $\sigma$ and $\tau$ as a generalization of a 23-collapse, but it is more complicated because $\sigma$ is generally not a face of $\tau$, although it can be. We do the operation only if $\sigma$ is a boundary triangle of $T$ and $\tau$ does not belong to the dual complex. We first delete $\tau$ and then retract from $\tau$. As before, we remove principal triangles, edges and vertices as soon as they get created.

  void REMOVE ($\sigma$, $\tau$):
    if $\tau \notin K$ then delete $\tau$;
      forall triangles $\upsilon$ of $\tau$ do PUSH ($\upsilon$, $\tau'$) endfor;
      RETRACT
    endif.

Here, $\tau'$ is the tetrahedron that shares $\upsilon$ with $\tau$. If the retraction from $\tau$ reaches far enough, $\sigma$ gets deleted just because it becomes principal. However, it can happen that the retraction does not reach all the way, in which case we recurse for other pairs of simplices before deleting $\sigma$. This is done implicitly during the retraction. To decide whether or not to remove $\sigma$ and $\tau$ in the first place, we compare their persistence with a constant threshold and remove the pair only if it passes this test. The persistence is determined by the moments when $\sigma$ and $\tau$ are born, and it satisfies the monotonicity property (V.1), which relates the persistence of $(\sigma, \tau)$ to that of pairs born between $\sigma$ and $\tau$.

This monotonicity property is important for the correctness of the algorithm because if the retraction from $\tau$ does not reach $\sigma$ then this can only be because there is a triangle between $\sigma$ and $\tau$ that split the void created by $\sigma$ before it was destroyed by $\tau$. But then the other part of the void must have been destroyed by a tetrahedron preceding $\tau$ in the filtration. In other words, the moments when this triangle and this tetrahedron are born lie between the moments when $\sigma$ and $\tau$ are born. The monotonicity guarantees that the simplices between $\sigma$ and $\tau$ are removed by recursive deletions so that $\sigma$ can eventually be deleted. We now restate the algorithm and simplify its description by declaring a 23-collapse as a special case of a removal. Because of our policy to delete principal triangles, edges and vertices, all other collapses can be ignored. The algorithm maintains a stack of triangle-tetrahedron pairs formed by the topological persistence algorithm. Initially, the stack contains all pairs $(\sigma, \tau)$ with $\sigma$ on the boundary of the current set $T$. We may start with the set of all Delaunay tetrahedra.

  Complex RETRACTMORE:
    while the stack is non-empty do
      $(\sigma, \tau)$ = POP;
      if $(\sigma, \tau)$ passes the threshold test then REMOVE ($\sigma$, $\tau$) endif
    endwhile.

As before, we get the interface by duality from the computed collection of tetrahedra. The running time is dominated by the topological persistence algorithm, which takes cubic time to form the triangle-tetrahedron pairs. With some care, we can implement the rest of the algorithm so it takes only constant time per simplex in the Delaunay triangulation.

We note that it is possible to use other functions that satisfy the monotonicity property (V.1). For example, we may bias the shrinking process against large triangles and tetrahedra by using a function other than the inverse of the persistence. A second potential advantage of the function used for this purpose over the inverse of the persistence is that it is dimensionless and thus amenable to the use of universally meaningful constant thresholds.

Global structure. Note that we may get different interfaces for different values of the threshold. Since a smaller threshold permits as many or more removals than a larger threshold, the interface shrinks with decreasing threshold. Indeed, if we use an increasing sequence of thresholds, we get a filtration of interfaces that is parametrized in a way similar to the sequence of alpha shapes. For the largest threshold, the interface is the original surface or complex defined by the set of bi-chromatic Voronoi polygons. For threshold zero, the interface is empty, unless the dual complex of $B$ contains bi-chromatic triangles, which would remain. In this case, we can further decrease the interface by making the threshold negative, but we have to modify the retraction to allow for collapses of simplices in the dual complex. Eventually, for a sufficiently negative threshold, the interface is guaranteed to be empty.

For a fixed threshold, the interface is a two-dimensional complex. Its two-dimensional elements are sheets defined by bi-chromatic Voronoi polygons. There are two kinds of one-dimensional elements: the original tri-chromatic curves and the new bi-chromatic curves outlining the sheet boundary created by shrinking. Finally, there are two kinds of zero-dimensional elements, namely the original four-chromatic vertices and the new tri-chromatic vertices forming the curve boundary created by shrinking. We take all sheets and curves as open sets so the complex is a collection of pairwise disjoint open elements. Note, however, that the elements are not necessarily simply connected. To explore this further, we excise thin strips along the curves to turn each sheet into a connected 2-manifold with boundary. Each component of the boundary is a closed curve outlining a hole in the 2-manifold. A classic result in topology says that two connected orientable 2-manifolds with boundary are homeomorphic if and only if they have the same genus and the same number of holes. Furthermore, the Euler characteristic of a 2-manifold with genus $g$ and $h$ holes is

$$\chi = v - e + t = 2 - 2g - h,$$

where $v$, $e$ and $t$ are the number of vertices, edges and triangles of any arbitrary triangulation of the 2-manifold. Given a sheet, it is easy to compute its Euler characteristic and to determine its number of holes. We then get the genus as $g = (2 - h - \chi)/2$. We may think of this manifold as obtained by punching $h$ holes into a $g$-fold torus.
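The genus computation is a one-line consequence of the Euler characteristic formula for a connected orientable 2-manifold with holes. The Python sketch below uses hypothetical function names and takes the simplex counts and the number of holes as given; it also rejects inputs that cannot come from such a manifold.

```python
def euler_characteristic(vertices, edges, triangles):
    """Alternating sum of the numbers of simplices of a triangulation."""
    return vertices - edges + triangles


def genus(chi, holes):
    """Genus g of a connected orientable 2-manifold with chi = 2 - 2g - holes."""
    twice_genus = 2 - holes - chi
    if twice_genus < 0 or twice_genus % 2:
        raise ValueError("inconsistent Euler characteristic and number of holes")
    return twice_genus // 2


# The 7-vertex triangulation of the torus has 21 edges and 14 triangles.
print(genus(euler_characteristic(7, 21, 14), holes=0))
# A single triangle is a disk, that is, a sphere with one hole punched in.
print(genus(euler_characteristic(3, 3, 1), holes=1))
```

The first call reports genus one and the second genus zero, in agreement with the picture of punching holes into a torus with the given number of handles.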
Bibliographic notes. The material in this section is taken from the recent manuscript by Ban et al. [1]. There is evidence that the geometric interfaces shed new light on the hot-spot theory of protein-protein interaction [4]. A competing proposal for a geometric definition of molecular interfaces can be found in [3], where two independent real parameters are used to define the interface as a portion of the molecular surfaces of the two or more molecules. In topology, 2-manifolds with and without boundary have been studied for more than a century. The fact that the topological type of a connected orientable 2-manifold is determined by the genus and the number of holes can be found in a number of texts, including [2].

[1] Y.-E. BAN, H. EDELSBRUNNER AND J. RUDOLPH. A definition of interfaces for protein oligomers. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[2] W. S. MASSEY. Algebraic Topology: an Introduction.
[3] A. VARSHNEY, F. P. BROOKS, JR., D. C. RICHARDSON, W. V. WRIGHT AND D. MINOCHA. Defining, computing and visualizing molecular interfaces. In “Proc. IEEE Visualization, 1995”, 36–43.
[4] J. A. WELLS. Binding in the growth hormone receptor complex. Proc. Natl. Acad. Sci. 93 (1996), 1–6.
V.4 Software for Shape Features

In this section, we explore extensions of the Alpha Shape
software that are concerned with connectivity information
and shape features. We begin with signatures, then pro-
ceed to pockets, and finally look at molecular interfaces.
Betti number signatures. As explained in Section IV.2,

the components, tunnels and voids
of a complex in are
counted by the Betti numbers , and . They are

computed by the algorithm explained in Section IV.3 and
displayed to the right of the correspondingly labeled but-
tons in the signature panel shown in Figure V.19. To the
left of each button we can toggle the display of the evo-
lution of the number as a function of the index in the fil- Figure V.19: The signature panel with the tunnel signature dis-
tration. We refer to these functions as signatures of the played in log-scale. The index 2,354 belongs to the higher of the
two plateaus, which implies that both tunnel systems are open in
data set. As an example consider the zeolite data shown in
the displayed complex.
tunnel signature with filtration index increasing from left

to right and persistence increasing from back to front. The
persistence of the tunnels is formally defined in Section
V.2.
12
10
8
6
4
2
0
0
5000
10000
15000
20000
25000 45000
40000
35000
30000
30000 25000
20000
15000
35000 10000
5000
0
Figure V.20: The graph of , the number of tunnels in

Figure V.18: Three axis-parallel views of the 2,354-th dual com-
plex in the filtration of a periodic zeolite molecule consisting of
log-scale, of the zeolite data. The noise in the signature decreases
1,296 atoms. from back to front. The two persistent tunnel systems are visible
as plateaus that escape the noise removal the longest.
Figure V.18. Two of the three views are taken along tunnel
systems that intersect orthogonally and give rise to a rather
complicated cave system. Note that the tunnels shown in
the second view are smaller in diameter than those shown Displaying pockets. Prior to developing and imple-
in the third view. It follows that there are complexes in the menting pockets, we have experimented with other and
filtration that have the tunnels in the first system closed more simple-minded ideas aimed at getting a handle on
while the tunnels in the second system are still open. The cavities in molecular data. One such idea was to display
two systems can be detected in the tunnel signature shown
in Figure V.19. Figure V.20 shows the two-dimensional

the difference between the Delaunay triangulation and the
dual complex, , or more generally the difference
V.4 Software for Shape Features 85
between two dual complexes,

. This difference closed under the face relation. The software indicates the

can be computed in the Alpha Shape software by first se- presence or absence of boundary triangles by the choice

lecting and and second pushing the ‘Difference’ but- of color. The mouth regions are therefore visually eas-
ton in the scene panel. The results are not encouraging ily identifiable. However, the internal connectivity of the
because a typically large number of inessential simplices pockets is not immediately visible, which may lead to con-
clutters the view of important cavities. In contrast, the fusion. For example, two pockets may appear connected
dual set of a pocket usually gives a clear indication of the but are not because of missing shared triangles. It is pos-
cavity, as in Figure V.21. The interface also supports the sible to visually inspect the connectivity by turning on the
display of simplices of all dimensions in the scene panel,
as shown in Figure II.17, and using the explosion func-
tion to separate all simplices. We observe the same phe-
nomenon for the mouths of a pocket. Two boundary trian-
gles that share a common edge may or may not belong to
the same mouth depending on which shared edges belong
to the pocket.
Pocket panel. Pockets can be computed without open-

ing the pocket panel, but a more detailed exploration re-
quires interaction with the software, which is facilitated
by that panel. A useful feature is the ‘Shapewire’ button,
which can be used to display the edge skeleton of the dual
Figure V.21: All pockets in the dual complex of the zeolite data
complex together with the pockets. The skeleton does not
for index 2,926.
block the view and helps positioning the pockets relative
to the complex. The panel also provides a means to step
display of individual pockets, and Figure V.22 shows the
through the sequence of individual pockets and to select
largest of the pockets in Figure V.21 from a different angle.
pockets by their number of mouths. The main design of
We should keep in mind that the pocket in the dual com-
Figure V.23: Pocket panel of the Alpha Shape software.
the pocket panel, shown in Figure V.23, is similar to that

Figure V.22: Side view of the largest pocket of the collection of the signature panel. It contains a window for its own
shown in Figure V.21.

signatures, which start after the index of the first chosen
complex. The second index, , can be chosen anywhere
plex is geometrically considerably larger than the pocket
in the corresponding space-filling diagram. This effect is

between and the maximum. It is used to eliminate an-

cestor sets of tetrahedra whose indices are larger than or
the reverse of that for the molecule, whose dual complex
equal to . In other words, all tetrahedra , with ,

is considerably smaller than a corresponding space-filling are treated like in the computation of pockets. This elim-
diagram.
ination of large pockets helps in the exploration of detailed
Remember that pockets in the dual complex are not structures, such as side pockets of larger pockets. An ex-
ample is shown in Figure V.24, which shows the pockets filling the system of narrow tunnels visible in the second view in Figure V.18, but with the second index set such that the system of wider tunnels visible in the third view of Figure V.18 is still open. The pockets thus only fill the remains of the narrow tunnels, and as can be seen in the first view, these remains are not connected.

Figure V.24: Three axis-parallel views of the pockets representing the narrow tunnel system decomposed into pieces by opening up the wide tunnel system. Both systems are shown as holes in Figure V.18.

Displaying interfaces. [The input is a complexed collection of proteins.] [Mention the issue of water molecules, which we remove for simplicity.] [Talk about the weighted square distance function over the interface.] [Show one figure with iso-lines of that function.]

A human growth hormone example. [Say a few words about the particular two proteins.] [Show the sequence of figures illustrating the interface filtration.]

Bibliographic notes. The persistence software has been developed by Afra Zomorodian and is described in his dissertation [4]. It is currently not part of the Alpha Shape software. The pocket software has been developed by Michael Facello and is described in his dissertation [1]. Using this software, Liang and collaborators [3] studied fifty-one proteins and their cavity structure. The most interesting outcome of that study is perhaps that in about 80% of the cases, the pocket with the largest volume is also the biologically active site of the molecule. In many instances, the largest pocket is assisted in its function by smaller auxiliary pockets in the vicinity. In another application, Liang and Dill [2] provide numerical evidence that proteins are packed tighter in the core than near the outside. The interface software has been developed by Yih-En (Andrew) Ban but is not yet complete. It is built on top of the Alpha Shape software but requires a variety of additional features to be useful to biologists. Some of these features can be seen in visualizations of interfaces presented in this section.

[1] M. A. FACELLO. Geometric Techniques for Molecular Shape Analysis. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, 1996.
[2] J. LIANG AND K. A. DILL. Are proteins well-packed? Biophys. J. 81 (2001), 751–766.
[3] J. LIANG, H. EDELSBRUNNER AND C. WOODWARD. Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand binding. Protein Science 7 (1998), 1884–1897.
[4] A. ZOMORODIAN. Analyzing and Comprehending the Topology of Spaces and Morse Functions. Ph. D. thesis, Dept. Comput. Sci., Univ. Illinois, Urbana, 2001.
Exercises (ii) Following your definition, can the Euler char-

acteristic of a void be any integer or are there
restrictions?

1. Gabriel graph. Let be a finite set of points in .
The Gabriel graph of consists of all edges
which
for 6. Paired parentheses. Consider a sequence of @
*IF F *KF
parenthesis of a well-formed expression, such as for
F F F

example . A pairing is a perfect matching
for all points *+

.
between the opening and closing parentheses such
that the opening parenthesis precedes the closing
parenthesis in every pair. Each parenthesis has an
(i) Prove that all edges in the Gabriel graph belong integer position in the sequence, and the length of a
to the Delaunay triangulation of . pair is position of the closing minus the position of
(ii) Prove that the Gabriel graph is connected. the opening parenthesis.

2. Ancestor sets in the plane. Consider the Delaunay

(i) Given a pairing, let be the sum of lengths of
triangulation of a finite points set in . Write the pairs. Prove that .

if the two Delaunay triangles share an edge and both (ii) Prove that depends on the given sequence but
orthocenters lie on ’s side of that edge. not on the pairing.
4
(i) Prove that is a partial order.
(ii) Prove that the ancestor sets of any two different 4
7. Sperner’s Lemma. Let
be a triangle and a
* * *
triangulation of . The label of a vertex in that
sinks in the order are disjoint.
4 4
lies on the edge is either or , for every

(iii) Explain how the Gabriel graph relates to the an-
cestor sets of the sinks. of

4

4
, and the label of a vertex in the interior
is either , or .
3. Collapsible complexes. Recall that a contractible (i) Prove that there exists at least one triangle in

topological space has the homotopy type of a point. whose vertices have three different labels.
We call a simplicial complex collapsible if there is a (ii) Strengthen the result in (i) by proving that the
tex. Clearly, if

sequence of collapses that reduces it to a single ver-
is collapsible then its underlying
number of triangles with three different labels
is odd.

space is contractible.
(iii) What would be a natural generalization of these
results from a triangle to a tetrahedron?
(i) Prove that if is embedded in then
is collapsible iff its underlying space is con-
8. 2-manifolds. Recall that a 2-manifold is a topologi-
tractible.

cal space in which every point has an open neighbor-

(ii) Give an example of a simplicial complex em- hood homeomorphic to .
bedded in that is not collapsible but whose
(i) Show that a two-dimensional simplicial com-

underlying space is contractible.
plex in which every edge belongs to exactly two
4. Barycentric subdivision.
complex and let
be a simplicial
denote its barycentric subdi-
Let triangles is not necessarily a 2-manifold.
(ii) Show that a simplicial complex in which the

vision.

closed star of every vertex is the triangulation

(i) Show that each -simplex in gives rise to of a disk is necessarily a 2-manifold.
-simplices in , for .

(ii) Prove that the Euler characteristic of
are the same.
and
5. Connectivity of voids. A void of a space-filling dia-

gram is by definition connected but can have handles
and islands.
(i) How would you define the Betti numbers of a
void?
Chapter VI
Density Maps
Morse theory grew out of the study of the variational

methods in analysis. The initial interest focused on high-
and possibly infinite-dimensional settings. In this chapter,
we introduce Morse theory with an emphasis on the two-
and three-dimensional cases. Possibly the best known re-
sult in Morse theory is the relation between the critical
points of a smooth real-valued function over a manifold
and the Euler characteristic of that manifold. Because of
this relation, Morse theory is sometimes also referred to as
critical point theory.
We use two sections to introduce the basic setting of
Morse theory and one to explain the concept of molecu-
lar pockets in Morse theoretic terms. In the second sec-
tion, we make an effort to relate the Morse theoretic con-
cepts with the discussion on connectivity. While Morse
theory requires differentiable spaces and thus seems to
be built on rather specialized assumptions, we will see
that many themes are familiar from Chapter IV. In some
ways, Morse theory is but a different language or frame-
work to talk about connectivity. The differentiability as-
sumption allows the introduction of otherwise undefined
concepts. Together with suitable non-degeneracy assump-
tions, it brings order into the complicated world of ge-
ometric form. [The material will have to be partially re-
arranged according to the following plan of sections:]
VI.1 Morse Functions

VI.2 Critical Points
VI.3 Morse-Smale Complexes
VI.4 Jacobian Submanifolds
Exercises
VI.1 Smooth vs. Piecewise Linear

A Morse function is a smooth real-valued map over a manifold that satisfies certain non-degeneracy assumptions. This section introduces Morse functions as a crucial piece in the basic mathematical framework of Morse theory.

Sweeping a torus. Morse theory talks about manifolds and smooth functions over these manifolds. The primary goal is to find out about the topological type of the manifolds through a differential analysis of the functions. The standard introductory example is the torus $\mathbb{M}$ embedded in upright position in $\mathbb{R}^3$ and the height function this embedding defines. Formally, $h: \mathbb{M} \to \mathbb{R}$ is defined by mapping each point $x \in \mathbb{M}$ to its distance from the $x_1x_2$-plane. For each $a \in \mathbb{R}$, we consider the set of points with height less than or equal to $a$,
\[ \mathbb{M}_a = \{ x \in \mathbb{M} \mid h(x) \le a \}. \]
As illustrated in Figure VI.1, $\mathbb{M}_a$ changes its topology only at certain critical values of $a$.

Figure VI.1: Evolution of the torus in the sweep from bottom to top and the corresponding construction by attaching a 0-cell, two 1-cells, and a 2-cell.

It is instructive to look at the evolution of the homotopy type of $\mathbb{M}_a$. A $k$-cell is a space homeomorphic to the $k$-dimensional ball, $\mathbb{B}^k = \{ x \in \mathbb{R}^k \mid \|x\| \le 1 \}$. Each time the homotopy type of $\mathbb{M}_a$ changes, we can interpret this event as attaching a cell of some dimension. The evolution of the torus during the sweep and the interpretation of attaching cells is illustrated in Figure VI.1. To define what attaching a cell exactly means, note that the boundary of $\mathbb{B}^k$ is a $(k-1)$-sphere, $\mathbb{S}^{k-1}$. The attachment of $\mathbb{B}^k$ to a space $\mathbb{X}$ requires a continuous map $g: \mathbb{S}^{k-1} \to \mathbb{X}$, which we refer to as the gluing map. Then $\mathbb{X}$ with $\mathbb{B}^k$ attached by $g$ is the space $\mathbb{X} \cup_g \mathbb{B}^k$ obtained by identifying every point $x \in \mathbb{S}^{k-1}$ with $g(x) \in \mathbb{X}$. The interior points of $\mathbb{B}^k$ do not belong to $\mathbb{S}^{k-1}$ and are therefore not identified with points of $\mathbb{X}$. For $k = 0$ we have empty boundary, $\mathbb{S}^{-1} = \emptyset$, so attaching a point or 0-cell is the same as taking the disjoint union.

Smooth manifolds. In order to relate the topological type to differential properties, we need to restrict ourselves to sets for which such properties are defined. We need some basic definitions from differential geometry to express these restrictions.

A map $f: U \to V$ from an open set $U \subseteq \mathbb{R}^m$ to another open set $V \subseteq \mathbb{R}^n$ is smooth if the partial derivatives of all orders exist and are continuous. For general and not necessarily open sets $\mathbb{X} \subseteq \mathbb{R}^m$ and $\mathbb{Y} \subseteq \mathbb{R}^n$, the map $f: \mathbb{X} \to \mathbb{Y}$ is smooth if for every $x \in \mathbb{X}$ there exists an open set $U \subseteq \mathbb{R}^m$ containing $x$ and a smooth map $F: U \to \mathbb{R}^n$ that coincides with $f$ throughout $U \cap \mathbb{X}$. Note that the composition of two smooth maps is smooth. A diffeomorphism is a smooth homeomorphism $f: \mathbb{X} \to \mathbb{Y}$ whose inverse is also smooth, and two spaces are diffeomorphic if there is a diffeomorphism between them. A subset $\mathbb{M} \subseteq \mathbb{R}^n$ is a smooth manifold of dimension $d$ if each $x \in \mathbb{M}$ has a neighborhood that is diffeomorphic to an open subset $U \subseteq \mathbb{R}^d$. A particular diffeomorphism from $U$ to the neighborhood is called a parametrization of the neighborhood, and its inverse is called a coordinate system on the neighborhood. As an example we may consider the 2-sphere $\mathbb{S}^2 = \{ x \in \mathbb{R}^3 \mid \|x\| = 1 \}$. We can cover $\mathbb{S}^2$ with six open hemispheres defined by $x_i > 0$ and $x_i < 0$ for $i = 1, 2, 3$. As shown in Figure VI.2, each hemisphere can be parametrized by orthogonal projection to one of the coordinate planes.

Figure VI.2: The upper open hemisphere is parametrized by projection to the $x_1x_2$-plane.

For a point $x \in \mathbb{M}$, we can construct a $d$-dimensional hyperplane in $\mathbb{R}^n$ that best approximates $\mathbb{M}$ near $x$. The tangent space at $x$ is the $d$-dimensional hyperplane $T\mathbb{M}_x$ through the origin of $\mathbb{R}^n$ that is parallel to this best approximating hyperplane. The elements of the vector space $T\mathbb{M}_x$ are called tangent vectors to $\mathbb{M}$ at
$x$. Note that for every smooth curve $\gamma$ in $\mathbb{M}$ passing through $x$, the velocity vector of $\gamma$ at $x$ is a tangent vector and thus an element of $T\mathbb{M}_x$.

Critical points. The homotopy type of the partial torus $\mathbb{M}_a$ changes when $a$ passes the height value of the points $p$, $q$, $r$, and $s$ marked in Figure VI.1. These are the points with horizontal tangent planes. Assuming a local coordinate system $(x_1, \ldots, x_d)$ in a neighborhood, a point $x \in \mathbb{M}$ is a critical point of $h$ if all derivatives vanish,
\[ \frac{\partial h}{\partial x_1}(x) = \frac{\partial h}{\partial x_2}(x) = \ldots = \frac{\partial h}{\partial x_d}(x) = 0. \]
If $x$ is a critical point then $h(x)$ is a critical value. Non-critical points and non-critical values are also referred to as regular points and regular values.

Just like the first derivative can be used to compute the best linear approximation to $h$, the second derivative can be used to compute the best quadratic approximation. Specifically, the Hessian of $h$ at $x$ is the $d$-by-$d$ matrix of second derivatives,
\[ H(x) = \left[ \frac{\partial^2 h}{\partial x_i \partial x_j}(x) \right]_{1 \le i, j \le d}. \]
A critical point $x$ is non-degenerate if $H(x)$ is non-singular, that is, $\det H(x) \ne 0$. Non-degenerate critical points are isolated, which means there is an open neighborhood without other critical points. We call $h$ a Morse function if all critical points are non-degenerate.

A quadratic function in two variables has only three types of critical points, maxima, saddles, and minima. The origin is a critical point for every possible assignment of signs to $h(x_1, x_2) = \pm x_1^2 \pm x_2^2$, and it is a maximum for $-x_1^2 - x_2^2$, a saddle for $-x_1^2 + x_2^2$ or $x_1^2 - x_2^2$, and a minimum for $x_1^2 + x_2^2$. The saddle is the most interesting case of the three because a circle drawn around it has two peaks alternating with two pits. In contrast, a circle drawn around a regular point has only one peak and one pit. Critical points with small circles that oscillate more often than twice are necessarily degenerate.

Index. The Hessian is symmetric and we can compute its eigenvalues, $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_d$, where $d$ is the dimension of the manifold $\mathbb{M}$. Assuming the Hessian is non-singular, all eigenvalues are non-zero. The index of $h$ at a non-degenerate critical point $x$ is the number of negative eigenvalues and is denoted as $\mathrm{ind}\,x$. Recall that the eigenvectors define an orthogonal coordinate system in the tangent space of $\mathbb{M}$ at $x$. The index is then the number of eigenvector directions along which $h$ decreases. For example, the indices of the critical points $p$, $q$, $r$, and $s$ in Figure VI.1 are 0, 1, 1, and 2. This fact is also expressed in the lemma of Morse.

MORSE LEMMA. Let $x$ be a non-degenerate critical point with index $k$ of $h$. There is a neighborhood $U$ of $x$ and a local coordinate system $(y_1, \ldots, y_d)$ in $U$ with $y_i(x) = 0$ for all $i$ and
\[ h = h(x) - y_1^2 - \ldots - y_k^2 + y_{k+1}^2 + \ldots + y_d^2 \]
throughout $U$.

Note that the dimensions of the cells attached to the evolving torus in Figure VI.1 are equal to the indices of the corresponding critical points. This is generally the case because a critical point with index $k$ connects to the past along $k$ directions. These directions span a $k$-dimensional cell needed to realize the connections.

Degenerate critical points. A 1-dimensional manifold is a closed curve. A connected open subset is an open interval, which is homeomorphic to $\mathbb{R}$. Consider the height function $h: \mathbb{R} \to \mathbb{R}$ defined by $h(x) = x^3$. The derivative vanishes at 0. The second derivative vanishes too, $h''(0) = 0$, which identifies 0 as a degenerate critical point. Geometrically, the degeneracy is manifested by the fact that an arbitrarily small perturbation can remove the critical point or turn it into two non-degenerate ones, a maximum and a minimum. Figure VI.3 illustrates the instability of the degenerate critical point.

Figure VI.3: Graphs of $h(x) = x^3$ (middle) and of two small perturbations of it (left and right). Critical points are marked. The middle function has a degenerate critical point at 0, which is unfolded in different ways by the other two functions.

A similar degenerate critical point exists for the monkey saddle shown in Figure VI.4. It may be specified as the graph
of $h(x_1, x_2) = x_1^3 - 3 x_1 x_2^2$, which is the real part of $(x_1 + i x_2)^3$. As we go around a circle centered at the origin, the function has three peaks at angles $0$, $2\pi/3$, and $4\pi/3$, and three pits at angles $\pi/3$, $\pi$, and $5\pi/3$. The only critical point is the origin. The matrix of second derivatives is
\[ H(x_1, x_2) = \begin{bmatrix} 6 x_1 & -6 x_2 \\ -6 x_2 & -6 x_1 \end{bmatrix}, \]
which is zero at 0.

Figure VI.4: Monkey saddle with degenerate critical point.

All critical points in the above examples are isolated, but there are others that are not. For example, for $h(x_1, x_2) = x_1^2$ the entire $x_2$-axis is critical, but none of its points are isolated. Similarly, if we lay down the torus on its side, the height function has a circle of minima and another circle of maxima.

Euler characteristic. Let $\mathbb{M}$ be a compact and smooth manifold without boundary and $h: \mathbb{M} \to \mathbb{R}$ a Morse function. We will see in Section VI.2 that we can construct a $k$-cell for each index-$k$ critical point so that $\mathbb{M}$ can be constructed by successive attachment of these cells. Let $c_k$ be the number of critical points of index $k$. As always, the Euler characteristic is the alternating sum of cells, which is also the alternating sum of critical points,
\[ \chi(\mathbb{M}) = \sum_{k \ge 0} (-1)^k c_k. \]
For example for the torus we get $c_0 - c_1 + c_2 = 0$. In words, for every minimum and maximum we get exactly one (non-degenerate) saddle point, no matter what Morse function we use. For the sphere we get $c_0 - c_1 + c_2 = 2$. This implies that every Morse function of the sphere has at least two (non-degenerate) critical points. A minimum example is the ordinary height function, which has a minimum at the south-pole and a maximum at the north-pole.

Bibliographic notes. The original development of Morse theory from its variational background is described by Morse [3] and by Seifert and Threlfall [4]. Milnor's later book [2] emphasizes the topological analysis of manifolds and has since become a standard reference in Morse theory. Good introductory texts to the related subject of differential topology are the books by Guillemin and Pollack [1] and by Wallace [6]. A good introduction to linear algebra including an intuitive discussion of eigenvalues and eigenvectors is the book by Strang [5].

[1] V. GUILLEMIN AND A. POLLACK. Differential Topology. Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
[2] J. MILNOR. Morse Theory. Princeton Univ. Press, New Jersey, 1963.
[3] M. MORSE. The Calculus of Variations in the Large. Amer. Math. Soc., New York, 1934.
[4] H. SEIFERT AND W. THRELFALL. Variationsrechnung im Großen. Published in the United States by Chelsea, New York, 1951.
[5] G. STRANG. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.
[6] A. WALLACE. Differential Topology. First Steps. Benjamin, New York, 1968.
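As a small numerical companion to Section VI.1, the following Python sketch classifies a critical point of a function of two variables by the signs of the eigenvalues of its Hessian and evaluates the alternating sum of critical points. The function names, the finite-difference step `eps`, and the tolerance `tol` are assumptions made for illustration; they are not taken from the text or from any particular software.

```python
import math


def hessian_2d(f, x, y, eps=1e-4):
    """Approximate the Hessian of f at (x, y) by central differences.

    Returns the three distinct entries (fxx, fxy, fyy) of the symmetric matrix.
    """
    fxx = (f(x + eps, y) - 2.0 * f(x, y) + f(x - eps, y)) / eps**2
    fyy = (f(x, y + eps) - 2.0 * f(x, y) + f(x, y - eps)) / eps**2
    fxy = (f(x + eps, y + eps) - f(x + eps, y - eps)
           - f(x - eps, y + eps) + f(x - eps, y - eps)) / (4.0 * eps**2)
    return fxx, fxy, fyy


def classify_critical_point(f, x, y, tol=1e-6):
    """Classify a critical point (x, y) of f; the caller guarantees criticality.

    Returns (kind, index), where index is the number of negative Hessian
    eigenvalues, or ("degenerate", None) if the Hessian is (nearly) singular.
    """
    fxx, fxy, fyy = hessian_2d(f, x, y)
    # Closed-form eigenvalues of the symmetric matrix [[fxx, fxy], [fxy, fyy]].
    mean = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    eigenvalues = (mean - radius, mean + radius)
    if min(abs(ev) for ev in eigenvalues) < tol:
        return "degenerate", None
    index = sum(1 for ev in eigenvalues if ev < 0)
    return ("minimum", "saddle", "maximum")[index], index


def euler_characteristic(indices):
    """Alternating sum of critical points, given the list of their indices."""
    return sum((-1) ** k for k in indices)


def monkey(x, y):
    """Monkey saddle: the real part of (x + iy)^3."""
    return x**3 - 3.0 * x * y**2


print(classify_critical_point(lambda x, y: x * x - y * y, 0.0, 0.0))
print(classify_critical_point(monkey, 0.0, 0.0))
print(euler_characteristic([0, 1, 1, 2]))  # indices of p, q, r, s on the upright torus
```

The closed-form eigenvalue computation only works for 2-by-2 symmetric matrices; for higher dimensions one would substitute a general symmetric eigenvalue routine.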

VI.2 Morse-Smale Complexes

In this section, we introduce the gradient of a Morse function and use it to construct the $k$-cells whose inductive attachment reproduces the evolution of the homotopy type of $\mathbb{M}_a$, for continuously increasing real threshold $a$.

Gradient flow. The gradient of a linear map $h: \mathbb{R}^d \to \mathbb{R}$ defined by $h(x) = \langle u, x \rangle$ is the vector $\nabla h = u = (u_1, \ldots, u_d)$. It is the projection of a normal vector of the graph of $h$ and points in the direction of the steepest ascent. The same concept can also be defined for a Morse function $h: \mathbb{M} \to \mathbb{R}$. Assuming an orthonormal local coordinate system at $x$, the gradient of $h$ is
\[ \nabla h(x) = \left( \frac{\partial h}{\partial x_1}(x), \ldots, \frac{\partial h}{\partial x_d}(x) \right), \]
same as for linear maps. We can define it also without reference to a coordinate system. A vector field, $X$, maps every point $x \in \mathbb{M}$ to a tangent vector $X(x) \in T\mathbb{M}_x$. The gradient is the particular vector field that satisfies $\langle X, \nabla h \rangle = X[h]$, for every vector field $X$, where $X[h]$ is the directional derivative of $h$ along $X$. For example, if we have a smooth curve $\gamma$ with velocity vector $\dot\gamma$ then the derivative of $h \circ \gamma$ can be computed using the gradient as
\[ \frac{d}{dt} h(\gamma(t)) = \langle \dot\gamma(t), \nabla h(\gamma(t)) \rangle. \]

The gradient vanishes precisely at all critical points of $h$. If we start at a regular point and follow the gradient we trace out a path, which is a solution to the ordinary differential equation $\dot\gamma(t) = \nabla h(\gamma(t))$. This path is called an integral line. It depends smoothly on the initial condition, which is its regular starting point. Two integral lines can therefore not cross. Neither can an integral line fork, and because we can reverse the gradient vector field by considering $-h$, two integral lines can also not merge. The patterns of integral lines in the neighborhoods of a regular and several critical points on a smooth 2-manifold are shown in Figure VI.5.

Figure VI.5: From left to right, the flow in the neighborhoods of a regular point, a minimum, a saddle, and a maximum.

Stable manifolds. Every regular point belongs to an integral line, and two maximal integral lines are either disjoint or the same. Every maximal integral line is open at both ends and thus a map of an open interval or, equivalently, of the real line, $\gamma: \mathbb{R} \to \mathbb{M}$. It approaches two critical points, which we refer to as its origin and destination,
\[ \mathrm{org}\,\gamma = \lim_{t \to -\infty} \gamma(t) \quad \text{and} \quad \mathrm{dest}\,\gamma = \lim_{t \to \infty} \gamma(t). \]
It is convenient to consider each critical point as an integral line by itself so that the collection of integral lines partitions $\mathbb{M}$. The stable manifold of a critical point $x$ is the union of integral lines with destination $x$ and, symmetrically, the unstable manifold is the union of integral lines with origin $x$,
\[ S(x) = \bigcup \{ \mathrm{im}\,\gamma \mid \mathrm{dest}\,\gamma = x \}, \qquad U(x) = \bigcup \{ \mathrm{im}\,\gamma \mid \mathrm{org}\,\gamma = x \}. \]

The stable manifold of a minimum is the minimum itself. The stable manifold of a saddle is an open curve, which is the union of two integral lines and the saddle itself. In a 2-manifold $\mathbb{M}$, the stable manifold of a maximum is an open disk, which is the union of a circle of integral lines and the maximum itself. All three cases are illustrated in Figure VI.6. Note that the dimension of each stable manifold is the index of the critical point that defines it, $\dim S(x) = \mathrm{ind}\,x$.

Figure VI.6: From left to right, the stable manifolds of a minimum, a saddle, and a maximum of a two-dimensional Morse function.

Each stable manifold is the injective image of an open ball. However, as indicated by the examples in Figure VI.6, the closure of a stable manifold is not necessarily homeomorphic to a closed ball. Nevertheless, the closure of each stable manifold is the union of (open) stable manifolds. The collection of stable manifolds thus satisfies the two conditions of an open complex: its cells partition $\mathbb{M}$ and the boundary of every cell is a union of other cells. By symmetry, everything we said about stable manifolds is also true for unstable manifolds. The dimension of the unstable manifold of a critical point $x$ is the co-dimension of the stable manifold, $\dim U(x) = d - \dim S(x)$.
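The notions of integral line, origin, and destination can be illustrated numerically. The following Python sketch follows the gradient of a sample function from a regular starting point by explicit Euler steps, once forward to approximate the destination and once backward to approximate the origin. The sample function $\cos x_1 + \cos x_2$, the step size, and the stopping tolerance are assumptions chosen for illustration; a careful implementation would use an adaptive integrator.

```python
import math


def gradient(x, y):
    """Gradient of the sample function h(x, y) = cos(x) + cos(y)."""
    return -math.sin(x), -math.sin(y)


def follow(x, y, direction=1.0, step=0.01, tol=1e-9, max_steps=1_000_000):
    """Follow direction * gradient from (x, y) by explicit Euler steps.

    Stops when the gradient (nearly) vanishes, that is, close to a critical
    point, and returns the point reached.
    """
    for _ in range(max_steps):
        gx, gy = gradient(x, y)
        if math.hypot(gx, gy) < tol:
            break
        x += direction * step * gx
        y += direction * step * gy
    return x, y


start = (2.0, 1.0)                            # a regular point of h
destination = follow(*start)                  # limit in the direction of the gradient
origin = follow(*start, direction=-1.0)       # limit in the direction of -gradient
print("destination:", destination)
print("origin:", origin)
```

For this sample function the maximum at $(0, 0)$ and the minimum at $(\pi, \pi)$ are the destination and origin of the integral line through the chosen starting point, so the starting point lies in the stable manifold of that maximum and in the unstable manifold of that minimum.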
Morse-Smale functions. We may refine the complexes of stable and unstable manifolds by forming unions of integral lines that agree on both limiting critical points. This amounts to overlaying the two complexes. In doing so, it is convenient to assume that the stable and unstable manifolds intersect in a generic manner. To explain what this means, we consider a point $z$ common to $S(x)$ and $U(y)$. The intersection is transversal at $z$ if the tangent spaces $TS(x)_z$ and $TU(y)_z$ span the tangent space $T\mathbb{M}_z$. Equivalently, the dimension of the intersection of the two tangent spaces is $\dim S(x) + \dim U(y) - d$.

A Morse-Smale function is a Morse function $h: \mathbb{M} \to \mathbb{R}$ whose stable and unstable manifolds intersect only transversally. For example, the height function of the upright torus in Figure VI.1 is Morse but not Morse-Smale because the stable 1-manifold of the upper saddle, $r$, meets the unstable 1-manifold of the lower saddle, $q$, along entire one-dimensional integral lines. Along these lines the two tangent spaces coincide and span only a line instead of the tangent plane. Morse-Smale functions are again dense in the set of maps from $\mathbb{M}$ to $\mathbb{R}$. In the case of the upright torus, it suffices to tilt it ever so slightly sideways in order to get transversality. Assuming a Morse-Smale function, we define the Morse-Smale complex as the collection of connected components of intersections of stable and unstable manifolds. We can see in Figure VI.7 that it is indeed necessary to take components.

Figure VI.7: Solid stable and dashed unstable 1-manifolds with overlaid dotted iso-lines of a rectangular portion of a Morse-Smale function. The two bold 2-cells share the same origin and destination.

Shape of Morse-Smale cells. Note that all 2-cells in Figure VI.7 have four sides, provided we count an arc twice if it bounds the cell on both sides. In other words, all two-dimensional Morse-Smale cells are quadrangles.

QUADRANGLE LEMMA. Every 2-cell of a two-dimensional Morse-Smale complex is a quadrangle.

PROOF. The vertices of a 2-cell alternate between saddles and other critical points, and the non-saddles alternate between minima and maxima. Any such cyclic sequence has length $4k$, for $k \ge 1$. We take two copies of a $4k$-gon and glue them together along the shared boundary. Saddles become regular points, minima remain minima, and maxima remain maxima. The result is a topological 2-sphere with $k$ minima and $k$ maxima. The Euler characteristic of the 2-sphere is $2 = k + k$, which implies $k = 1$.

The 3-cells of a Morse-Smale complex may have the structure of a cube, but they can also assume more general shapes with arbitrarily many saddles alternating between index-1 and index-2 separating the minimum from the maximum. The common features of all 3-cells are that they have one minimum and one maximum, and all 2-cells in the boundary are quadrangles. A few examples of 3-cells are shown in Figure VI.8.

Figure VI.8: Three 3-cells of a three-dimensional Morse-Smale complex. From left to right they have one, two, and three index-1 saddles and the same number of index-2 saddles.

Piecewise linear height functions. Height functions over manifolds occur in many practical problems, but they are never smooth in the mathematical sense of the word. An example is a surface of a molecule model and the electrostatic potential on this surface. The surface would typically be given as a triangulating simplicial complex $K$, as shown in Figure VI.9, and the function would be specified by its values at the vertices. Using linear interpolation, we can extend these values to a continuous function over the entire surface.

We need some definitions to explain the linear interpolation. Each point $x$ of a triangle $abc$ is a convex combination of the three vertices, $x = \alpha a + \beta b + \gamma c$ with
$\alpha + \beta + \gamma = 1$ and $\alpha, \beta, \gamma \ge 0$. The three parameters are unique and referred to as the barycentric coordinates of $x$. The value at $x$ is now defined as the analogous combination of values at the vertices,
\[ h(x) = \alpha h(a) + \beta h(b) + \gamma h(c). \]
Note that the barycentric coordinates of the vertex $a$ of $abc$ are $1, 0, 0$, which implies that the linearly interpolated $h(a)$ agrees with the value specified at $a$. Furthermore, for points $x$ along the edge $ab$ we have $\gamma = 0$. The values computed for $x$ within the two triangles that share $ab$ thus agree, which implies that $h$ is continuous.

Figure VI.9: Portion of a triangulated surface of a molecule.

Lower stars. The height function $h: |K| \to \mathbb{R}$ is continuous but not smooth. It still shares many characteristics with Morse functions. Define the star of a vertex $u$ as the collection of simplices that contain $u$, and the lower star as the subset for which $u$ is the highest vertex,
\[ \mathrm{St}\,u = \{ \sigma \in K \mid u \le \sigma \}, \qquad \mathrm{St}_{-}\,u = \{ \sigma \in \mathrm{St}\,u \mid v \le \sigma \Rightarrow h(v) \le h(u) \}, \]
where $v \le \sigma$ means that $v$ is a vertex of $\sigma$. It is convenient to assume pairwise different height values at all vertices so that each simplex belongs to exactly one lower star. With this assumption, the lower stars partition the complex $K$. Figure VI.10 illustrates the definitions by showing the lower stars of vertices that behave like regular points, minima, saddles, and maxima. More complicated lower stars are possible, and we cannot remove them just by perturbing the height values. Instead, we may consider a vertex whose circle of neighbors alternates $k+1$ times between lower and higher values of $h$ as a $k$-fold saddle. This interpretation is consistent with the result that the alternating sum of critical points is equal to the Euler characteristic of $\mathbb{M}$. The alternating sums of simplices in the lower stars of a regular point, minimum, saddle, maximum, and $k$-fold saddle are $0$, $1$, $-1$, $1$, and $-k$. It follows immediately that $\chi(\mathbb{M})$ is the number of minima and maxima minus the number of saddles counted with multiplicity.

Figure VI.10: The star of every vertex in the triangulation of a 2-manifold is an open disk. The shaded portions are lower stars.

Another similarity between smooth and piecewise linear height functions arises when we sweep in the direction of increasing height. Assuming $h(u_i) \ne h(u_j)$ for all $i \ne j$, we sort the vertices in the order of increasing height. Indexing the vertices accordingly, we define $K_i$ as the union of the first $i$ lower stars and note that $K_i$ is a simplicial complex. The sequence of complexes
\[ \emptyset = K_0 \subset K_1 \subset \ldots \subset K_n = K \]
is a filtration and a discrete version of the evolution of $\mathbb{M}_a$ during the sweep. Adding the lower star of a regular point does not change the homotopy type of $K_i$, and adding the lower star of a critical point is similar to attaching a cell in the smooth case.

Bibliographic notes. The gradient and related concepts from vector calculus are intuitively described in the booklet by Schey [3]. The transversality condition for stable and unstable manifolds has its origin in dynamical systems and is named after Steve Smale [4]. The Morse-Smale complex has been introduced recently in [2] along with algorithms for piecewise linear height functions over 2-manifolds. The idea of writing a triangulated manifold as the disjoint union of lower stars goes back to Banchoff [1].

[1] T. F. BANCHOFF. Critical points and curvature for embedded polyhedra. J. Differential Geometry 1 (1967), 245–256.
[2] H. EDELSBRUNNER, J. HARER AND A. ZOMORODIAN. Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds. Discrete Comput. Geom., to appear.
[3] H. M. SCHEY. Div, Grad, Curl and All That. An Informal Text on Vector Calculus. Second edition, Norton, New York, 1992.
[4] S. SMALE. The Mathematics of Time. Essays on Dynamical Systems, Economic Processes, and Related Topics.
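The lower star construction of Section VI.2 can be sketched in code. The Python fragment below assigns every edge and triangle of a triangulated closed 2-manifold to the lower star of its highest vertex, computes the alternating sum of simplices in each lower star, and uses that sum to label vertices as regular, minimum, maximum, or $k$-fold saddle. The function names and the octahedron example with its height values are assumptions made for illustration; this is not the algorithm of [2].

```python
from itertools import combinations


def lower_star_sums(triangles, height):
    """Alternating sum (vertices - edges + triangles) of every lower star.

    triangles: list of vertex triples triangulating a closed 2-manifold.
    height:    dict assigning pairwise different heights to the vertices.
    """
    edges = {frozenset(e) for t in triangles for e in combinations(t, 2)}
    sums = {v: 1 for v in height}              # each lower star contains its vertex
    for e in edges:
        sums[max(e, key=height.get)] -= 1      # an edge belongs to its highest vertex
    for t in triangles:
        sums[max(t, key=height.get)] += 1      # and so does a triangle
    return sums


def classify_vertices(triangles, height):
    """Label each vertex as regular, minimum, maximum, or k-fold saddle."""
    has_lower_neighbor = {v: False for v in height}
    for t in triangles:
        for a, b in combinations(t, 2):
            has_lower_neighbor[max((a, b), key=height.get)] = True
    kinds = {}
    for v, s in lower_star_sums(triangles, height).items():
        if s == 1:
            # Sum 1 occurs only if the lower link is empty or the full circle.
            kinds[v] = "maximum" if has_lower_neighbor[v] else "minimum"
        elif s == 0:
            kinds[v] = "regular"
        else:
            kinds[v] = f"{-s}-fold saddle"
    return kinds


# Hypothetical example: an octahedron (a triangulated 2-sphere) with two poles
# T and B and four equator vertices; heights are chosen to create two saddles.
equator = ["E0", "E1", "E2", "E3"]
triangles = [(pole, equator[i], equator[(i + 1) % 4])
             for pole in ("T", "B") for i in range(4)]
height = {"E0": 0, "E2": 1, "B": 2, "T": 3, "E1": 4, "E3": 5}

kinds = classify_vertices(triangles, height)
chi = sum(lower_star_sums(triangles, height).values())
print(kinds)
print("Euler characteristic:", chi)
```

Because the lower stars partition the complex, summing the per-vertex alternating sums gives the Euler characteristic of the whole triangulation, which is the piecewise linear counterpart of counting minima and maxima minus saddles.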
VI.3 Construction and Simplification
[Explain the sweep construction for two-dimensional Morse-Smale complexes using the simulation of differentiability.] [The most important part of the algorithm is maybe the handle slide, which is the only restructuring operation necessary to go between different complexes.] [That operation has been used in early work on Morse theory, maybe the first time by Smale(?).]
[Build a hierarchy through prioritized cancellation.] [We
can describe the cancellation as a combinatorial restruc-
turing operation and we only need this one to go up the
hierarchy.] [Again, there should be reference to the early
mathematics literature on the topic of cancellation.]
Bibliographic notes.
[1] C. L. BAJAJ, V. PASCUCCI AND D. R. SCHIKORE. Visualization of scalar topology for structural enhancement. In “Proc. 9th Ann. IEEE Conf. Visualization, 1998”, 18–23.
[2] H. EDELSBRUNNER, J. HARER AND A. ZOMORODIAN. Hierarchy of Morse-Smale complexes for piecewise linear 2-manifolds. Discrete Comput. Geom., to appear.
[3] H. EDELSBRUNNER, J. HARER, V. NATARAJAN AND V. PASCUCCI. Hierarchy of Morse-Smale complexes for piecewise linear 3-manifolds. Manuscript, Dept. Comput. Sci., Duke Univ., Durham, North Carolina, 2001.
[4] M. VAN KREVELD, R. VAN OOSTRUM, C. L. BAJAJ, V. PASCUCCI AND D. R. SCHIKORE. Contour trees and small seed sets for iso-surface traversal. In “Proc. 13th Ann. Sympos. Comput. Geom., 1997”, 212–220.
VI.4 Simultaneous Critical Points

[Explain the work with John on the topic and mention papers by Hassler Whitney and books in Catastrophe Theory.]
[1] V. I. ARNOL'D. Catastrophe Theory. Third edition, Springer-Verlag, Berlin, Germany, 1992.
[2] H. EDELSBRUNNER AND J. HARER. Jacobian submanifolds of multiple Morse functions. Manuscript, Duke Univ., Durham, North Carolina, 2002.
[3] T. POSTON AND I. STEWART. Catastrophe Theory and Its Applications. Dover, Mineola, New York, 1978.
Exercises
The credit assignment reflects a subjective assessment of
difficulty. Every question can be answered using the ma-
terial presented in this chapter.

1. Section of triangulation. (2 credits). Let be a

triangulation of a set of points in the plane. Let
@
be a line that avoids all points. Prove that intersects
at most edges of and that this upper bound

is tight for every

.
Chapter VII
Match and Fit
As a general theme in biology, questions are almost always about populations and rarely about individuals. This is particularly true on the molecular level. The molecules that participate in the mechanism of life tend to be large and composed of small molecules. Minor variations in the type or arrangement of the components are frequently inessential and do not alter the role of a molecule within the larger organization. But then again, there are seemingly small variations that do have significant consequences. The underlying question is one of definition: when do we call two molecules the same or of the same type, and how do we quantify and assess that notion of sameness? There are various approaches to the question applied to proteins, including the comparison of amino acid sequences, space curves modeling backbones, and shapes formed by space-filling diagrams. Instead of asking how similar two shapes are, we may also ask the related question of how well two shapes fit side by side. The complementarity question is a similarity question between one shape and (a portion of) the complement of another shape. It really makes sense only for space-filling diagrams and does not seem to apply to information expressed in terms of sequences and space curves. The similarity question is at the core of human understanding, which crucially relies on classification to simplify and create order. The complementarity question, on the other hand, is at the root of natural and other re-production processes and it takes part in protein interaction, which forms the basis of functioning life.

As always in this book, we focus on mathematical and algorithmic methods that shed light on the broader biological issues. In Section VII.1, we explore rigid motions in three-dimensional Euclidean space and introduce quaternions as a tool to specify and compute with rotations. In Section VII.2, we study the problem of finding the best rigid motion for matching one point set with another. The measure of choice is the root mean square distance between the two sets. In Section VII.3, we look at the related problems of sampling a rigid motion and of covering the space of such motions with small neighborhoods. In Section VII.4, we apply the methods to questions of similarity and complementarity. In particular, we look at the problem of identifying matching subsequences with minimum root mean square distance and at score functions that assess the shape complementarity of two space-filling diagrams.

VII.1 Rigid Motions
VII.2 Optimum Motion
VII.3 Sampling and Covering
VII.4 Alignment
Exercises
VII.1 Rigid Motions

A motion in three-dimensional Euclidean space can be decomposed into a rotation and a translation. In this section we consider different ways to mathematically represent rotations, and we focus on quaternions, which provide a particularly elegant mathematical framework.

Rotation and translation. A rigid motion in $\mathbb{R}^3$ is an orientation-preserving isometry of three-dimensional Euclidean space. More formally, it is a map $\mu: \mathbb{R}^3 \to \mathbb{R}^3$ that preserves orientation and satisfies $\|\mu(x) - \mu(y)\| = \|x - y\|$ for every pair $x, y \in \mathbb{R}^3$. As illustrated in Figure VII.1, a rotation is a rigid motion that preserves the origin, and a translation is a rigid motion that preserves difference vectors. Every rigid motion can be written as the composition of a rotation and a translation: $\mu = \tau \circ \rho$. Using matrix notation, we can write $\mu(x) = Rx + t$, where $R$ is an orthonormal 3-by-3 matrix with unit determinant and $t$ is a 3-vector:

$$ R = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix}, \qquad t = \begin{pmatrix} t_1 \\ t_2 \\ t_3 \end{pmatrix}. $$

Figure VII.1: The translation of the boldface original coordinate system preserves the directions of the axes while the rotation preserves their anchor point.

The rotation matrix moves the unit coordinate vectors $e_1$, $e_2$ and $e_3$ to the vectors $Re_1$, $Re_2$ and $Re_3$ that make up the columns of $R$. A rotation about a coordinate axis has a comparatively simple rotation matrix. For example, rotating by an angle $\varphi$ about the $x_1$-axis gives

$$ R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{pmatrix}. $$

The angle of rotation about a coordinate axis is referred to as an Euler angle. Leonhard Euler proved that any rotation can be obtained by a sequence of three rotations about coordinate axes. In general, the composition of any two rotations is another rotation. Indeed, the rotations form the so-called special orthogonal group of 3-by-3 matrices, abbreviated as $SO(3)$. Note, however, that this group is not abelian because the multiplication of matrices and therefore the composition of rotations is not commutative. It is important to specify the Euler angles in a fixed sequence as other sequences of the same angles usually specify different rotations. It is mostly true that two different triplets of angles specify different rotations, but there are exceptions. Consider for example a rotation by $\varphi$ about the $x_3$-axis, followed by a rotation by $\pi/2$ about the $x_2$-axis, followed by a rotation by $\psi$ about the $x_1$-axis, and note that we get the same composite rotation if we switch $\varphi$ and $\psi$. In other words, the map $\mathbb{S}^1 \times \mathbb{S}^1 \times \mathbb{S}^1 \to SO(3)$ is not injective. This suggests that the Cartesian product of three circles is not an appropriate model and we will indeed see shortly that $\mathbb{S}^1 \times \mathbb{S}^1 \times \mathbb{S}^1$ is not homeomorphic to the space of rotations.

Quaternions. As an alternative to orthonormal 3-by-3 matrices, we may use quaternions to represent rotations. Quaternions can be viewed as a generalization of complex numbers:

$$ q = q_0 + q_1 I + q_2 J + q_3 K, $$

where $q_0$, $q_1$, $q_2$ and $q_3$ are real numbers and $I$, $J$ and $K$ are three different imaginary units. In preparation of an operation that multiplies two quaternions, we specify how to multiply the imaginary units:

$$ I^2 = J^2 = K^2 = -1, $$
$$ IJ = K, \quad JK = I, \quad KI = J, $$
$$ JI = -K, \quad KJ = -I, \quad IK = -J. $$

Note that reversing two different imaginary units changes the sign of the result. If $p = p_0 + p_1 I + p_2 J + p_3 K$ is another quaternion then the product of $q$ and $p$ is

$$ qp = (q_0p_0 - q_1p_1 - q_2p_2 - q_3p_3) + (q_0p_1 + q_1p_0 + q_2p_3 - q_3p_2)\, I + (q_0p_2 - q_1p_3 + q_2p_0 + q_3p_1)\, J + (q_0p_3 + q_1p_2 - q_2p_1 + q_3p_0)\, K. $$

The product $pq$ has a similar form but six of the terms have their signs changed. Sometimes it is more convenient to
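think of a quaternion as a vector in $\mathbb{R}^4$. We can express the product of two quaternions in terms of an orthogonal 4-by-4 matrix and a vector. This can be done by expanding either the first or the second quaternion to a matrix:

$$ qp = \begin{pmatrix} q_0 & -q_1 & -q_2 & -q_3 \\ q_1 & q_0 & -q_3 & q_2 \\ q_2 & q_3 & q_0 & -q_1 \\ q_3 & -q_2 & q_1 & q_0 \end{pmatrix} p = Qp, \qquad pq = \begin{pmatrix} q_0 & -q_1 & -q_2 & -q_3 \\ q_1 & q_0 & q_3 & -q_2 \\ q_2 & -q_3 & q_0 & q_1 \\ q_3 & q_2 & -q_1 & q_0 \end{pmatrix} p = \bar{Q}p. $$

Take a moment to verify that the matrices $Q$ and $\bar{Q}$ are indeed orthogonal. $\bar{Q}$ differs from $Q$ by having the lower right 3-by-3 submatrix transposed. While the product of two quaternions is another quaternion, the scalar product is a real number:

$$ \langle q, p \rangle = q_0p_0 + q_1p_1 + q_2p_2 + q_3p_3. $$

As usual, we can use the scalar product to define the length of a vector: $\|q\| = \sqrt{\langle q, q \rangle}$. Similar to complex numbers, the conjugate of a quaternion $q$ is obtained by negating the imaginary parts: $q^* = q_0 - q_1 I - q_2 J - q_3 K$. Observe that the matrices associated with $q^*$ are the transposes of those associated with $q$. Since the matrices are orthogonal, the products with their transposes are diagonal: $QQ^T = \bar{Q}\bar{Q}^T = \langle q, q \rangle E$, where $E$ is the 4-by-4 identity matrix. Similarly, the imaginary parts vanish when we multiply a quaternion with its conjugate: $qq^* = \langle q, q \rangle$. This implies that every non-zero quaternion has an inverse, namely $q^{-1} = q^* / \langle q, q \rangle$. In the special case when $q$ has unit length, we have $\langle q, q \rangle = 1$, and $q^{-1} = q^*$.

Representing rotations. We use purely imaginary quaternions to represent vectors in $\mathbb{R}^3$ and compound multiplication with unit quaternions to represent rotations. We start with a few properties, always assuming $\|q\| = 1$. First, the scalar product is preserved if we multiply with $q$. This is true from either side and we show it for multiplication from the left:

$$ \langle qp, qr \rangle = (Qp)^T (Qr) = p^T Q^T Q r = \langle p, r \rangle $$

because $Q^T Q = E$. This implies in particular that multiplying with $q$ also preserves length: $\|qp\| = \|p\|$. Same as rotation, multiplication with a unit quaternion neither changes the angle nor the length. However, we cannot use simple multiplications to represent rotations because the product of a unit quaternion and a purely imaginary quaternion is not in general purely imaginary. Instead, we use the composite product $p' = qpq^*$. Observe that

$$ qpq^* = (Qp)q^* = \bar{Q}^T (Qp) = (\bar{Q}^T Q)p, $$

where $Q$ and $\bar{Q}$ are the 4-by-4 matrices that correspond to $q$. We expand the product of the two matrices in Table VII.1 and see that $p'$ is purely imaginary whenever $p$ is. Furthermore, since $\|q\| = 1$, both $Q$ and $\bar{Q}$ are orthonormal. It follows that the lower right 3-by-3 submatrix of $\bar{Q}^T Q$ is also orthonormal. This 3-by-3 matrix is the familiar rotation matrix that takes $p$ to $p'$. The justification for $qpq^*$ to represent a rotation is not yet complete. Another possibility is that it represents a reflection, which also preserves scalar products. However, a reflection reverses the orientation of a sequence of three vectors, and we can check that composite multiplication does not. To do this, we think of a quaternion as composed of a scalar and a vector, $q = (q_0, \mathbf{q})$ with $\mathbf{q} = (q_1, q_2, q_3)$. The rules for computing with quaternions can be rewritten as

$$ qp = (q_0p_0 - \langle \mathbf{q}, \mathbf{p} \rangle,\; q_0\mathbf{p} + p_0\mathbf{q} + \mathbf{q} \times \mathbf{p}), \qquad \langle q, p \rangle = q_0p_0 + \langle \mathbf{q}, \mathbf{p} \rangle. $$

When $p$ and $r$ are purely imaginary then these results simplify to $pr = (-\langle \mathbf{p}, \mathbf{r} \rangle,\; \mathbf{p} \times \mathbf{r})$ and $\langle p, r \rangle = \langle \mathbf{p}, \mathbf{r} \rangle$. If we now apply the composite product with a unit quaternion $q$, we get $p' = qpq^*$ and $r' = qrq^*$. Notice that

$$ p'r' = (qpq^*)(qrq^*) = q(pr)q^*. $$

Hence, $p'r'$ is the result of applying the composite product with the unit quaternion $q$ to $pr$, which shows that the composite product preserves cross-products, as required.

$$ \bar{Q}^T Q = \begin{pmatrix} \langle q, q \rangle & 0 & 0 & 0 \\ 0 & q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1q_2 - q_0q_3) & 2(q_1q_3 + q_0q_2) \\ 0 & 2(q_2q_1 + q_0q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_2q_3 - q_0q_1) \\ 0 & 2(q_3q_1 - q_0q_2) & 2(q_3q_2 + q_0q_1) & q_0^2 - q_1^2 - q_2^2 + q_3^2 \end{pmatrix} $$

Table VII.1: Product of matrices in the representation of a rotation by composite multiplication with unit quaternions.

Axis and angle. The expansion of $\bar{Q}^T Q$ given in Table VII.1 provides an explicit method for computing the orthonormal rotation matrix from the unit quaternion. In the reverse direction, we show that the rotation by an angle $\theta$ about the axis defined by the unit vector $u = (u_1, u_2, u_3)$ can be represented by the unit quaternion

$$ q = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\,(u_1 I + u_2 J + u_3 K). $$

As illustrated in Figure VII.2, an observer who looks against the direction of the axis sees the vector rotate in a counterclockwise order. The imaginary part of $q$ gives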
the direction of the rotation axis, and the real part determines the angle of the rotation. Note that $-q$ represents the same rotation as $q$ and that non-antipodal pairs of unit quaternions represent different rotations. In other words, the unit sphere $\mathbb{S}^3$ in $\mathbb{R}^4$ is a double cover of the space of rotations in $\mathbb{R}^3$. Figure VII.3 illustrates the correspondence with a picture in one lower dimension. The space obtained by identifying antipodal points of $\mathbb{S}^3$ is usually referred to as the real projective three-dimensional space, or $\mathbb{RP}^3$ for short. It is a good model of the set of rotations in $\mathbb{R}^3$, although we usually prefer $\mathbb{S}^3$ because it is easier to imagine.

Figure VII.2: The rotation of the vector $r$ by an angle of $\theta$ about the line spanned by $u$. The three dotted vectors correspond to the terms in the formula of Rodrigues.

Figure VII.3: The north- and south-poles correspond to the identity, and points on the equator correspond to rotations by $\pi$. The dashed great-circle through the two poles represents the set of rotations about a fixed axis.

To prove the claimed correspondence, we write the vector $r'$ obtained by rotating $r$ by $\theta$ about the axis defined by $u$ using the formula of Rodrigues,

$$ r' = \cos\theta \; r + \sin\theta \; (u \times r) + (1 - \cos\theta) \langle r, u \rangle u, $$

which can be seen from Figure VII.2. We show that this can also be written in the form $r' = qrq^*$, where $r$ is interpreted as a purely imaginary quaternion, and $q = (q_0, \mathbf{q})$ as given above, with $\mathbf{q} = (q_1, q_2, q_3)$ its vector part. Tedious but straightforward calculations show

$$ qrq^* = (q_0^2 - \langle \mathbf{q}, \mathbf{q} \rangle)\, r + 2q_0\, (\mathbf{q} \times r) + 2 \langle \mathbf{q}, r \rangle\, \mathbf{q}. $$

If we substitute $q_0 = \cos\frac{\theta}{2}$ and $\mathbf{q} = \sin\frac{\theta}{2}\, u$ and use the identities $\cos\theta = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2}$ and $\sin\theta = 2 \sin\frac{\theta}{2} \cos\frac{\theta}{2}$ then we obtain the formula of Rodrigues.

Composing rotations. The above relationships provide a convenient conversion between unit quaternions and axis-angle pairs. We have $q_0 = \cos\frac{\theta}{2}$ and $\mathbf{q} = \sin\frac{\theta}{2}\, u$, and conversely $\theta = 2 \arccos q_0$ and $u = \mathbf{q} / \|\mathbf{q}\|$. The composition of two rotations represented by the unit quaternions $q$ and $p$ is

$$ p(qrq^*)p^* = (pq)\, r\, (pq)^*. $$

Thus, composition of rotations corresponds to multiplication of quaternions, and from the product it is easy to again get the axis and the angle. A more direct geometric description of the composition of two rotations uses the fact that every rotation can be written as the composition of two reflections, as illustrated in Figure VII.4. The two planes defining the reflections are not unique; they just need to pass through the axis of rotation and enclose half the angle of rotation. To compose two rotations, we write each as the composition of two reflections, making sure that the second plane of the first rotation is also the first plane of the second rotation, as in Figure VII.4. The middle two reflections cancel and we are left with two reflections. The axis of the corresponding rotation is the line common to the two planes, and the angle of rotation is twice the angle enclosed by the planes.
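
The correspondence between axis-angle pairs, unit quaternions and the formula of Rodrigues can be checked numerically. The following is a minimal sketch in plain Python, not part of the original text. The function names (qmul, conj, from_axis_angle, to_axis_angle, rotate, rodrigues) are chosen here for illustration, and quaternions are assumed to be stored as 4-tuples (real part first).

```python
import math

def qmul(q, p):
    """Quaternion product qp for 4-tuples (real, I, J, K)."""
    q0, q1, q2, q3 = q
    p0, p1, p2, p3 = p
    return (q0*p0 - q1*p1 - q2*p2 - q3*p3,
            q0*p1 + q1*p0 + q2*p3 - q3*p2,
            q0*p2 - q1*p3 + q2*p0 + q3*p1,
            q0*p3 + q1*p2 - q2*p1 + q3*p0)

def conj(q):
    """Conjugate: negate the three imaginary parts."""
    return (q[0], -q[1], -q[2], -q[3])

def from_axis_angle(u, theta):
    """Unit quaternion for the rotation by theta about the unit vector u."""
    s = math.sin(theta / 2.0)
    return (math.cos(theta / 2.0), s * u[0], s * u[1], s * u[2])

def to_axis_angle(q):
    """Axis-angle pair of a unit quaternion (arbitrary axis for the identity)."""
    norm = math.sqrt(q[1]**2 + q[2]**2 + q[3]**2)
    if norm == 0.0:
        return (1.0, 0.0, 0.0), 0.0
    theta = 2.0 * math.atan2(norm, q[0])
    return (q[1] / norm, q[2] / norm, q[3] / norm), theta

def rotate(q, r):
    """Composite product q r q* applied to the 3-vector r."""
    x = qmul(qmul(q, (0.0, r[0], r[1], r[2])), conj(q))
    return (x[1], x[2], x[3])

def rodrigues(u, theta, r):
    """Rotate r by theta about the unit vector u with the formula of Rodrigues."""
    c, s = math.cos(theta), math.sin(theta)
    dot = u[0]*r[0] + u[1]*r[1] + u[2]*r[2]
    cross = (u[1]*r[2] - u[2]*r[1],
             u[2]*r[0] - u[0]*r[2],
             u[0]*r[1] - u[1]*r[0])
    return tuple(c*r[k] + s*cross[k] + (1.0 - c)*dot*u[k] for k in range(3))
```

For a quarter turn qz = from_axis_angle((0, 0, 1), math.pi / 2), rotate(qz, r) and rodrigues((0, 0, 1), math.pi / 2, r) agree up to rounding, and rotate(qmul(qx, qz), r) agrees with rotate(qx, rotate(qz, r)) for any second unit quaternion qx, which is the composition rule stated above. Negating all four components of qz leaves the rotation unchanged, in line with the double cover.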
Figure VII.4: We see three rotations defined by the axis-angle pairs $(u, \varphi)$, $(v, \psi)$ and $(w, \rho)$. Each rotation is the composition of two reflections illustrated by the great-circles at which their planes meet the sphere.
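
The passage from a unit quaternion to its rotation matrix can also be spelled out in code. The sketch below is an illustration rather than the book's own implementation: it builds the two 4-by-4 matrices that express multiplication with $q$ from the left and from the right, multiplies the transpose of the second with the first, and reads off the lower right 3-by-3 block. The helper names (left_matrix, right_matrix, composite_matrix, rotation_matrix) are hypothetical.

```python
import math

def left_matrix(q):
    """Matrix Q with qp = Qp (q expanded as the first factor)."""
    q0, q1, q2, q3 = q
    return [[q0, -q1, -q2, -q3],
            [q1,  q0, -q3,  q2],
            [q2,  q3,  q0, -q1],
            [q3, -q2,  q1,  q0]]

def right_matrix(q):
    """Matrix Qbar with pq = Qbar p (q expanded as the second factor)."""
    q0, q1, q2, q3 = q
    return [[q0, -q1, -q2, -q3],
            [q1,  q0,  q3, -q2],
            [q2, -q3,  q0,  q1],
            [q3,  q2, -q1,  q0]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def composite_matrix(q):
    """4-by-4 matrix of the map p -> q p q*, namely Qbar^T Q."""
    return matmul(transpose(right_matrix(q)), left_matrix(q))

def rotation_matrix(q):
    """Lower right 3-by-3 block of Qbar^T Q for a unit quaternion q."""
    return [row[1:] for row in composite_matrix(q)[1:]]
```

For the quarter turn about the $x_3$-axis, q = (cos(pi/4), 0, 0, sin(pi/4)), the first row and column of composite_matrix(q) reduce to (1, 0, 0, 0) and rotation_matrix(q) maps the first coordinate vector to the second, up to rounding.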
Bibliographic notes. The exposition of quaternions and their connection to rotations chosen for this section follows [2]. It is commonly acknowledged that quaternions were discovered by Hamilton in 1843 and first published in 1844 [1]. It is less well known that a few years earlier, Rodrigues studied the composition of rotations in space and gave a purely geometric explanation that is equivalent to Hamilton’s algebra [5]. Even earlier, Gauss recorded his discovery of quaternions in his unpublished notebook in 1819. We recommend the primer by Kuipers [3] for background on rotations and the text by Needham [4] for background on the more general context provided by complex analysis.
[1] W. R. HAMILTON. On a new species of imaginary quantities connected with the theory of quaternions. Irish Acad. Proc. 2 (1844), 424–434.
[2] B. K. P. HORN. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[3] J. B. KUIPERS. Quaternions and Rotation Sequences. Princeton Univ. Press, New Jersey, 1999.
[4] T. NEEDHAM. Visual Complex Analysis. Clarendon Press, Oxford, England, 1997.
[5] O. RODRIGUES. Des lois géométriques qui régissent les déplacements d’un système solide dans l’espace, et de la variation des coordonnées provenant de ces déplacements considérés indépendamment des causes qui peuvent les produire. J. Math. Pures Appl. 5 (1840), 380–440.
VII.2 Optimum Motion

In this section, we study an optimization problem that arises when one attempts to match two molecular structures or to fit two structures snug next to each other. After formulating the optimization problem, we solve it using quaternions representing rotations in three-dimensional space.

Problem specification. Suppose we are given two finite collections of points in $\mathbb{R}^3$ and a bijection between them. While entertaining the possibility that the two collections are structurally the same or at least similar, we are interested in moving one collection so it best matches the other. We need some notation to make this precise. Let $A = \{a_1, a_2, \ldots, a_n\}$ and $B = \{b_1, b_2, \ldots, b_n\}$ be the two collections and assume that $a_i$ corresponds to $b_i$, for each $i$. We use the root mean square or RMS distance to assess how similar the two collections are. This measure is the square root of the average square distance:

$$ \mathrm{RMSD}(A, B) = \sqrt{\frac{1}{n} \sum_{i=1}^n \|a_i - b_i\|^2}. $$

Given a rigid motion $\mu: \mathbb{R}^3 \to \mathbb{R}^3$, we may apply it to the first collection and recompute the root mean square distance. We are interested in finding the rigid motion $\mu$ that minimizes the root mean square distance between $\mu(A)$ and $B$. Note that minimizing the root mean square distance is equivalent to minimizing the sum of square distances. Recall also that every rigid motion can be decomposed into a rotation followed by a translation. The space of rigid motions is therefore six-dimensional, namely $SO(3) \times \mathbb{R}^3$, and it might seem that computing the particular rigid motion that minimizes $\mathrm{RMSD}(\mu(A), B)$ would be hopeless or at least difficult. Quite the opposite is true, and the main reason for this is the convenience provided by quadratic functions. We consider rotations and translations separately.

Optimum translation. Recall that the centroid of a collection of points is the average of the points. More formally, the centroids of $A$ and of $B$ are $\bar{a} = \frac{1}{n} \sum_i a_i$ and $\bar{b} = \frac{1}{n} \sum_i b_i$. We begin by showing that the best translation is the one that moves $\bar{a}$ to $\bar{b}$. In other words, the translation that minimizes the root mean square distance between $A + t$ and $B$ is defined by $t = \bar{b} - \bar{a}$. A crucial insight used in proving this fact is that the centroid is the only point for which the sum of the vectors to the points in the collection vanishes:

$$ \sum_{i=1}^n (a_i - \bar{a}) = 0. $$

This implies that $\bar{a}$ minimizes the sum of square distances from the $a_i$. Indeed, $f(x) = \sum_i \|a_i - x\|^2$ is a quadratic function with a unique minimum. That minimum is characterized by a vanishing gradient:

$$ \nabla f(x) = -2 \sum_{i=1}^n (a_i - x) = 0. $$

As mentioned earlier, the latter sum vanishes iff $x = \bar{a}$.

We are now ready to prove that the best translation is the one that moves $\bar{a}$ to $\bar{b}$. Let us move every point $b_i$ to the origin of $\mathbb{R}^3$ and move the translated copy of $a_i$ with it to $a_i + t - b_i$. This operation is illustrated in Figure VII.5. Then the sum of square distances between the corresponding points, $\sum_i \|a_i + t - b_i\|^2$, is also the sum of square distances of the points $a_i + t - b_i$ from the origin. The translation minimizes the sum iff the origin is the centroid of the points $a_i + t - b_i$:

$$ \frac{1}{n} \sum_{i=1}^n (a_i + t - b_i) = \bar{a} + t - \bar{b} = 0. $$

This implies that the best translation moves $\bar{a}$ to $\bar{b}$, as claimed.

Figure VII.5: After moving the shaded points to the origin, the (solid) difference vectors all radiate out from the origin.

Optimum rotation. Note that rotating and taking the centroid commute. In other words, the centroid of $\rho(A)$ is $\rho(\bar{a})$. Since every rigid motion can be written as a rotation followed by a translation, $\mu = \tau \circ \rho$, the motion can be optimal only if $\tau$ translates the centroid of $\rho(A)$ to the centroid of $B$. We may therefore simplify our problem by translating $A$ and independently translating $B$ such that
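both centroids lie at the origin. Equivalently, we may assume $\bar{a} = \bar{b} = 0$. Using quaternions, we can express the rotation of a point $a_i$ as $q a_i q^*$, where $q$ is a unit quaternion and $a_i$ is the pure imaginary quaternion that corresponds to $a_i \in \mathbb{R}^3$, as explained in Section VII.1. The sum of the square distances after the rotation is

$$ \sum_{i=1}^n \|q a_i q^* - b_i\|^2 = \sum_{i=1}^n \|q a_i q^*\|^2 - 2 \sum_{i=1}^n \langle q a_i q^*, b_i \rangle + \sum_{i=1}^n \|b_i\|^2. $$

The sums of the $\|q a_i q^*\|^2 = \|a_i\|^2$ and the $\|b_i\|^2$ are not affected by the rotation, so minimizing the sum of square distances is equivalent to maximizing the sum of the $\langle q a_i q^*, b_i \rangle$. Since multiplication with a unit quaternion preserves scalar products, we have $\langle q a_i q^*, b_i \rangle = \langle q a_i, b_i q \rangle$. Recall from the previous section that both products can be written with 4-by-4 matrices, $q a_i = \bar{A}_i q$ and $b_i q = B_i q$. Dropping the index $i$ and writing $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$ for the coordinates of the two points, the matrices are

$$ \bar{A} = \begin{pmatrix} 0 & -a_1 & -a_2 & -a_3 \\ a_1 & 0 & a_3 & -a_2 \\ a_2 & -a_3 & 0 & a_1 \\ a_3 & a_2 & -a_1 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & -b_1 & -b_2 & -b_3 \\ b_1 & 0 & -b_3 & b_2 \\ b_2 & b_3 & 0 & -b_1 \\ b_3 & -b_2 & b_1 & 0 \end{pmatrix}. $$

The two matrices are skew symmetric as well as orthogonal. The sum that we have to maximize can now be rewritten as

$$ \sum_{i=1}^n \langle \bar{A}_i q, B_i q \rangle = \sum_{i=1}^n q^T \bar{A}_i^T B_i q = q^T \Big( \sum_{i=1}^n N_i \Big) q = q^T N q, $$

where $N_i = \bar{A}_i^T B_i$. Take a moment to verify that each matrix in this sum is symmetric. Since the sum of symmetric matrices is again symmetric, we have $N = N^T$.

Eigenvalues and -vectors. We can interpret $q^T N q$ geometrically as a quadratic function over four-dimensional Euclidean space. Short of being able to draw the graph of this function in $\mathbb{R}^5$, we illustrate the idea in Figure VII.6, which drops two of the dimensions. Our goal is to find a point $q \in \mathbb{S}^3$ for which the quadratic function gives a maximum. We can compute such a $q$ with a modest amount of linear algebra.

Figure VII.6: The plane represents $\mathbb{R}^4$, the partially dotted circle represents $\mathbb{S}^3$, the surface represents the graph of the quadratic function over $\mathbb{R}^4$, the dashed lines represent the zero-set and the boldface curve represents the graph of the restriction of that function to $\mathbb{S}^3$.

Recall that the eigenvalues of a square matrix $N$ are the complex numbers $\lambda$ for which the determinant of $N - \lambda E$ vanishes. The corresponding eigenvectors are the unit vectors $e$ such that $Ne = \lambda e$. For the 4-by-4 matrix $N$ defined above, we have four eigenvalues, and because $N$ is symmetric, the eigenvalues are all real. It is convenient to order them as $\lambda_1 \geq \lambda_2 \geq \lambda_3 \geq \lambda_4$. The corresponding eigenvectors $e_1, e_2, e_3, e_4$ can be chosen pairwise orthogonal and therefore span $\mathbb{R}^4$. We can thus write any quaternion as a linear combination of the eigenvectors, $q = \sum_j \alpha_j e_j$, and because we are only interested in unit quaternions, we have $\sum_j \alpha_j^2 = 1$. Hence

$$ q^T N q = \Big\langle \sum_{j=1}^4 \alpha_j e_j, \sum_{j=1}^4 \alpha_j \lambda_j e_j \Big\rangle = \sum_{j=1}^4 \alpha_j^2 \lambda_j. $$

By the assumed ordering of the eigenvalues, we have $q^T N q \leq \lambda_1$, and this maximum is attained for $\alpha_1 = 1$. The corresponding quaternion is $q = e_1$. In other words, the optimum rotation is defined by the unit eigenvector that corresponds to the largest eigenvalue.

Without bijection. If there is no bijection specified between the two sets then the problem of finding the best rigid motion seems significantly more difficult. Assuming $A$ and $B$ contain $n$ points each, we could of course try all $n!$ bijections, but that would take a long time. A more effective algorithm alternates between improving the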
root mean square distance by changing the bijection and by changing the motion. Note that independent of the bijection, the best translation always moves the centroid of $A$ to the centroid of $B$. So we may again assume that both centroids are at the origin and restrict ourselves to rotations. We use three subroutines to describe the iterative algorithm. For a given rotation $\rho$, MATCH returns the permutation $\pi$ that minimizes the root mean square distance between $\rho(A)$ and $B$. Given a permutation, ROTATE returns the rotation that minimizes the root mean square distance under this permutation. Finally, given a permutation and a rotation, RMSD returns the root mean square distance.

    δ = ∞; ρ = identity;
    loop π = MATCH(ρ); ρ = ROTATE(π);
      if RMSD(π, ρ) < δ then δ = RMSD(π, ρ)
        else exit
      endif
    forever.

After each iteration, the root mean square distance decreases. This implies that no permutation is tried twice. Since there are only finitely many permutations, it follows that the algorithm halts. Note however that we neither have a polynomial bound on the number of iterations nor a guarantee that the algorithm finds the globally optimal solution.

A popular version of the above algorithm uses injections from $A$ to $B$ instead of bijections. Sometimes this change is motivated by the purpose of the computation, at other times by the fact that finding the best bijection is not entirely straightforward. Given a rotation $\rho$, we may use a subroutine ASSOCIATE, which determines for each $a \in A$ the point $b \in B$ closest to $\rho(a)$. In the algorithm, we replace MATCH by ASSOCIATE and do the remaining operations as before, except that $B$ is replaced by the multi-set of points in $B$ that are closest to some point in $\rho(A)$.

Bibliographic notes. The problem of finding the rotation that minimizes the root mean square distance between two point sets with given bijection in $\mathbb{R}^3$ has been studied in various fields, including x-ray crystallography [4] and computer vision [2]. In this section, we follow the exposition of the solution given by Horn [3]. For background on linear algebra and how to compute the eigenvalues and eigenvectors of a symmetric matrix, we refer to Strang [5]. The algorithm that attempts to minimize the root mean square distance between two point sets without specified bijection has also been described in several fields. In computer vision, the version that works with injections rather than bijections is known as the iterated closest point or ICP algorithm [1].

[1] P. J. BESL AND N. D. MCKAY. A method for registration of 3-D shapes. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-14 (1992), 239–256.
[2] O. D. FAUGERAS AND M. HEBERT. The representation, recognition, and locating of 3-D objects. Int. J. Robotics Res. 5 (1986), 27–52.
[3] B. K. P. HORN. Closed-form solution of absolute orientation using unit quaternions. J. Opt. Soc. Amer. A 4 (1987), 629–642.
[4] W. KABSCH. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallogr. Sect. A 34 (1978), 827–828.
[5] G. STRANG. Introduction to Linear Algebra. Wellesley-Cambridge Press, Wellesley, Massachusetts, 1993.
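
The optimum motion for a given bijection, as derived in this section, can be sketched in a few lines of plain Python. The code below is an illustration under stated assumptions, not the book's implementation: points are 3-tuples, the two lists are in corresponding order, and the function names (horn_matrix, top_eigenvector, optimum_motion) are made up for this sketch. In place of a library eigen-solver it uses power iteration on a shifted copy of the symmetric 4-by-4 matrix, which finds the eigenvector of the largest eigenvalue provided that eigenvalue is simple and the start vector is not orthogonal to it.

```python
import math

def qmul(q, p):
    """Quaternion product for 4-tuples (real, I, J, K)."""
    q0, q1, q2, q3 = q
    p0, p1, p2, p3 = p
    return (q0*p0 - q1*p1 - q2*p2 - q3*p3,
            q0*p1 + q1*p0 + q2*p3 - q3*p2,
            q0*p2 - q1*p3 + q2*p0 + q3*p1,
            q0*p3 + q1*p2 - q2*p1 + q3*p0)

def rotate(q, r):
    """Composite product q r q* applied to the 3-vector r."""
    x = qmul(qmul(q, (0.0, r[0], r[1], r[2])), (q[0], -q[1], -q[2], -q[3]))
    return (x[1], x[2], x[3])

def horn_matrix(A, B):
    """Sum over corresponding points of Abar^T B, for centered point lists."""
    N = [[0.0] * 4 for _ in range(4)]
    for (a1, a2, a3), (b1, b2, b3) in zip(A, B):
        Abar = [[0, -a1, -a2, -a3], [a1, 0, a3, -a2],
                [a2, -a3, 0, a1], [a3, a2, -a1, 0]]
        Bm = [[0, -b1, -b2, -b3], [b1, 0, -b3, b2],
              [b2, b3, 0, -b1], [b3, -b2, b1, 0]]
        for r in range(4):
            for c in range(4):
                N[r][c] += sum(Abar[k][r] * Bm[k][c] for k in range(4))
    return N

def top_eigenvector(N, iterations=2000):
    """Unit eigenvector of the largest eigenvalue by shifted power iteration."""
    # The shift bounds every |eigenvalue| (Gershgorin), so N + shift*E has
    # no negative eigenvalues and its dominant one belongs to the largest of N.
    shift = max(sum(abs(x) for x in row) for row in N)
    v = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        w = [sum(N[r][c] * v[c] for c in range(4)) + shift * v[r]
             for r in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return tuple(v)

def centroid(P):
    n = float(len(P))
    return tuple(sum(p[k] for p in P) / n for k in range(3))

def optimum_motion(A, B):
    """Unit quaternion q and translation t minimizing RMSD of q A q* + t and B."""
    ca, cb = centroid(A), centroid(B)
    A0 = [tuple(a[k] - ca[k] for k in range(3)) for a in A]
    B0 = [tuple(b[k] - cb[k] for k in range(3)) for b in B]
    q = top_eigenvector(horn_matrix(A0, B0))
    rca = rotate(q, ca)
    return q, tuple(cb[k] - rca[k] for k in range(3))
```

If B is an exact rotated and translated copy of A, then rotate(q, a) + t reproduces the corresponding point of B up to rounding. Either q or its negative may be returned, which does not matter since both represent the same rotation. A production version would replace the power iteration by a symmetric eigen-solver.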
VII.3 Sampling and Covering

In this section, we study two questions on rigid motions, namely how to sample uniformly at random and how to cover the space of motions most economically. We treat translations and rotations separately and spend most of our time on the more complicated case of rotations.

The size of a sphere. We prepare the discussion of sampling rotations by measuring the unit 2-sphere and the unit 3-sphere. For $\mathbb{S}^2$ embedded in $\mathbb{R}^3$, we sweep a plane normal to the $x_3$-axis and compute the area by integrating infinitesimal slices. The perimeter of the circle in which the plane at height $x$ cuts the sphere is equal to $2\pi r$, with $r^2 = 1 - x^2$ the square radius. The slice between heights $x$ and $x + dx$ has width $dx / r$ on the sphere. Hence,

$$ \mathrm{Area}(\mathbb{S}^2) = \int_{-1}^{1} 2\pi r \, \frac{dx}{r} = \int_{-1}^{1} 2\pi \, dx = 4\pi. $$

The total area of the 2-sphere is therefore $4\pi$. But note that the derivation shows more, namely that the area of the slice between two parallel planes at a constant distance is the same for all such planes, as long as both intersect the sphere. This fact has been known already to Archimedes and is often expressed by saying that the axial projection from the sphere to an enclosing cylinder preserves area. This projection is illustrated in Figure VII.7.

Figure VII.7: Illustration of Archimedes’ theorem implying that the sphere and the enclosing truncated cylinder have the same area.

We use the same method to compute the volume of $\mathbb{S}^3$ embedded in $\mathbb{R}^4$. Sweeping a three-dimensional hyperplane normal to the $x_0$-axis, we get the volume by integrating the infinitesimal slices. The area of a slice is $4\pi r^2$, with $r^2 = 1 - x^2$, as before. Hence,

$$ \mathrm{Vol}(\mathbb{S}^3) = \int_{-1}^{1} 4\pi r^2 \, \frac{dx}{r} = 4\pi \int_{-1}^{1} \sqrt{1 - x^2} \, dx = 4\pi \int_{-\pi/2}^{\pi/2} \cos^2 t \, dt = 2\pi^2, $$

which we get by substituting $x = \sin t$. The total volume of the 3-sphere is therefore $2\pi^2$. Note also that Archimedes’ theorem does not extend to the 3-sphere, at least not in the straightforward manner from sections between parallel planes to sections between parallel hyperplanes.

Uniform sampling. Archimedes’ theorem can be used to pick a point uniformly at random on $\mathbb{S}^2$. The method may be viewed as picking a point on the enclosing cylinder and projecting it back to the 2-sphere:

    Step 1. Pick $z$ uniformly at random in $[-1, 1]$.
    Step 2. Pick $\alpha$ uniformly at random in $[0, 2\pi)$.
            Define $r = \sqrt{1 - z^2}$.
            Return $u = (r \cos\alpha, r \sin\alpha, z)$.

We now extend this method to $\mathbb{S}^3$ and thus to an algorithm for picking a rotation uniformly at random. Think of $u$ as the axis of rotation, so we just need to pick the angle of rotation about this axis. It would not be correct to pick an angle uniformly at random since this would favor small dislocations. Indeed, in $\mathbb{S}^3$ the quaternions near the identity would be more likely than those far away from the identity. To pick the angle correctly, we return to what we learned from the above volume computation. The angle of rotation about the axis is twice the angular distance from the identity on $\mathbb{S}^3$. In other words, $\theta = 2 \arccos q_0$. We need to pick the angle from $[0, 2\pi]$, not uniformly but from a density that favors angles near the middle of the interval. Specifically, the density is $\sin^2 \frac{\theta}{2}$, normalized to have unit total integral. The corresponding distribution function is

$$ F(\theta) = \frac{1}{\pi} \int_0^{\theta} \sin^2 \frac{s}{2} \, ds = \frac{\theta - \sin\theta}{2\pi}, $$

which monotonically increases and reaches $1$ at $\theta = 2\pi$. To pick
an angle, we pick a number $x$ uniformly at random in $[0, 1]$, and we compute its preimage under the distribution function: $\theta = F^{-1}(x)$. To get a point uniformly at random on $\mathbb{S}^3$, we append Steps 1 and 2 above with

    Step 3. Pick $x$ uniformly at random in $[0, 1]$.
            Let $\theta = F^{-1}(x)$.
            Return $q = (\cos\frac{\theta}{2}, \sin\frac{\theta}{2} \, u)$.

We get a random rotation by using $q$ as a unit quaternion. Alternatively, we can express the random rotation in terms of Euler angles.

Covering the spaces of translations and rotations. We turn our attention to selecting a collection of rigid motions such that every possible motion has a selected motion nearby. It is convenient to measure the distance between translations and between rotations using the Euclidean metric. We will later analyze how these notions of distance relate to the effect of the motion on the root mean square distance between two sets in $\mathbb{R}^3$.

The idea of guaranteeing that every possible motion has a nearby selected motion can be expressed by covering the space of motions with neighborhoods. Consider first translations, which we represent by 3-vectors or points in $\mathbb{R}^3$. Let $\varepsilon > 0$ and let $\mathcal{B}$ be a collection of closed balls, all of radius $\varepsilon$. We call $\mathcal{B}$ a covering if $\bigcup \mathcal{B} = \mathbb{R}^3$ and we call $\varepsilon$ the covering radius. We need infinitely many balls just because $\mathbb{R}^3$ has infinite volume, but we are usually only interested in bounded portions of space. If we use the centers of the covering balls as selected translations, we are guaranteed that every translation $t$ has a selected translation at a distance at most $\varepsilon$ from $t$. We study three lattices of points in some detail. The cube lattice consists of all integer points, the FCC or face-centered cube lattice adds all centers of cube faces, and the BCC or body-centered cube lattice adds all cube centers to the cube lattice. Figure VII.8 shows the portion of each lattice inside a cube of unit side-length and Table VII.2 lists some of their pertinent properties. By counting fractions, we note that the FCC lattice has four times and the BCC lattice has twice as many points as the cube lattice. The packing radius is the largest radius we can assign to the points to get non-overlapping balls, and the volume is the fraction of the space covered by the packed balls. The covering radius is the smallest radius we can assign and still have the balls cover $\mathbb{R}^3$, and the volume is the total volume of the balls divided by the volume of the space they inhabit. We see that the FCC lattice leads to an effective packing while the BCC lattice leads to an effective covering. Indeed, both are known to be the respective best packing and covering lattices.

Figure VII.8: From left to right: the cube, the FCC and the BCC lattices. The points with maximum distance to the lattice points are the cube centers, the edge centers and the midpoints between the face and the edge centers.

                                  CUBE     FCC      BCC
    points per cube               1        4        2
    packing radius                0.500    0.353    0.433
    packing volume (fraction)     0.523    0.740    0.680
    covering radius               0.866    0.500    0.559
    covering volume (fraction)    2.720    2.094    1.463

Table VII.2: Numerical assessment of how well the cube, the FCC and the BCC lattices pack and cover.

As an exercise we may estimate the number of balls we need to cover the unit 3-sphere. Recall that its volume is $2\pi^2$. Assuming $\varepsilon$ is very small, the volume of a ball with radius $\varepsilon$ in $\mathbb{S}^3$ is about $\frac{4\pi}{3} \varepsilon^3$. If we believe that we cannot cover more economically than the BCC lattice in $\mathbb{R}^3$, we can use a straightforward volume argument to show that we need at least

$$ 1.463 \cdot \frac{2\pi^2}{\frac{4\pi}{3} \varepsilon^3} \approx \frac{6.89}{\varepsilon^3} $$

balls to cover the 3-sphere.

Sensitivity to small translations. Next, we address how translations affect the root mean square distance between two point sets. As in Section VII.2, let $A$ and $B$ be two collections of $n$ points in $\mathbb{R}^3$, with a bijection that maps $a_i$ to $b_i$, for each $i$. To simplify the analysis, we assume that the centroids of the two collections are both at the origin: $\bar{a} = \bar{b} = 0$. This implies that the vectors $a_i - b_i$ add up to 0 implying that the sum of scalar products with any vector $t$ vanishes: $\sum_i \langle a_i - b_i, t \rangle = 0$. Recall that the root mean square distance between $A$ and $B$ is the square root of the average square distance between corresponding points. After translating $A$ along $t$, the root mean square
VII.3 Sampling and Covering 111
distance is direction opposite to the rotation axis. It is geometrically
*
F * F obvious that the total distance increases the fastest when

each point moves in a direction straight away from .

This is possible in the limit and characterized by the veloc-

F
F FH*KF

ity vector of being parallel to , which includes
the possibility that . As for translations, the length

which implies *

F H*KF . We have * FH*KF if and
of the gradient is maximized if

for all . In this
for all . To measure how fast the root mean

only if
case, we have and the root mean square

distance between and the rotated copy of is

F F
square distance changes with varying translation vector,
we compute the the gradient:

A *
* * *

*
*

except at *

are the eigenvalues of the
everywhere
its length is F0A * F
where

The gradient is defined and
for all . Figure VII.9 illustrates this result

if
. The length is 1 if and only matrix defined in the previous section. For the purpose
gradient and its length, we consider a
of computing the
by comparing the graphs obtained for equal and for non- function over :
equal corresponding points. Since the length of the gra-
A * * * *

F2A F
*

of , we observe , that

Going back to the definition the
and for

eigenvalues are

, where is the radius of gyration of

projected

into the plane *
.
* we simplify the expressions for , its
Figure VII.9: The hyperboloid approaches the graph of the norm in . Note that

function at plus and minus infinity. Using
gradient and the length of the gradient:
dient never exceeds 1, the difference between the root * * *

A * * *
mean square distances for two translations is bounded

from above by the norm of the difference vector:
*

FH*
F F0A F

In words, the root mean square as a function over the Since the length of the gradient never exceeds
, the
three-dimensional space of translations satisfies a Lips- difference between the root mean square distance for two
chitz condition with constant 1. rotations is no more than that multiple of the norm of the
difference vector:
Sensitivity to small rotations. We repeat the analysis

FH 'F

for rotations. Call the root mean square distance from the
, the radii of gyration of

and are
centroid the radius of gyration. Since we assume We see that the rotations satisfy a Lipschitz condition that
is similar to that for translations, except that the constant

F F
FF now depends on the collection of points, in particular to

their radii of gyration.
and
Let

* :* * :*

be a unit quaternion. The effect of the rotation represented by this quaternion is best viewed in the

Bibliographic notes. The problem of sampling motions has been studied in various fields, including statistics, crystallography and molecular modeling. Various methods for picking a rotation uniformly at random have been published but not all are correct. In particular, it is important to notice that first picking a rotation axis and second a rotation angle favors quaternions close to the identity if we pick the angle uniformly at random. A popular method that is correct and different from the one described in this section is due to Marsaglia [4] and is reproduced in the exercise section of this chapter.
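The rejection method attributed to Marsaglia [4] can be sketched as follows. This is a minimal illustration, not the author's code: the function name `random_unit_quaternion` and the `rng` parameter are hypothetical, and the two pairs of coordinates are rejected independently, which is one common way of stating the method.

```python
import math
import random


def random_unit_quaternion(rng=random):
    """Return a point (x1, x2, x3, x4) on the unit 3-sphere (a unit quaternion)."""
    # Pick (x1, x2) uniformly in the open unit disk by rejection.
    while True:
        x1, x2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        s1 = x1 * x1 + x2 * x2
        if s1 < 1.0:
            break
    # Pick (x3, x4) the same way; s2 must be positive because we divide by it.
    while True:
        x3, x4 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        s2 = x3 * x3 + x4 * x4
        if 0.0 < s2 < 1.0:
            break
    # Rescale the second pair so that the four squares sum to one.
    scale = math.sqrt((1.0 - s1) / s2)
    return (x1, x2, x3 * scale, x4 * scale)
```

The squared norm of the result is s1 + s2 (1 - s1) / s2 = 1 by construction; that the distribution is uniform on the 3-sphere is the claim to be proved in the exercise section.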
Packing and covering problems have been studied within mathematics and have generated a large body of literature [2, 3]. Surprisingly, many of the main questions in this area are still open. For example, it is not known whether or not the BCC lattice is the most economical covering of $\mathbb{R}^3$ with congruent balls. Very little is known about optimal packings and coverings in non-Euclidean spaces. The problem is challenging even in the relatively simple case of the 2-sphere, and for most numbers of points (or caps) only approximate solutions are known [1].
[1] J. Berman and K. Hanes. Optimizing the arrangement of points on the unit sphere. Math. Comput. 31 (1977), 1006–1008.
[2] J. Conway and N. Sloane. Sphere Packings, Lattices and Groups. Springer-Verlag, New York, 1988.
[3] L. Fejes Tóth. Lagerungen in der Ebene, auf der Kugel und im Raum. Second edition, Springer-Verlag, New York, 1972.
[4] G. Marsaglia. Choosing a point from the surface of a sphere. Ann. Math. Stat. 43 (1972), 645–646.
VII.4 Alignment

In this section, we briefly discuss the two problems of match and fit for protein structures. We begin by studying how to match proteins and develop an algorithm that measures the similarity between two chains of atoms. Thereafter, we consider the related problem of docking a protein with its substrate.

Longest common subsequence. Consider first the combinatorial (as opposed to geometric) version of the sequence alignment problem. We model a protein as a string over the alphabet of twenty amino acids: $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$. An alignment maps the $a_i$ to the $b_j$ in sequence, but it permits spaces on both sides. As illustrated in Table VII.3, we represent an alignment by a matrix consisting of two rows and $(m+n+k)/2$ columns, where $k$ is the total number of spaces. A match is a column of two equal non-space characters, and a mismatch is a column with two different non-space characters. Columns with two spaces are disallowed. An insertion is a column with a space at the top and a deletion is a column with a space at the bottom. The common subsequence between two strings consists of all matches, and its length is the number of matches. For the moment, we restrict ourselves to alignments without mismatches. Letting $\ell$ be the length of the longest common subsequence, $m + n - 2\ell$ is the minimum number of insertions and deletions needed to transform $A$ to $B$.

Q R A A C C
A Q A C R C R

Table VII.3: The alignment uses spaces to achieve matches.

We compute $\ell$ by dynamic programming. Let $\ell_{i,j}$ be the length of the longest common subsequence of $a_1 a_2 \ldots a_i$ and $b_1 b_2 \ldots b_j$, and define $\ell_{i,0} = \ell_{0,j} = 0$ for all $i$ and $j$. Then

$$\ell_{i,j} = \begin{cases} \ell_{i-1,j-1} + 1 & \text{if } a_i = b_j, \\ \max\{\ell_{i-1,j}, \ell_{i,j-1}\} & \text{if } a_i \neq b_j. \end{cases}$$

To verify the recurrence relation note that every alignment ends with an insertion, a deletion or a match. In each case, removing the last column leaves an optimal alignment of shorter strings. In the third case, we need to show that the length of the common subsequence cannot increase if we do not use the match between $a_i$ and $b_j$. Indeed, without using that match we end with an insertion or a deletion, and we may move the last match to the end without decreasing the length. We turn the recurrence relation into an algorithm:

    integer LCS
      for $i = 0$ to $m$ do $\ell_{i,0} = 0$ endfor;
      for $j = 0$ to $n$ do $\ell_{0,j} = 0$ endfor;
      for $i = 1$ to $m$ do
        for $j = 1$ to $n$ do
          if $a_i = b_j$ then $\ell_{i,j} = \ell_{i-1,j-1} + 1$
          else $\ell_{i,j} = \max\{\ell_{i-1,j}, \ell_{i,j-1}\}$
          endif
        endfor
      endfor; return $\ell_{m,n}$.

This algorithm is a typical example of the dynamic programming paradigm, which constructs an optimal solution from pre-computed optimal solutions to sub-problems. To store the solutions, the algorithm uses an array of $(m+1)(n+1)$ entries. Each entry takes constant time, which implies that the total running time is proportional to $mn$. Using a second array of the same size, we may keep track of the decisions made by the algorithm, and with this extra information, we can reconstruct the longest common subsequence itself, and not just compute its length.

Sequence alignment. The general alignment problem permits mismatches and assesses the score by rewarding each match and penalizing each mismatch, insertion and deletion. Assuming a score is given for having any two characters, or a character and a space, in a single column, we get a recurrence relation analogous to the one above, in which the maximum is taken over the three possible types of last column.

Figure VII.10: The edit graph for the strings in the above example and the path that corresponds to the given alignment.

We can think of every alignment as a directed path in the so-called edit graph of the two strings, which we illustrate in Figure VII.10. The path starts at the source in the upper left corner, takes vertical, horizontal and diagonal edges and ends at the sink in the lower right corner. A gap in

the alignment is a sequence of contiguous insertions or of contiguous deletions. It is common to penalize a gap separately for its existence and an additional amount that depends on its length. This may be done by penalizing an insertion or deletion by one amount when it starts a gap and by another amount when it continues a gap. This gives rise to recurrence relations for three quantities per pair of prefixes: the score of the best alignment, the score of the best alignment that ends with an insertion, and the score of the best alignment that ends with a deletion. Using three arrays, we can again compute the best alignment with dynamic programming in time proportional to $mn$.

Chains of atoms. We can use the same algorithmic ideas to compute alignments between two sequences of atoms. Let the $a_i$ and the $b_j$ be the centers of the $\alpha$-carbon atoms along the backbones of two proteins. For now, we assume a fixed embedding in $\mathbb{R}^3$ and consider the alignment problem without applying any rigid motion. Using the root mean square distance between two sub-chains is problematic for two reasons. First, it does not lend itself to the dynamic programming algorithm and, second, it prefers shorter over longer sub-chains. Instead, we need a score function that balances the contributions of length and distance. One such function is obtained by combining square distances with gap penalties as follows. Letting $C$ and $D$ be positive constants, we reward a match between $a_i$ and $b_j$ by adding

$$\frac{C}{1 + \|a_i - b_j\|^2 / D} \qquad \text{(VII.1)}$$

to the score, and we penalize for gaps as before. The dynamic programming algorithm can still be used to identify the best in a collection of exponentially many alignments. It does this in time proportional to $mn$.

Next, we permit a rigid motion to be applied to one of the chains, say $B$. Instead of computing the best motion for each alignment, we compute the best alignment for each of a dense sample of motions. We need some notation to formalize this idea. For each alignment between $A$ and $B$, we get a function that maps a rigid motion $\mu$ to the score between $A$ and $\mu(B)$. Consider the function $\Gamma$ defined as the motion-wise maximum of all these functions. This construction is illustrated in Figure VII.11. Let $\mu_0$ be the motion that maximizes $\Gamma$. The score of the best alignment between $A$ and $B$ is then $\Gamma(\mu_0)$, and the best alignment is one whose function attains this value at $\mu_0$.

Figure VII.11: The horizontal axis represents the six-dimensional space of rigid motions. The upper envelope of the graphs is the motion-wise maximum of the score functions.

The idea of the algorithm is to sample the space of motions densely enough to guarantee an alignment with a score at least $\Gamma(\mu_0) - \varepsilon$, for some $\varepsilon > 0$. We thus aim at computing an approximately best alignment, but we may decrease $\varepsilon$ and thus get arbitrarily close to the optimum. This strategy makes sense in practice since in any case the locations of atoms are only known up to some precision.

Running time. Improving the approximation by decreasing $\varepsilon$ comes with a cost, namely higher running time because we evaluate $\Gamma$ for more rigid motions. We quantify the dependence by analyzing the running time depending on $\varepsilon$. The other parameters entering the analysis are $m$ and $n$, the lengths of the chains, the radii of the smallest spheres enclosing $A$ and $B$, and the radii of gyration of the two sets. Proteins tend to have globular shapes packing their atoms around their centroids. We may therefore assume that the radii of the smallest enclosing spheres and the radii of gyration are all roughly equal to $n^{1/3}$, where we further simplify the discussion by assuming $m = n$. To decide how densely we have to cover the space of rigid motions, we determine the sensitivity of the score function to small motions. We first consider translations. Ignoring penalties for gaps, the score is a sum of terms of the form (VII.1), one for each match, where the number of terms is the length of the alignment and the points are re-indexed so that $a_i$ maps to $b_i$. The norm of the gradient of a single term in this sum is bounded by a constant $c$, and hence the norm of the gradient of the score is at most $c$ times the number of terms, which is no more than $cn$.

We cover the space of translations with balls of radius $\varepsilon / (2cn)$, where $c$ is the constant that bounds the gradient of a single term of the score. It follows that having a translation that is not quite the optimum contributes at most $\varepsilon / 2$ to the error. The sensitivity of the score to small rotations depends on the radii of gyration, and we get a gradient whose length is at most a constant times $n \cdot n^{1/3}$. By covering the space of rotations with balls of radius proportional to $\varepsilon / n^{4/3}$, we get again a contribution of at most $\varepsilon / 2$ to the error.

By assumption on the shape of the protein, the volume of translations we need to cover is proportional to $n$, and the volume of the rotations is a constant. In each case, we need a constant times $n^4 / \varepsilon^3$ balls. We cover the space of rigid motions by cross-products of these balls and thus get a constant times $n^8 / \varepsilon^6$ rigid motions. Multiplying this with the running time of the dynamic programming algorithm gives a total running time proportional to $n^{10} / \varepsilon^6$. This is of course not practical and we need faster alternatives, some of which will be mentioned at the end of this section.

Protein re-docking. In protein docking, the basic question is how well a protein and its substrate fit to each other. The substrate could be another protein or a small ligand. We interpret this question as asking how similar the substrate is to a portion of the complement of the protein. This question makes sense if we use space-filling representations of the protein and the ligand, but not if we represent them combinatorially or as chains of points in space. This idea is illustrated in Figure VII.12. For protein-protein interactions, the region of local complementarity is frequently fairly large. The geometric fit between the two proteins thus becomes a significant factor in making the interaction possible or, more accurately, in not making that interaction impossible. Instead of protein docking, we consider the simpler re-docking problem. Here we are given the complexed form of a protein and its substrate and we attempt to reconstruct that form while suppressing any knowledge of the solution.

Figure VII.12: The shaded local complement of the left shape is similar to the shaded portion of the right shape.

We first lay out the rules for this problem. We are given the protein and the substrate in complexed form, together with a copy of the protein obtained by applying a random rigid motion. The input to the reconstruction algorithm consists of the moved copy of the protein and the substrate, and not knowing the solution means we cannot use any information on the complexed position of the protein and on the random motion. The goal is to find a rigid motion $\mu$ such that the image of the moved protein under $\mu$ and the substrate fit well. After $\mu$ is computed, we can test how well we did by comparing this image with the protein in complexed form, which can be done directly or by computing root mean square distances between corresponding points.

We cannot use the root mean square distance to guide our reconstruction of the complexed form and thus need a score function that assesses how well a motion does in generating a good fit. There are many possibilities, and one is the approximation of the van der Waals potential by counting the pairs of spheres at small distance from each other. We think of the points as the centers of spheres with van der Waals radii and consider pairs formed by one sphere of the protein and one sphere of the substrate. A pair collides if the distance between the centers is less than the sum of the two radii, and it is close if that distance exceeds the sum of the radii by at most a small positive constant. As mentioned in Section I.4, the van der Waals force is weakly attractive within small distances of maybe up to four Angstrom, and it is strongly repulsive for colliding van der Waals spheres. We thus define the score of a motion as the number of close pairs if there are no colliding pairs, and as zero if there are.

Given a rigid motion $\mu$, we compute its score by comparing all pairs of spheres in time proportional to $mn$. Improvements of the running time are possible. Experiments show that this score function is a good indicator of good fit, but one weakness is its sensitivity to collisions. Actual proteins are flexible and can avoid minor collisions by small deformations. We may account for this fact by allowing a few collisions in the definition of the score, but to get a good approximation of the reality, we will need to build knowledge about flexibility into the score function.

Analysis. The general algorithm for re-docking is similar to the one for geometric alignment: we explore the space of rigid motions and evaluate the score function at the centers of the balls used to cover the space. By choosing the balls in the cover small enough, we can guarantee that, for at least one explored motion, the root mean square distances used in the test above are less than some
threshold $\varepsilon$. Note that this does not necessarily imply that the score of that motion is large. Indeed, it could be zero because motions with high score value tend to be right next to motions that generate collisions. In other words, whether or not the algorithm recognizes a motion as close to the solution depends on the shape of the score function in this neighborhood. We can design cases in which the score function has arbitrarily narrow high spikes and our algorithm has little chance to ever recover the complexed form. There is, however, experimental evidence that such configurations either do not exist or are rare for actual proteins.

Let us return to the question of how to cover the space of motions to guarantee a root mean square distance of at most $\varepsilon$. As before, we simplify the analysis by setting $m = n$ and assuming that the radii of the smallest enclosing spheres and the radii of gyration are all roughly equal to $n^{1/3}$. According to the sensitivity analysis in the previous section, we may cover the space of translations with balls of radius about $\varepsilon$ and the space of rotations with balls of radius about $\varepsilon / n^{1/3}$, the divisor being the radius of gyration of either set. For the translations, we need to cover a volume of about $n$, requiring about $n / \varepsilon^3$ balls. For the rotations, we need to cover a constant volume, also requiring about $n / \varepsilon^3$ balls. The total number of rigid motions to be explored is thus proportional to $n^2 / \varepsilon^6$, and multiplying this with the quadratic running time for evaluating the score function, we get a total running time proportional to $n^4 / \varepsilon^6$. An improvement is possible if we compute the score for all translations composed with a single rotation in one sweep. Since $n$ is typically in the thousands, even the improved running time is not practical and we need faster alternatives.

Bibliographic notes. The structural alignment problem refers to comparing the backbones modeled as curves or chains of spheres in three-dimensional space. Its importance within structural molecular biology derives from the observation that evolution preserves structure better than amino acid sequences. Among other things, research on this problem has led to the creation of structural databases [6, 8]. There are two main computational approaches to structural alignment: one represents a chain by its matrix of internal distances [5] and the other uses rigid motions to align the chains embedded in space [9]. In this section, we have followed the second approach and presented the work of Kolodny and Linial [7], who explore rigid motions in the outer loop and optimal alignments using dynamic programming [3] in the inner loop of their algorithm. The particular score function given in Equation (VII.1), with specific values for its two constants, was suggested in [9]. It should be mentioned that the presented algorithm is significantly slower than the currently most commonly used DALI software [5], but it is the only algorithm that guarantees a good approximation of the optimal alignment in polynomial time.

The goal of protein docking is the prediction of whether, where and how proteins interact with each other and with other molecules. In many cases, the surface area of the interface during the interaction is substantial, and in these cases the geometric fit is an important factor. However, there are cases with smaller interaction area in which forces unrelated to geometric shape outweigh the importance of shape [2]. We refer to [4] for a recent survey of the extensive literature on computational approaches to protein docking. The material in this section is based on the work described in [1].

[1] S. Bespamyatnikh, V. Choi, H. Edelsbrunner and J. Rudolph. Protein docking by exhaustive search. Manuscript, Duke Univ., Durham, North Carolina, 2003.
[2] A. H. Elcock, D. Sept and J. A. McCammon. Computer simulation of protein-protein interactions. J. Phys. Chem. B 105 (2001), 1504–1518.
[3] D. Gusfield. Algorithms on Strings, Trees, and Sequences. Cambridge Univ. Press, England, 1997.
[4] I. Halperin, B. Mao, H. Wolfson and R. Nussinov. Principles of docking: an overview of search algorithms and a guide to scoring functions. Proteins 47 (2002), 409–443.
[5] L. Holm and C. Sander. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233 (1993), 123–138.
[6] L. Holm and C. Sander. The FSSP database of structurally aligned protein fold families. Nucleic Acid Res. 22 (1994), 3600–3609.
[7] R. Kolodny and N. Linial. Approximate protein structural alignment in polynomial time. Manuscript, Stanford Univ., Stanford, California, 2002.
[8] A. G. Murzin, S. E. Brenner, T. Hubbard and C. Chothia. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247 (1995), 536–540.
[9] S. Subbiah, D. V. Laurents and M. Levitt. Structural similarity of DNA-binding domains of bacteriophage repressors and the globin core. Current Biol. 3 (1993), 141–148.
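As a companion to the dynamic programming described in this section, the longest common subsequence computation can be sketched as follows. This is an illustrative sketch only: the function name `lcs` is hypothetical, strings are indexed from zero as usual in Python, and the subsequence is recovered by re-reading the table during the traceback rather than by storing the decisions in a second array.

```python
def lcs(a, b):
    """Return (length, one longest common subsequence) of the strings a and b."""
    m, n = len(a), len(b)
    # length[i][j] is the LCS length of a[:i] and b[:j]; row 0 and column 0 stay 0.
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    # Trace back from the lower right corner to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif length[i - 1][j] >= length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return length[m][n], "".join(reversed(out))
```

For the two strings of Table VII.3, `lcs("QRAACC", "AQACRCR")` reports length 4, so 6 + 7 - 2 · 4 = 5 insertions and deletions suffice to transform one string into the other.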

Exercises

1. Reflections. The reflection through a plane maps every point $x$ to the point $x'$ such that the plane crosses the line segment $xx'$ orthogonally at its midpoint. The central reflection maps every point $x$ to its antipodal point $-x$.
   (i) Show that every rigid motion is the composition of two plane reflections.
   (ii) How many plane reflections do you need to represent the central reflection?

2. Sizes of spheres. The $d$-dimensional unit sphere consists of all points at unit distance from the origin of the $(d+1)$-dimensional Euclidean space: $S^d = \{x \in \mathbb{R}^{d+1} \mid \|x\| = 1\}$. We know that the perimeter of $S^1$ is $2\pi$, the area of $S^2$ is $4\pi$ and the volume of $S^3$ is $2\pi^2$. What is the $d$-dimensional volume of $S^d$?

3. Square distance from planes. The square distance of a point $x$ from a point $p$, $\|x - p\|^2$, is also the sum of square distances of $x$ from the three planes parallel to the coordinate planes that pass through $p$.
   (i) Show that the above claim holds for any three planes that pass through $p$ and pairwise enclose a right angle.
   (ii) Are there triplets of planes enclosing non-right angles for which $\|x - p\|^2$ is equal to the sum of square distances from $x$ to the three planes?

4. Sum of square distances. Consider a collection of $n$ points $p_i$ in $\mathbb{R}^3$ and let $c$ be its centroid.
   (i) Prove that for every point $x$ in space, the root mean square distance to the $p_i$ is the root of the square distance to the centroid plus a constant:
       $$\sqrt{\frac{1}{n} \sum_{i=1}^{n} \|x - p_i\|^2} = \sqrt{\|x - c\|^2 + \text{const}}.$$
       What exactly is the constant?
   (ii) Extend the construction to a collection of $n$ planes in $\mathbb{R}^3$. In other words, prove that there are three planes for which a similar formula gives the sum of square distances to the $n$ planes.
   (iii) Further extend the construction to a collection of lines in $\mathbb{R}^3$.

5. Biased probability. Suppose Function UNIFORM picks a real number uniformly at random in $[0, 1]$.
   (i) Show that the minimum of two numbers picked by Function UNIFORM is distributed according to the triangle density function $f(x) = 2 - 2x$.
   (ii) How are the minimum, the median and the maximum of three numbers picked by Function UNIFORM distributed?

6. Sampling the 3-sphere. Prove that the following method picks a point uniformly at random on $S^3$:
   (i) Pick numbers $x_1, x_2, x_3$ and $x_4$ uniformly at random in $[-1, 1]$.
   (ii) If $x_1^2 + x_2^2 \ge 1$ or $x_3^2 + x_4^2 \ge 1$ then repeat Step (i), else let $s_1 = x_1^2 + x_2^2$ and $s_2 = x_3^2 + x_4^2$ and return $(x_1, x_2, x_3 \sqrt{(1 - s_1)/s_2}, x_4 \sqrt{(1 - s_1)/s_2})$.

7. Random rotation. Let us mark a point $x$ on the unit 2-sphere. For a rotation $\rho$, let $\rho(x)$ be the image of $x$ under that rotation. Any density function over the space of rotations implies a density function over the 2-sphere. Prove that the uniform density of quaternions over $S^3$ implies the uniform density of points over the 2-sphere.

8. Number of alignments. Recall that an alignment between two chains of $m$ and $n$ $\alpha$-carbon atoms that uses $k$ spaces can be represented by a matrix with two rows and $(m+n+k)/2$ columns. Assuming $m \le n$, we define $d = n - m$ and note that we need $d$ insertions just to make up for the difference in length. The remaining spaces are distributed over equally many insertions and deletions, so we define $e = (k - d)/2$.
   (i) Show that $e$ being an integer with $0 \le e \le m$ is a necessary and sufficient condition for $k$ to be the number of spaces in an alignment of the two chains.
   (ii) What is the number of different alignments with a fixed number of spaces?
   (iii) What is the total number of different alignments?
Chapter VIII
Deformation
VIII.1 Molecular Dynamics

VIII.2 Spheres in Motion
VIII.3 Rigidity
VIII.4 Shape Space
Exercises
VIII.1 Molecular Dynamics

Newton’s second law. [
.]
Numerical integration. [Taylor expansion, different nu-

merical methods (Euler, Verlet, leap-frog, Beeman,
predictor-corrector).]
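As one concrete instance of the integration schemes listed above, the velocity form of the Verlet method can be sketched as follows. This is a minimal one-dimensional illustration under stated assumptions: the function name `velocity_verlet` and its parameters are hypothetical, `force` is any function of the position, and the step size `dt` is fixed.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate Newton's second law, mass * acceleration = force(x), for `steps` time steps."""
    a = force(x) / mass
    for _ in range(steps):
        # Position update from a second-order Taylor expansion.
        x = x + v * dt + 0.5 * a * dt * dt
        # Velocity update with the average of the old and new accelerations.
        a_new = force(x) / mass
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return x, v
```

For a unit mass on a unit spring, `velocity_verlet(1.0, 0.0, lambda x: -x, 1.0, 0.001, 1000)` stays close to the exact solution, position cos(1) and velocity -sin(1), at time 1.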
Hydrophobic surface area. [Weighted area and deriva-

tive (forward pointer to Chapter IX).]
Kinetic data structures. [Close neighbor lists, Delaunay

triangulation or dual complex (forward pointer to Section
VIII.2 and IX).]
VIII.2 Spheres in Motion

[Explain the slack in the Pie Volume Formula (with a for-
ward pointer to Chapter IX.)] [This topic relates to the pos-
sibility of drawing non-straight Voronoi like decompositions
[2].] [Define cross-sections of the complex of independent
simplices and prove that each cross-section gives a different
pie formula but the same measurement.]

[Dynamic Delaunay triangulations [3]. Linear motion in
instead of .]
[Predict collisions of spheres.]
[1] J. Basch, L. J. Guibas and L. Zhang. Proximity problems on moving points. In “Proc. 13th Ann. Sympos. Comput. Geom., 1997”, 344–351.
[2] H. Edelsbrunner and E. A. Ramos. Inclusion-exclusion complexes for pseudodisk collections. Discrete Comput. Geom. 17 (1997), 287–306.
[3] M. A. Facello. Geometric techniques for molecular shape analysis. Ph. D. thesis, Report UIUCDCS-R-96-1967, Dept. Comput. Sci., Univ. Illinois, Urbana, Illinois, 1996.
VIII.3 Rigidity
[Discuss the pebble algorithm that analyzes the rigidity of
a graph in three dimensions.]

VIII.4 Shape Space

[Explain the mixing of two or more shapes as a generalization of 1-parametrized deformation. The problems of

(1) finding a good basis,
(2) finding the best approximation within the spanned space,

are both difficult. They are similar to fundamental questions on function representation, which are probably discussed in the approximation theory literature.]

The main functionality of the Morfi software is that it can smoothly morph from one skin curve to another. In other words, it deforms the skin of one set of circles to the skin of another. The details of this deformation will be explained in Section VIII.4, where we discuss notions of similarity between two molecular skins. In this section, we merely illustrate the deformation and mention some of its features in passing. Figure VIII.1 shows the deformation of a skin curve defined by four circles into one defined by three circles. For each snapshot, we show the skin curve together with the dual complex. We note that any two contiguous bodies, except the last three in the sequence, differ by at least one change in homotopy type. Recall that the homotopy types of the body and the dual complex are always the same, which implies that they change their type the same way and at the same time. For the complex we observe two types of changes caused by adding an edge or a triangle. The corresponding changes in the body are caused by creating a handle or filling a hole. There is a third type of change not seen in Figure VIII.1, which in the complex is caused by adding a vertex and in the body by creating a component.

Bibliographic notes. The Morfi software has been used in [2] to explain two-dimensional skin geometry and to illustrate its use in deforming two-dimensional shapes into each other. We note that these deformations are similar to but also different from the image morphs studied in computer graphics [3]. The goal there is photo realism and possibly the most difficult problem towards achieving it is the construction of a one-to-one correspondence between features of the initial and the final images. The Morfi software creates a few-to-few correspondence through geometric considerations rather than working towards a one-to-one correspondence, which often does not exist. Similar to two dimensions, we can deform skin surfaces into each other by continuously changing the defining spheres. A canonical such method is explained in [1]. That method can be used to mix several skin surfaces and thus create a shape space that encompasses multi-variate deformations.

[1] H.-L. Cheng, P. Fu and H. Edelsbrunner. Shape space from deformation. Comput. Geom. Theory Appl. 19 (2001), 191–204.
[2] S.-W. Cheng, H. Edelsbrunner, P. Fu and K. P. Lam. Design and analysis of planar shape deformation. Comput. Geom. Theory Appl. 19 (2001), 205–218.
[3] G. Wolberg. Recent advances in image morphing. In “Proc. Comput. Graphics Internat., 1996”, 64–71.
Figure VIII.1: Ten snapshots of a deformation with skin and dual complex displayed. The skin in the fifth snapshot is the same as in the
figures above.
Figure VIII.2: From left to right and top to bottom: the shapes at times . The sequence is defined by a set of seven

spheres forming a question mark at time and a set of eight spheres forming a human-like figure at time .

Exercises


@

is tight for every

.
Chapter IX
Measures
There are various reasons why biologists want to measure the size of molecules. Volume is important in the calculation of free energy and in estimates of populations given a bound on the available space. Surface area is a resource consumed by molecular interactions and is probably even more relevant to research in structural biology than volume. This chapter will study three aspects of size: volume, surface area, and arc length for such diagrams. Our general approach to measuring the size begins with indicator functions for convex polyhedra. From these we will derive short inclusion-exclusion formulas for size measurements.
IX.1 Indicator functions

IX.2 Volume and surface area
IX.3 Void formulas
IX.4 Measuring Software
Exercises

IX.1 Indicator Functions Below we will construct indicator functions of from Eu-
ler characteristics of subcomplexes of the boundary com-
The Euler relation for convex polyhedra is a special case plex. The Euler relation will follow from elementary
of the Euler-Poincaré theorem for complexes. There are proofs of properties of these indicator functions.
elementary proofs for this special case, and this section
Inclusion-exclusion. Let be the finite collection of
presents one that is inductive.
. For a subset

half-spaces such that
and a point we define
*
Convex polyhedra. A convex polyhedron is the inter-
section of finitely many closed half-spaces. It is either
bounded or unbounded, and both cases are illustrated in

if
Figure IX.1. In the first case, the polyhedron is the convex otherwise.

hull of finitely many points, and in the second, it extends
Note that is outside iff

to infinity. We study polyhedra in -dimensional space, for at least one non-

keeping in mind that
sion since polyhedra in
is the most important dimen-
relate to molecules in

, as
zero subset . Namely if
the outside and we have

then it sees a facet from
for the singleton

we will see later. set containing the half-space whose bounding hyperplane

contains that facet.
We form an alternating sum of the that leads to
an indicator function for the convex polyhedron. The
straightforward way of doing this is called the principle
of inclusion-exclusion. Particularly, we define

.

(IX.1)
Figure IX.1: A bounded convex polyhedron in to the left and

an unbounded one to the right.
set for which

The sum ranges over all subsets of , including the empty
for all points . We show that the
Let be a convex polyhedron in and assume it has

non-zero terms cancel unless there is only one non-zero

non-empty interior. A hyperplane supports if it in-

the boundary but not the interior,

contribution to the sum, which comes from the empty set.

tersects and
To see this define and
.
, which is the alternating sum

. A face of is the intersection with a
Note that
supporting hyperplane. The boundary is decomposed into

#
of subsets of . This sum is
faces of various dimensions, which are usually prefixed
for clarity. For example,

is a -face of itself and the
facets are the -faces. Let

be the number of -
%

faces. The Euler characteristic of is the alternating sum
of faces,
provided

. For we get and

. In words,
is an indicator function for

,

if

In the bounded case, the boundary is a -dimensional if

topological

sphere whose only non-zero Betti numbers are
. In the unbounded case, the boundary is
Truncation. Most of the terms in the exponentially long
an open -dimensional

topological ball whose only
non-zero Betti number is . Assuming general po- formula (IX.1) are redundant and can be removed. Specif-
ically, we only keep the terms that correspond to faces of
sition, the dual of the boundary complex is a simplicial
complex and the Euler-Poincaré Theorem stated in Sec- . Each face is the intersection of the polyhedron with a
subset of the hyperplanes bounding half-spaces in ,

tion IV.3 implies the Euler relation for convex polyhedra:

if

is bounded

if is unbounded

For we get

, which we consider an im-

ones crossing the hyperplane shared by and , and the

proper face but still a face of . It is convenient to assume
, where
ones contained in . The corresponding systems form the
- -

general position, which in this context means that there partition

are no two subsets of that define the same face.

Let be the system

of subsets that define non-empty faces. For sets

there is an intuitive interpretation of . Consider

and
visible from if sees all facets around from outside

Note that

. The faces of are defined by sets in

. Notice that according to this definition,
the faces on
the silhouette are not visible. Then iff is , , , and the faces of are defined by sets in ,
visible from . The restriction of the inclusion-exclusion
, , where

formula (IX.1) to the system is

.

(IX.2)

We claim that even though

is much shorter than

, it The introduced systems partition , , and . We can
therefore write their values as sums of values of the
is still an indicator function of . This claim is sufficiently
subsystems,

important to warrant a complete proof.

P IE T HEOREM A.

if

if
and hence . We
argue that all three terms on the right side of the equation
P ROOF. We use induction over the cardinality of the set

for vanish. Both and have one less half-space

, which is again defined as the collection of half-spaces
not containing than does. The induction hypothesis
that do not contain . The basis of the induc-
tion is covered by

, in which case and

thus applies. By assumption,
iff

, which implies that
and therefore

.

, as required. Assume , let
The second term vanishes because all sets in con-
, and define as the closed complement of , which
iff

tain . The third term vanishes because
. We have
is a half-space that contains . Define sets of half-spaces

and . The correspond-
and

. The

for all

ing systems are
convex polyhedron

is obtained by remov-
. Therefore
cancel pairwise.
because the values

ing the constraint , and therefore
, as shown in Figure IX.2. We distinguish
, where
Unbounded convex polyhedra. The Pie Theorem A
implies the Euler relation for unbounded polyhedra. To
_ see this, we fix a point outside all half-spaces in , as in
g
Figure IX.3, and rewrite the formula in the Pie Theorem
P ’’ g
P
y
Figure IX.2: The half-spaces and share the hyperplane and Figure IX.3: The point lies in the intersection of the comple-
are complementary to each other. The union of and is . ments of the half-spaces.
three types of faces of

, the ones contained in , the A in terms of face numbers
. By assumption of general
D

position, is the number of sets with

cardinality Bounded convex polyhedra. We return to the compu-

. By the choice of , we have
and therefore

for all

tation of the Euler characteristic, this time for a bounded

convex polyhedron . We choose a line not parallel to any

polyhedra,

. This implies the Euler relation for unbounded convex
.
face of and points and sufficiently far in opposite di-
rections on the line. As illustrated in Figure IX.5, this par-
Restricting body. We need a slightly stronger version of Z Y

the Pie Theorem A to prove the Euler relation for bounded
y z
convex polyhedra. We first weaken the theorem by re-
stricting the points to lie within a convex body , and

then strengthen
it by further reducing the set system. De-

fine

let be the corresponding
and

sum of values. We show
Figure IX.5: The boundary of
is dotted, that of
is solid,
that for points

, is an indicator function for .

and the silhouette is indicated by the two hollow vertices.
P IE T HEOREM B. titions into the set of half-spaces that do not contain

and the set of half-spaces that do not contain . Each

if
proper face of either belongs to or to or to

if
the silhouette as seen in a view parallel to the chosen line.

Let be the number of -faces of that have non-
empty intersection with the interior of , and define

P ROOF. We construct

that contains

and

a convex polyhedron
approximates

in the sense that symmetrically. Let be the number of -faces in the sil-
, as in Figure IX.4. Define
houette. The projection of the silhouette onto a hyperplane
normal to the line is a bounded convex polyhedron

of dimension . We can now argue inductively that the

Euler characteristic of is

.

P
For , is a closed interval with
establishes the induction basis. For

we have

, which
A

PA

Observe that this sum counts the -face the same num-
Figure IX.4: Three edges and one vertex of intersects the in-

terior of , and the same edges and vertex intersect the interior

ber of times on both sides. On the right side it is counted

times, same as on the
of .

left side. We get

and use the Pie Theorem A to get

if

if
by the Pie Theorem B, using the respective other convex
By choice of , every point

polyhedron as the restricting convex body . Further-

is contained in all

half-spaces of . Hence
if
.

more,

The system contains exactly all sets
. Hence for

for which

all points
and therefore also for all points .

the , , and implies
by induction hypothesis. Adding

the alternating

sums of
, as re-
quired.
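The Euler relation can be checked numerically on small examples. The sketch below assumes the convention that the polyhedron itself is counted as its single improper $d$-face, so that the alternating sum of face numbers is 1 in the bounded case and 0 in the unbounded case. The function name and the example face vectors are illustrative choices, not taken from the text.

```python
def euler_sum(face_numbers):
    """Alternating sum f_0 - f_1 + f_2 - ... of the face numbers f_0, ..., f_d."""
    return sum((-1) ** i * f for i, f in enumerate(face_numbers))

# Face numbers f_0, ..., f_d; the last entry counts the polyhedron itself.
cube = [8, 12, 6, 1]               # bounded, d = 3
tetrahedron = [4, 6, 4, 1]         # bounded, d = 3
four_simplex = [5, 10, 10, 5, 1]   # bounded, d = 4
octant = [1, 3, 3, 1]              # unbounded, d = 3: x, y, z >= 0
wedge = [1, 2, 1]                  # unbounded, d = 2: two half-planes

for name, f in [("cube", cube), ("tetrahedron", tetrahedron),
                ("4-simplex", four_simplex), ("octant", octant),
                ("wedge", wedge)]:
    print(name, euler_sum(f))
```

The bounded examples evaluate to 1 and the unbounded ones to 0, in agreement with the two cases of the relation.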
Bibliographic notes. Most of the material in this section is taken from [2], where the inclusion-exclusion approach to measuring the union of balls is laid out. As demonstrated, this principle also yields the Euler relation for convex polyhedra. The discovery of that relation for convex polyhedra in three dimensions is usually attributed to Leonhard Euler [3, 4], although there is evidence that René Descartes knew about it a century earlier. There are many proofs of that relation, and the historically first one for the general $d$-dimensional case goes back to the work of Ludwig Schläfli [7] in the middle of the nineteenth century. He implicitly assumes that the boundary complex of every convex polyhedron is shellable, which was not established until 1972 by Bruggesser and Mani [1], who thus filled the gap left in Schläfli's proof.

We note that all authors of papers referenced in this section are Swiss, except for one who has a Swiss grandmother. Indeed, finding elementary proofs of the Euler relation for convex polyhedra seems to be a favorite topic for Swiss mathematicians [5, 6].
[1] H. BRUGGESSER AND P. MANI. Shellable decompositions of cells and spheres. Math. Scand. 29 (1972), 197–205.
[3] L. EULER. Elementa doctrinae solidorum. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 109–140.
[4] L. EULER. Demonstratio nonnullarum insignium proprietatum, quibus solida hedris planis inclusa sunt praedita. Novi Comm. Acad. Sci. Imp. Petropol 4 (1752/53), 140–160.
[5] H. HADWIGER. Eulers Charakteristik und kombinatorische Geometrie. J. Reine Angew. Math. 194 (1955), 101–110.
[6] W. NEF. Zur Einführung der Eulerschen Charakteristik. Monatsh. Math. 92 (1981), 41–46.
[7] L. SCHLÄFLI. Theorie der vielfachen Kontinuität. Written 1850–52 and published in Denkschrift der Schweizerischen naturforschenden Gesellschaft 38 (1901), 1–237.
IX.2 Volume and Surface Area

In this section, we use the indicator functions developed
in Section IX.1 to derive inclusion-exclusion formulas for
the volume, area, and total arc length of a space-filling
diagram.
Volume by integration. By definition, the indicator function of a geometric set is 1 inside and 0 outside the set. We can therefore compute its volume by integration. Consider for example a bounded convex body $A$ and a convex polyhedron $P$. Let $\mathcal{L}$ be the system of subsets of $H$ that appears in the statement of the Pie Theorem B in the last section. The volume of the intersection of the two convex bodies is

$$\mathrm{vol}\,(A \cap P) = \sum_{L \in \mathcal{L}} (-1)^{\mathrm{card}\,L}\, \mathrm{vol}\,\Big(A \cap \bigcap_{h \in L} \bar{h}\Big),$$

where $\bar{h}$ is the closed complement of the half-space $h$. Assuming general position, the sets $L$ contain $d$ or fewer half-spaces each. For measuring molecules, we are mostly interested in the case $d = 4$, in which the volume is a sum of terms each involving four or fewer half-spaces.

In $d = 3$ dimensions, the above formula gives a proof of the area formula for spherical triangles. Recall that $\mathbb{S}^2$ is the unit sphere centered at the origin $0$. Let $H$ be a set of three half-spaces whose bounding planes pass through 0. The half-spaces intersect in an unbounded triangular cone, and the intersection with the ball bounded by $\mathbb{S}^2$ is a pyramid whose base is a spherical triangle, as shown in Figure IX.6. Let $\alpha$, $\beta$, and $\gamma$ be the dihedral angles between the planes, or equivalently, the angles of the spherical triangle. The volume of the pyramid can now be computed by taking the ball, subtracting three half-balls, adding three sectors, and subtracting the reflected pyramid,

$$\mathrm{vol} = \frac{4\pi}{3} - 3 \cdot \frac{2\pi}{3} + \frac{2}{3}(\alpha + \beta + \gamma) - \mathrm{vol}.$$

It follows that the volume is

$$\mathrm{vol} = \frac{1}{3}(\alpha + \beta + \gamma - \pi).$$

The area of the spherical triangle is three times the volume divided by the radius of the sphere. That radius is one, which implies that the area of the spherical triangle is $\alpha + \beta + \gamma - \pi$.

Figure IX.6: A pyramid cut out of a ball by three half-spaces.

Stereographic projection. We now turn to the problem of measuring the union of a finite set $B$ of balls in $\mathbb{R}^3$. We transform the question into one about half-spaces in $\mathbb{R}^4$. Let $\mathbb{S}^3$ be the unit 3-sphere with center at the origin and identify $\mathbb{R}^3$ with the hyperplane $x_4 = 0$. Call $N = (0, 0, 0, 1)$ the north-pole of $\mathbb{S}^3$. The stereographic projection maps a point $x \in \mathbb{S}^3 - \{N\}$ to the point in $\mathbb{R}^3$ collinear with $x$ and $N$. The map is bijective and therefore has an inverse. If applied to all points of a ball in $\mathbb{R}^3$, we get a cap of $\mathbb{S}^3$, which is the intersection of the 3-sphere with a half-space $\bar{h}$. This is illustrated in Figure IX.7. The half-space $\bar{h}$ lies on the side of its hyperplane that does not contain the north-pole, so its closed complement $h$ does contain $N$. Let $H$ be the collection of half-spaces $h$ that contain the north-pole. Then $\bigcup B$ is the stereographic projection of the portion of $\mathbb{S}^3$ that is not contained in the interior of $P = \bigcap H$.

Figure IX.7: Stereographic projection from the sphere to the hyperplane.

Union of balls. Instead of computing the volume of $\bigcup B$ directly, we compute the volume of the portion of a 4-ball outside $P$. Let $\mathbb{B}^4$ be the 4-ball bounded by $\mathbb{S}^3$ and $\mathcal{L}$ the system of subsets of $H$ that appears in the Pie Theorem B. The volume of the portion of $\mathbb{B}^4$ outside the polyhedron is

$$\mathrm{vol}\,(\mathbb{B}^4 - P) = \mathrm{vol}\,\mathbb{B}^4 - \mathrm{vol}\,(\mathbb{B}^4 \cap P) = \sum_{\emptyset \neq L \in \mathcal{L}} (-1)^{\mathrm{card}\,L - 1}\, \mathrm{vol}\,\Big(\mathbb{B}^4 \cap \bigcap_{h \in L} \bar{h}\Big).$$

We could now get a formula for $\mathrm{vol} \bigcup B$ by scaling the volume by the distortion factor of the stereographic projection. A more straightforward derivation of a formula for the ball union translates the inclusion-exclusion formula from $\mathbb{R}^4$ to $\mathbb{R}^3$. Instead of the system of half-spaces we now use a system of balls obtained by substituting each ball for its half-space. For convenience, we use the same notation, namely $\mathcal{L}$ for the system of balls and $L$ for a generic set in $\mathcal{L}$.

PIE VOLUME FORMULA. The volume of the union of a finite set of balls is

$$\mathrm{vol} \bigcup B = \sum_{\emptyset \neq L \in \mathcal{L}} (-1)^{\mathrm{card}\,L - 1}\, \mathrm{vol} \bigcap L.$$

Dual complex, revisited. We observe that the index system in the Pie Volume Formula is an abstraction of the dual complex $K$ of $B$. Instead of proving this algebraically, we explain the connection in geometric pictures. Start with $\mathbb{R}^3$ and $\mathbb{S}^3$ embedded in $\mathbb{R}^4$ as suggested in Figure IX.7. For each ball we get a half-space, and the intersection of the half-spaces is a convex polyhedron $P$, which contains the north-pole in its interior. Use the projection from the north-pole to map the boundary complex of $P$ to $\mathbb{R}^3$. This is the weighted Voronoi diagram of $B$. A subset $L \subseteq H$ belongs to $\mathcal{L}$ iff its corresponding face of $P$ has non-empty intersection with the ball bounded by $\mathbb{S}^3$. But this is also the condition for the projection of the face to have non-empty intersection with the interior of $\bigcup B$. Hence, a non-empty set of half-spaces is in $\mathcal{L}$ iff the corresponding set of balls defines a simplex in the dual complex. We have arrived at a simple interpretation of the Pie Volume Formula: construct the dual complex of $B$ and do inclusion-exclusion with a term for every simplex in the dual complex. This is illustrated in Figure IX.8.

Figure IX.8: The area of the union is the sum of eight disk areas minus the sum of nine pairwise intersection areas, plus the sum of two triple-wise intersection areas.

Area and length. Similar to volume, we get a Pie Area Formula for the surface area of $\bigcup B$,

$$\mathrm{area} \bigcup B = \sum_{L \in \mathcal{L}} (-1)^{\mathrm{card}\,L - 1}\, \mathrm{area} \bigcap L.$$

For $L = \emptyset$ we get $\bigcap L = \mathbb{R}^3$ and therefore a zero contribution to the area. To prove this formula, we add the contributions of individual spheres. For a single sphere, we use the Pie Volume Formula on the set of caps defined by intersecting balls. Since the caps are two-dimensional, the volume formula becomes an area formula. Letting $S$ be the sphere and $C$ the set of caps, the area of $S - \bigcup C$ is the area of $S$ minus the alternating sum of the areas of cap intersections,

$$\mathrm{area}\,(S - \bigcup C) = \mathrm{area}\, S - \sum_{\emptyset \neq M \in \mathcal{M}} (-1)^{\mathrm{card}\,M - 1}\, \mathrm{area} \bigcap M,$$

where $\mathcal{M}$ is the abstraction of the dual complex of $C$. For each set of caps in the system $\mathcal{M}$, we have the corresponding set of balls together with the ball of $S$ in the system $\mathcal{L}$ of $B$. By summing over all balls, we get the Pie Area Formula given above.

Similarly, we can get a Pie Length Formula that measures the total length of the circular arcs in the boundary of the union of balls,

$$\mathrm{length} \bigcup B = \sum_{L \in \mathcal{L}} (-1)^{\mathrm{card}\,L}\, \mathrm{length} \bigcap L.$$

The sets with one or no half-space are redundant because $\mathrm{length} \bigcap L = 0$ in these cases. The proof of the formula is similar to the one for area, except that the summation is done over all circles that are intersections of two spheres forming a pair in $\mathcal{L}$. For each such circle, we apply the (one-dimensional) Pie Volume Formula and thus get an expression whose terms correspond to the simplices in the star of the pair.

We might even go one step further and consider the number of vertices of $\bigcup B$. The inclusion-exclusion formula suggests that this number is the alternating sum of vertex numbers of common intersections of balls. For each triple in $\mathcal{L}$ we have a three-sided spindle with two vertices, and for each quadruple we have a rounded tetrahedron with four vertices. For two or fewer balls we have no vertices. It follows that in the generic case, the number of vertices of $\bigcup B$ is twice the number of triangles minus four times the number of tetrahedra in the dual complex.
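The inclusion-exclusion over the dual complex can be sketched generically: one signed term per simplex, with the sign given by the parity of the number of balls. The sketch below is a minimal illustration and not the Alpha Shapes implementation. It uses a one-dimensional analogue, in which balls are intervals and the measure is length, because the intersection measure is then trivial to compute. The dual complex is written down by hand, and all names (`pie_sum`, `common_length`, `corner_count`) are hypothetical. The same function reproduces the vertex count by using the number of vertices of a common intersection as the measure.

```python
from itertools import combinations
from math import isclose

def pie_sum(simplices, measure):
    """Sum of (-1)^(card - 1) * measure(simplex) over all simplices."""
    total = 0.0
    for simplex in simplices:
        sign = 1 if len(simplex) % 2 == 1 else -1
        total += sign * measure(simplex)
    return total

# One-dimensional analogue: three intervals whose union is [0, 4].
intervals = {"a": (0.0, 2.0), "b": (1.0, 3.0), "c": (2.5, 4.0)}
# Dual complex written down by hand: three vertices and two edges.
dual_complex_1d = [("a",), ("b",), ("c",), ("a", "b"), ("b", "c")]

def common_length(simplex):
    """Length of the common intersection of the named intervals."""
    lo = max(intervals[name][0] for name in simplex)
    hi = min(intervals[name][1] for name in simplex)
    return max(0.0, hi - lo)

union_length = pie_sum(dual_complex_1d, common_length)

def corner_count(simplex):
    """Vertices of a generic common intersection: 2 for a triple, 4 for a quadruple."""
    return {3: 2, 4: 4}.get(len(simplex), 0)

# Four independent balls: the dual complex is a full tetrahedron.
full_tetrahedron = [c for k in range(1, 5) for c in combinations("abcd", k)]
corners = pie_sum(full_tetrahedron, corner_count)

print(union_length, corners)
```

For four independent balls the count is twice four triangles minus four times one tetrahedron, which is four vertices.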
Bibliographic notes. In 1992, Naiman and Wynn proved that the volume of a finite union of congruent balls can be expressed by an inclusion-exclusion formula whose terms correspond to the simplices in the Delaunay triangulation of the centers [4]. Edelsbrunner generalized the formula to allow for balls of different sizes and strengthened it by using the dual complex as the index system [1]. The material in this section is taken from that paper. The proof of the volume formula uses the inverse of the stereographic projection to transform balls in $\mathbb{R}^3$ to half-spaces in $\mathbb{R}^4$. That projection is conformal (preserves angles) and has a number of other nice properties, many of which can be found in the book by Thurston [5].
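The transformation from balls to half-spaces can be made concrete. The sketch below assumes the usual coordinates, with $\mathbb{R}^3$ identified with the hyperplane $x_4 = 0$ and the north-pole at $(0, 0, 0, 1)$. The function names and the representation of a half-space as a pair `(a, c)` meaning $a \cdot p \le c$ are choices made for this illustration only.

```python
def project(p):
    """Stereographic projection of a point p != N on the unit 3-sphere to R^3."""
    x1, x2, x3, x4 = p
    s = 1.0 / (1.0 - x4)
    return (s * x1, s * x2, s * x3)

def lift(x):
    """Inverse projection: the point on the 3-sphere collinear with x and N."""
    n2 = sum(c * c for c in x)
    s = 2.0 / (n2 + 1.0)
    return (s * x[0], s * x[1], s * x[2], (n2 - 1.0) / (n2 + 1.0))

def ball_to_halfspace(center, radius):
    """Half-space a . p <= c in R^4 whose intersection with the 3-sphere
    is the lifted image (the cap) of the ball with given center and radius."""
    z2 = sum(c * c for c in center)
    a = (-2.0 * center[0], -2.0 * center[1], -2.0 * center[2],
         1.0 - z2 + radius * radius)
    c = -(1.0 + z2 - radius * radius)
    return a, c

def in_halfspace(p, halfspace, eps=1e-12):
    a, c = halfspace
    return sum(ai * pi for ai, pi in zip(a, p)) <= c + eps

h = ball_to_halfspace((1.0, 0.0, 0.0), 1.0)
print(in_halfspace(lift((1.5, 0.0, 0.0)), h))      # point inside the ball
print(in_halfspace(lift((3.0, 0.0, 0.0)), h))      # point outside the ball
print(in_halfspace((0.0, 0.0, 0.0, 1.0), h))       # the north-pole
```

The half-space returned here is the one containing the cap, so the north-pole lies outside it. Its closed complement is the half-space that contains the north-pole.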

Just as a union of balls in $\mathbb{R}^3$ corresponds to a convex polyhedron in $\mathbb{R}^4$, a union of intersections of balls corresponds to a union of intersections of half-spaces. The latter is Hadwiger's notion of a not necessarily convex polyhedron [3]. Inclusion-exclusion formulas for such polyhedra can be found in [2].

[2] H. EDELSBRUNNER. Algebraic decomposition of non-convex polyhedra. In “Proc. 36th Ann. IEEE Sympos. Found. Comput. Sci., 1995”, 248–257.
[3] H. HADWIGER. Vorlesungen über Inhalt, Oberfläche und Isoperimetrie. Springer, Berlin, 1957.
[4] D. Q. NAIMAN AND H. P. WYNN. Inclusion-exclusion Bonferroni identities and inequalities for discrete tube-like problems via Euler characteristics. Ann. Statist. 20 (1992), 43–76.
[5] W. P. THURSTON. Three-Dimensional Geometry and Topology, Volume 1. Edited by S. Levy, Princeton Univ. Press, New Jersey, 1997.
IX.3 Void Formulas

ery subset , there is a point inside every disk

in the subset and outside every disk not in the subset,

This section derives another collection of inclusion-
. This condition is equivalent to the

exclusion formulas that express the volume, surface area,
and arc length of a union of balls in . The new collec-
three circles decomposing into eight regions in the way
shown in Figure IX.10. Let , , and be the angles at the

tion leads to formulas for voids, which are bounded com-
ponents of the space outside the union.
c
Angles of revolution. A (one-dimensional) angle is by

definition the length of a unit circle arc and can assume c
any value between 0 and . A two-dimensional angle is a b a b
the area of a piece of the unit 2-sphere and can assume
any value between 0 and . It is convenient to normal-
ize so that in both cases the full angle is 1 and every an-
gle is a fraction of the full angle. This definition can be Figure IX.10: Both triangles are spanned by the centers of three
independent disks.

used in any dimension . For example, the 0-sphere is a
4 4 I4 4 4 4
pair of points with possible subsets the empty set, a single
vertices , , and . The left drawing suggests that the area
point, or both points. The only zero-dimensional angles

4 4 4
of the triangle is ,
are therefore 0, , and 1, and we will see shortly that this

4
where we write for the area of the disk with center ,
convention makes perfect sense when we compute volume
for the area of the intersection of the disks with centers

I4 4 4 4
using angles.
and , and so on. If we change the meaning from area to

Consider for example a tetrahedron . For each face perimeter we get .

, we define the angle as the fraction of directions Both formulas hold whenever the three disks are indepen-

around along which we enter . Equivalently, is the dent, but the right drawing in Figure IX.10 indicates that

volume fraction of a sufficiently small ball centered at an there are cases where the formulas are not as obvious as to

interior point of that lies inside the tetrahedron. Figure the left.
IX.9 illustrates the definition. In we refer to the two-
We generalize the formulas for independent triangles to
independent tetrahedra. To simplify the notation, we drop

the distinction between abstract and geometric simplices.
Specifically, we let denote an independent set of four
balls and, at the same time, the tetrahedron spanned by the
four ball centers. We use similar conventions for triangles,
edges, and vertices.

I NDEPENDENT VOLUME F ORMULA . The volume of an
independent tetrahedron is

. =
Figure IX.9: The solid angle at a vertex, the dihedral angle at an
edge, and the zero-dimensional angle of a triangle.
The proof of the formula is somewhat technical and
dimensional angle at a vertex as a solid angle, and the omitted. Similar to the two-dimensional case, we get
one-dimensional angle at an edge as a dihedral angle. sums that evaluate to zero if we replace volume by area
The zero-dimensional angle of a triangle is always . For or length,

convenience, we also define the angles of the improper
faces of as and .

.

.

Independent triangles and tetrahedra. Recall that a
collection of three disks in is independent if for ev-
Angle weights. We derive a new volume formula for a the same formulas for area and length, except that the first
union of balls by combining the Pie Volume and the In- sum vanishes:
dependent Volume Formulas. We first make the Pie Vol-

.

ume Formula more complicated and then simplify by can-
.
celling terms. It is convenient to cover the portion of
outside the Delaunay triangulation with tetrahedra. This

can be done by adding four points viewed as degenerate
balls to the set . We start with the Pie Volume Formula,
=

.

=

Voids. As defined earlier, a void of a union of balls is a
bounded component of the complement space, .

Figure IX.11 illustrates the fact that every void of is

and decompose into the parts defined by the tetrahedra contained in a void of . From a point inside the void,
that contain as a face,
=

We need some notation to continue. Let denote the set

of tetrahedra in a simplicial complex . Furthermore, for

a subcomplex , let denote the collection of

pairs with and . With this notation
we can rewrite the Pie Volume Formula as
=

. =

where is the Delaunay triangulation of . For example
=
for a tetrahedron , the only coface in is , Figure IX.11: Both voids in the union of disks is contained in a

the angle is , and the contributed term is , corresponding void of the dual complex.
as before. For triangles, edges, and vertices , the contri-

bution is split up into as many pieces as there are angles
around . Whenever is a tetrahedron in , we use the the union of balls looks a lot like from a point outside all
balls and voids. It is therefore not surprising that we can
Independent Volume Formula to make a substitution. This
rewrite the Angle-weighted Pie Volume Formula to get an

results in the new volume formula. We write for . expression for the volume of a void of . The cor-
responding void in is triangulated by a subset of the
A NGLE - WEIGHTED P IE VOLUME F ORMULA . The
vol- Delaunay triangulation. Strictly speaking, is not a tri-
ume of the union of a finite set of balls in is

angulation because it is not even a complex, missing the

simplices that bound the void in . The most straightfor-

.
ward translation of the angle-weighted formula suggests

we compute the volume of by first computing the vol-
=
ume of the corresponding void in and then subtracting
the volume of the fringe that reaches into that void.

VOID VOLUME F ORMULA . The volume of a void of
The new formula suggests we compute volume in two

with dual set is
steps. First we compute the volume of the underlying

=

space of itself, and second we add the volume of the
fringe, . Observe that not all pieces con-
.
sidered in the second sum are subsets of the fringe; some =

might reach into the interior of . Nevertheless, the
second sum is exactly the volume of the fringe. We get

and the total arc
Similarly, we get formulas for the area

to radius
. The first complex is the sequence is

length of by substituting for in the corresponding

and the last is , hence
as required by (ii).
Define

formulas of : and note that the underlying space of

.

is the void in that corresponds to the void in .

are contained in and

By choice of , the balls in

.
thus cannot contribute to the union of balls in any other

way than covering , as required by (iii).

Bibliographic notes. The material of this section is
Proof of void volume formula. The main idea in the taken from [1], which also contains a proof of the -
proof is to cover the void with small balls and measure the dimensional version of the Independent Volume Formula.
The implementation of the formulas is part of the Alpha
difference between the new and the old union. Let be

the set of balls we add, and consider

, , Shapes software and their use in structural biology has
been described in [2]. The Angle-weighted Pie Volume
and . We require that
Formula is related to Gram’s angle sum formula, which
states that the alternating sum of angles in a bounded con-
(i)
be finite,

vex polyhedron always vanishes,
(ii)
(iii)

be a subcomplex of
.
,

.

faces

= =
Assuming these three conditions, we have

In , this implies that the sum of angles at the vertices of

. The Angle-weighted Pie Volume For-
mulas for the two unions are

=
a convex -gon is , for the edges, minus 1, for the -

gon. Expressed in radians, this is

.

In $\mathbb{R}^3$, the sum of angles at the vertices is no longer determined by the combinatorial structure of the polyhedron,

but the sum of solid angles minus the sum of dihedral an-

=
gles is. A treatment of Gram’s angle sum formulas can be
=

found in Grünbaum [3, chapter 14].
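Gram's relation is easy to test numerically for a tetrahedron. The sketch below uses normalized angles, as in this section: solid angles are fractions of $4\pi$, dihedral angles are fractions of $2\pi$, each triangle has angle $1/2$, and the tetrahedron itself has angle 1. The solid angle is computed with the Van Oosterom-Strackee formula, which is a standard formula chosen here for convenience and is not taken from the text. All function names are illustrative.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def solid_angle_fraction(apex, p, q, r):
    """Solid angle at apex spanned by p, q, r, as a fraction of 4 pi."""
    a, b, c = sub(p, apex), sub(q, apex), sub(r, apex)
    la, lb, lc = norm(a), norm(b), norm(c)
    num = abs(dot(a, cross(b, c)))
    den = la * lb * lc + dot(a, b) * lc + dot(a, c) * lb + dot(b, c) * la
    return 2.0 * math.atan2(num, den) / (4.0 * math.pi)

def dihedral_angle_fraction(e0, e1, p, q):
    """Interior dihedral angle at edge e0 e1, as a fraction of 2 pi."""
    axis = sub(e1, e0)

    def perp(w):
        # Component of w perpendicular to the edge.
        t = dot(w, axis) / dot(axis, axis)
        return tuple(wi - t * ai for wi, ai in zip(w, axis))

    u, v = perp(sub(p, e0)), perp(sub(q, e0))
    cosine = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cosine))) / (2.0 * math.pi)

def gram_sum(tet):
    """Alternating sum of angles: vertices - edges + triangles - tetrahedron."""
    idx = range(4)
    solid = sum(solid_angle_fraction(tet[i], *[tet[j] for j in idx if j != i])
                for i in idx)
    dihedral = 0.0
    for i in idx:
        for j in idx:
            if i < j:
                k, l = [m for m in idx if m not in (i, j)]
                dihedral += dihedral_angle_fraction(tet[i], tet[j], tet[k], tet[l])
    return solid - dihedral + 4 * 0.5 - 1.0

corner = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
skew = [(0, 0, 0), (2, 0, 0), (0.3, 1.5, 0), (0.4, 0.2, 1.1)]
print(gram_sum(corner), gram_sum(skew))
```

Both sums vanish up to rounding, although the individual solid and dihedral angles of the two tetrahedra differ.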

. =


[2] H. E DELSBRUNNER , M. A. FACELLO , P. F U AND J.
The difference gives the Void Volume Formula. L IANG . Measuring proteins and voids in proteins. In “Proc.
28th Ann. Hawaii Internat. Conf. System Sciences, 1995”,
Finally, we construct so that (i), (ii), and (iii) are vol. V: Biotechnology Computing, 256–264.

satisfied. Assuming general position, there exists a posi-

tive with , where is obtained from
by reducing every ball with radius to radius .
[3] B. G R ÜNBAUM . Convex Polytopes. Wiley, Interscience,
London, England, 1967.
and have the same Voronoi diagrams and Delaunay
triangulations by the way we changed the radii, and they

have the same dual complexes by the choice of . Let
be a finite set of balls of radii with centers in the void

that covers . Let be the set of centers
and note that

the dual complex of is just together with
finitely many isolated vertices. Hence,

where the second containment follows because is

obtained from by growing every ball of radius
IX.4 Measuring Software

[Should we add a short discussion of Patrice's new software that also computes derivatives?] Volbl stands for the Volume of a union of balls. It is part of the Alpha Shapes software and can be used to compute the volume, surface area, and total arc length of a ball union and its voids.

Running volbl. The software uses the files generated by delcx and by mkalf that represent the Delaunay triangulation and its filtration, as explained in Sections II.3 and II.4. It is not necessary but a good idea to execute volbl in parallel with visualizing the alpha shapes of the same data, which we do by typing

    > alvis name &
    > volbl name

on the command line. The software will start with a dialogue narrowing down the options of what to compute. As an example consider the measurements of the voids in cdk2, which is an enzyme involved in the control of the growth process of a body cell. The voids shown in Figure IX.12 occur for the solvent accessible diagram defined for Å. In other words, we look at the wire-frame of the dual complex defined by the balls with radii , where $r_i$ is the van der Waals radius of the $i$-th ball. After entering the index of the $\alpha$-complex, which we get from alvis, we pick the middle of the corresponding interval of $\alpha$-values. Measuring voids takes about seconds on the author's SGI Indigo II, and volbl outputs the measurements of all voids. The output for the largest void in this example is

    measurements of void, index 845:
    number of tetrahedra: 26
    tetra volume: 2.504511e+02
    void volume: 1.009809e+01
    surface area: 3.880316e+01
    arc length: 5.776804e+01
    number of corners: 34

The index of the void is a unique but fairly arbitrary integer assigned during the process of collecting the tetrahedra in the dual set. The measurements are in Å³, Å², and Å, as appropriate. While the largest void is more than ten times as large as any of the others (in volume), it is still only of the order of one van der Waals ball. The corresponding void in the dual complex is more than twenty times as large, which confirms our intuition about the size difference between the two representations. While measuring the voids, the software calculates for each ball its contribution to the void area and outputs the result in a new file, name.contrib.

Figure IX.12: There are eight voids in the $\alpha$-complex of cdk2, for Å. Some of the voids have (open) dual sets that seem connected in the image but are not because of missing triangles.

Before exploring any of the other options in volbl, we take a brief look at the algorithms used and the data structures these algorithms require.

Algorithms and data structures. To measure a union of balls using the Pie Volume, Area, and Length Formulas, we need a list of the simplices in the dual complex $K$ of $B$. This list is a prefix of the masterlist mentioned in Section II.4. We simplify the actual situation insignificantly by assuming that the simplices in $K$ are stored in an array $\sigma_1, \sigma_2, \ldots, \sigma_m$. The following pseudo-code is then a direct implementation of the Pie Volume Formula of Section IX.2.

    V := 0;
    for i := 1 to m do
        V := V + (-1)^{dim σ_i} · vol ⋂ σ_i;
    endfor.

The implementation of the Area and Length Formulas is similarly straightforward.

                             vol    area   lgth   crns
    space-filling diagram    Vsf    Asf    Lsf    Csf
    voids                    Vtv    Atv    Ltv    Ctv
    outside fringe           Vof    Aof    Lof    Cof
    envelope                 Ve     Ae     Le     Ce
    dual complex             Vsh
    dual sets of voids       Vtiv

    Table IX.1: Cumulative measurements made by the Volbl software.

The Angle-weighted Pie and Void Volume Formulas use the masterlist and in addition require a representation of the voids. We use a partition of the Delaunay tetrahedra into the dual complex, the various voids, and the set of tetrahedra in the unbounded component of the complement of the dual complex. Each void is represented by a linear list of tetrahedra. We compute the lists by maintaining a union-find data structure while scanning the masterlist $\sigma_n, \sigma_{n-1}, \ldots, \sigma_{m+1}$ from back to front.

    for i := n downto m + 1 do σ := σ_i;
        case σ is a tetrahedron. ADD(σ);
        case σ is a triangle. let τ be the first and υ the second
            Delaunay tetrahedron that has σ as a face;
            UNION(FIND(τ), FIND(υ))
    endfor.
The only trouble with this algorithm is that tetrahedra in the unbounded component may be scattered in more than one list. We fix this problem by adding a dummy tetrahedron to the system and using it as the second tetrahedron whenever the triangle lies on the boundary of the Delaunay triangulation.
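The scan with the dummy tetrahedron can be sketched as follows. This is a minimal illustration under assumed data structures, not the volbl implementation: the input `scan` lists the simplices outside the dual complex in back-to-front masterlist order as `(kind, name)` pairs, and `cofaces` maps each triangle to the one or two Delaunay tetrahedra that contain it. Both structures and all names are hypothetical.

```python
class UnionFind:
    """Disjoint sets with path compression."""

    def __init__(self):
        self.parent = {}

    def add(self, x):
        self.parent.setdefault(x, x)

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:      # compress the path to the root
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry

OUTSIDE = "outside"   # dummy tetrahedron standing for the unbounded region

def collect_voids(scan, cofaces):
    """Group the tetrahedra outside the dual complex into voids and the
    unbounded component."""
    uf = UnionFind()
    uf.add(OUTSIDE)
    for kind, name in scan:
        if kind == "tet":
            uf.add(name)
        elif kind == "tri":
            tets = list(cofaces[name])
            first = tets[0]
            # A boundary triangle has only one coface; pair it with the dummy.
            second = tets[1] if len(tets) == 2 else OUTSIDE
            uf.add(first)
            uf.add(second)
            uf.union(first, second)
    groups = {}
    for tet in list(uf.parent):
        groups.setdefault(uf.find(tet), []).append(tet)
    outside = [t for t in groups.pop(uf.find(OUTSIDE)) if t != OUTSIDE]
    voids = sorted(sorted(g) for g in groups.values())
    return voids, sorted(outside)

# Tetrahedra come before their triangles when scanning back to front.
scan = [("tet", "T4"), ("tet", "T3"), ("tet", "T2"), ("tet", "T1"),
        ("tri", "f34"), ("tri", "fb"), ("tri", "f12")]
cofaces = {"f34": ["T3", "T4"], "fb": ["T4"], "f12": ["T1", "T2"]}
voids, outside = collect_voids(scan, cofaces)
print(voids, outside)
```

In the example, `T1` and `T2` share a triangle missing from the dual complex and form one void, while `T3` and `T4` are connected to the dummy tetrahedron through the boundary triangle `fb` and end up in the unbounded component.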
The following pseudo-code is a direct implementation of
the Void Volume Formula of Section IX.3.
    V := 0;
    forall tetrahedra τ in the dual set of the void do
        V := V + vol τ;
        forall faces σ of τ do
            if σ ∈ K then
                V := V - (-1)^{dim σ} · φ_{σ,τ} · vol ⋂ σ
            endif
        endfor
    endfor.

The implementation of the Void Area and Length Formulas is similarly straightforward.

Figure IX.13: The dual complex of the van der Waals diagram of cdk2. The complex has vertices and no voids.

Options. The software computes the volume, area, length, and also the number of vertices in the boundary, which we refer to as corners. It does this for the space-filling diagram, its voids, the outside fringe (defined as the portion of the unbounded component of the complement of the dual complex that is covered by the balls), and the envelope (defined as the space-filling diagram union all voids). Table IX.1 lists the main measurements made. As an example consider the van der Waals diagram of cdk2, whose dual complex is shown in Figure IX.13. In the checking option, the software computes all terms in Table IX.1 and prints a summary of the results. In the considered example, it reports that there are no voids and it prints the sizes of the space-filling diagram and the outside fringe as

    Vsf = 3.034036e+04    Vof = 2.962563e+04
    Asf = 3.100959e+04    Aof = 3.100959e+04
    Lsf = 1.915391e+04    Lof = 1.915391e+04
    Csf = 6388            Cof = 6388

Note that the volume of the space-filling diagram is insignificantly higher than that of the outside fringe. The difference is the volume of the dual complex, which is apparently rather small. The surface area, total arc length, and number of corners are of course the same for both. The software also checks a few linear relations that should vanish provided the computations are correct. For example, the sum of volumes of the space-filling diagram and its voids should be equal to the volume of the envelope, which in turn should be equal to the sum of volumes of the dual complex, the voids in the dual complex, and the outside fringe. The specific relations checked by the software are

    Vsf + Vtv - Vtiv - Vsh - Vof = 0.0
    Asf - Atv - Aof = 0.0
    Lsf - Ltv - Lof = 0.0
    Csf - Ctv - Cof = 0
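The four relations are simple to evaluate once the measurements are available. The sketch below is not part of volbl. It stores the measurements in a dictionary keyed by the names of Table IX.1 and reports the residual of each relation. The values are those printed for cdk2, with two assumptions: the void terms are zero because no voids are reported, and `Vsh` is set to the difference `Vsf - Vof`, as the text states, since the program output for it is not quoted.

```python
def relation_residuals(m):
    """Left-hand sides of the four linear relations; each should vanish."""
    return {
        "volume": m["Vsf"] + m["Vtv"] - m["Vtiv"] - m["Vsh"] - m["Vof"],
        "area": m["Asf"] - m["Atv"] - m["Aof"],
        "length": m["Lsf"] - m["Ltv"] - m["Lof"],
        "corners": m["Csf"] - m["Ctv"] - m["Cof"],
    }

def relations_hold(m, tol=1e-6):
    """True if every residual is within tol of zero."""
    return all(abs(r) <= tol for r in relation_residuals(m).values())

cdk2 = {
    "Vsf": 3.034036e+04, "Vof": 2.962563e+04,
    "Asf": 3.100959e+04, "Aof": 3.100959e+04,
    "Lsf": 1.915391e+04, "Lof": 1.915391e+04,
    "Csf": 6388, "Cof": 6388,
    # Assumed: no voids, so all void terms are zero.
    "Vtv": 0.0, "Vtiv": 0.0, "Atv": 0.0, "Ltv": 0.0, "Ctv": 0,
}
# Assumed: the dual complex volume is the difference stated in the text.
cdk2["Vsh"] = cdk2["Vsf"] - cdk2["Vof"]

print(relation_residuals(cdk2))
print(relations_hold(cdk2))
```

A tolerance is used for the comparison because the measurements are floating-point sums over many terms.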
Another form of output is the description of the total mea-

surement as a sum of contributions over individual atoms.
whose edges are by definition great-circle
of that -gon is

arcs. The area
, where the sum adds

This makes sense for volume and area but is done only for all angles in the -gon. This is because a triangulation
the latter. Depending on the type of area measurement, produces spherical triangles each contributing one
the software outputs a file name.contrib that contains half times the sum of the three angles minus one quarter
to the area. To construct the -gon, we approximate each

the contribution of each individual atom. In the check-
ing option, the software compares for each atom the area of the two circles by a regular spherical -gon. The points

contribution to the space-filling diagram with the sum of are placed slightly outside the circles so that the areas of
the -gons are exactly the areas of the caps. Let and

contributions to the voids and the outside fringe. It also
be the angles in the two -gons. Assuming that and

checks whether the sum of contributions really adds up to

the total area, and it does this for the space-filling diagram, are rational, we can find infinitely many integers so that

the voids, and the outside fringe. the two -gons share two vertices near the vertices of the
bigon. We then have

. The angles at the
two shared vertices approach as goes to infinity. Fur-
Area formula. All analytic formulas needed to measure thermore, the -gon has
vertices with angle
the common intersection of up to four balls are straightfor-
ward, except possibly the area of the intersection of up to
and

vertices with angle . To compute we re-
call that the area of the cap is

. By construction,
three caps. A formula for the area follows from the Gauss-

the area of the approximating -gon is the same,
namely

Bonnet theorem in differential geometry, but we prefer to

. Hence

,

derive it with elementary means. The cap on a sphere
consists of the portion inside the sphere . Equivalently, and symmetrically

. We plug the values
for and into the formula for the area of and

the cap contains all points whose power distance from

get
is no less than that to ,
* * *

after eliminating the terms that vanish when goes to in-

Let be the radius of
and
the radius of the cir-
finity. Similarly, for the intersection of three caps with an-
gles , , and and arc , , and we get
for the area of ,
lengths
cle bounding . We define the width of equal to

. Note that the formulas give the precise

the distance between the two planes that cut from ,

, as illustrated in Figure IX.14.

where
The area of the cap is then times the area of the

sphere , which is

.
area of the intersection of two or three caps since the ap-
proximating spherical -gon is only a tool in the proof
and not used in the formula.
ri pj Bibliographic notes. The structural biology litera-

ρj
wj
ture distinguishes between numerical and analytical ap-
ϕ ϕ
proaches to measuring molecules. For the latter approach,
pk
we would decompose the molecule into simple pieces and
give a formula for the size of each piece. An example is
Connolly’s work [1] on computing the area of a molecu-
lar surface. The idea of using inclusion-exclusion for size
Figure IX.14: To the left, the shaded cap has radius and computations goes back to Kratky [4], who shows that
width . To the right, the shaded bigon has angles and and there is a short inclusion-exclusion formula for the area
arc lengths and . of the intersection of a finite set of disks in the plane. His
proof is existential and superceded by explicit formulas

Consider now the intersection of two caps. Since all that can be derived by the same methods as described in
simplices in
intersection
are independent, we may assume that the
is a bigon, as shown in Figure IX.14.
Sections IX.1 and IX.2. Scheraga and coauthors [5] imple-
ment an inclusion-exclusion formula for a union of balls
We let be the angle at the two vertices and and the based on Kratky’s work, but the lack of an explicit expres-
lengths of the two arcs, all measured as fractions of a full sion occasionally leads to miscalculations [2]. A detailed
circle. We approximate the bigon by a spherical -gon, documentation of the Volbl software is given in [3].
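The cap and bigon areas discussed under "Area formula" can be checked numerically. The Python sketch below is neither the elementary derivation used in the text nor code from the Volbl software; it applies the Gauss-Bonnet theorem directly, which the text names as an alternative route. All function and parameter names are hypothetical. As in the text, angles and arc lengths are measured as fractions of a full circle, the width of a cap is the distance between the two parallel planes that cut it from the sphere, and every result is a fraction of the area of the sphere.

```python
import math


def cap_fraction(r, w):
    """Fraction of a sphere of radius r covered by a cap of width w.

    A cap of width (height) w has area 2*pi*r*w and the sphere has
    area 4*pi*r**2, so the fraction is w / (2*r).
    """
    return w / (2.0 * r)


def cap_polygon_fraction(r, angles, arcs):
    """Fraction of the sphere covered by a disk-like region bounded by
    circular arcs, for example the intersection of two or three caps.

    Assumed inputs (hypothetical names, not the book's notation):
      angles -- interior angles at the k corners, as fractions of a full circle
      arcs   -- k pairs (length, width): the arc length as a fraction of its
                full circle, and the width of the cap bounded by that circle

    Gauss-Bonnet on a sphere of radius r gives, in these units,
      fraction = sum(angles)/2 - (k - 2)/4 - sum(length * (1 - width/r))/2.
    """
    k = len(angles)
    if len(arcs) != k:
        raise ValueError("need one arc per corner")
    # Total geodesic curvature of the boundary arcs, divided by 2*pi.
    turning = sum(length * (1.0 - width / r) for length, width in arcs)
    return 0.5 * sum(angles) - (k - 2) / 4.0 - 0.5 * turning


def bigon_fraction(r, phi, arc_j, arc_k):
    """Intersection of two caps: both corners have the same angle phi."""
    return cap_polygon_fraction(r, [phi, phi], [arc_j, arc_k])


if __name__ == "__main__":
    r = 2.0
    # A lune between two great circles (width = r) with a right angle
    # at each corner covers a quarter of the sphere.
    print(bigon_fraction(r, 0.25, (0.5, r), (0.5, r)))
    # Absolute area of a cap of width 0.5 on the same sphere.
    print(cap_fraction(r, 0.5) * 4.0 * math.pi * r * r)
```

With great-circle arcs (width equal to the radius) the curvature term vanishes and the three-corner case reduces to half the sum of the three angles minus one quarter, the spherical triangle rule quoted in the text. Multiplying a returned fraction by 4πr² gives the absolute area.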
[1] M. L. CONNOLLY. Analytical molecular surface calculation. J. Appl. Cryst. 16 (1983), 548–558.
[2] L. R. DODD AND D. N. THEODOROU. Analytic treatment of the volume and surface area of molecules formed by an arbitrary collection of unequal spheres intersected by planes. Molecular Physics 72 (1991), 1313–1345.
[3] H. EDELSBRUNNER AND P. FU. Measuring space filling diagrams and voids. Rept. UIUC-BI-MB-94-01, Beckman Inst., Univ. Illinois, Urbana, Illinois, 1994.
[4] K. W. KRATKY. The area of intersection of equal circular disks. J. Phys. A: Math. Gen. 11 (1978), 1017–1024.
[5] G. PERROT, B. CHENG, K. D. GIBSON, J. VILA, A. PALMER, A. NAYEEM, B. MAIGRET AND H. A. SCHERAGA. MSEED: a program for rapid determination of accessible surface areas and their derivatives. J. Comput. Chem. 13 (1992), 1–11.
Exercises


@

is tight for every

.
Chapter X
Derivatives
The derivative of surface area under deformation is an important term in the simulation of molecular and atomic motion. In the case of a van der Waals or solvent accessible diagram, it is related to the length of the circular arcs in the boundary.
X.1 Implicit Solvent Model

X.2 Weighted Area Derivative
X.3 Weighted Volume Derivative
X.4 Derivative Software
Exercises
X.1 Implicit Solvent Model

[Give a general introduction and work out the relationship
with area and volume derivatives.]
X.2 Weighted Area Derivative

[Talk about the unweighted and the weighted area derivatives.] [Explain the results and discuss the continuity issue of the functions.]
[1] R. BRYANT, H. EDELSBRUNNER, P. KOEHL AND M. LEVITT. The area derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2002.
X.3 Weighted Volume Derivative

[Talk about the unweighted and the weighted volume derivatives.] [Explain the results and discuss the continuity issue of the functions.]
[1] H. EDELSBRUNNER AND P. KOEHL. The weighted volume derivative of a space-filling diagram. Manuscript, Duke Univ., Durham, North Carolina, 2003.
X.4 Derivative Software

[Discuss Patrice’s ProShape software.]
Exercises


@

is tight for every

.
Subject Index

active site, 7
affine combination, 28
affine hull, 28
alpha complex, 21
alpha shape, 21
Alpha Shape software, 23
amino acid, 5
angle, dihedral, 103
    , solid, 103
area, 100
atom, 9
atomic number, 9
atomic weight, 9
attachment, 60

backbone, 5
barycentric coordinates, 65
basis (of a group), 51
Betti number, 51, 114
    , persistent, 57
body (inside a skin), 29
boundary group, 49
boundary homomorphism, 49
Brunn-Minkowski theorem, 116

canonical basis, 57
cell (in a complex), 60
central dogma, 1
chain, 48
chain complex, 49
chromosome, 3
closed ball property, 35
coaxal system, 29
codon, 5
coherent triangulation, 19
Connolly surface, 16
continuous function, 44
contractible, 45
convex combination, 28
convex hull, 28
convex polyhedron, 96
coordinate system, 60
Corey-Pauling-Koltun model, 16
coset, 48
critical point, 61
    , non-degenerate, 61
critical point theory, 59
curvature (of a curve), 32
    , Gaussian, 32
    , mean, 32
    , normal, 32
    , principal, 32
cycle group, 49

deformation retraction, 45
Delaunay triangulation, 23
    , restricted, 35
    , weighted, 18
diffeomorphism, 60
differential topology, 62
dihedral angle, 103
Dirichlet tessellation, 19
DNA (deoxyribonucleic acid), 2
dual complex, 20, 101
dual set, 69

edge contraction, 40
edge flip, 40
electron, 9
element, 9
ε-sampling, 36
Euler characteristic, 51, 96
Euler relation, 96
Euler-Poincaré theorem, 96
exact arithmetic, 24

face (of a polyhedron), 96
face (of a simplex), 48
facet, 96
filtration, 21, 24
fundamental theorem of linear algebra, 51

Gauss map, 32
Gaussian curvature, 32
gene, 3
genome, 2
geodesic, 32
gluing map, 60
Gouraud shading, 40
gradient, 63
graphical user interface, 23
group, 48

Helly’s theorem, 116
Hessian, 61
homeomorphism, 44
homology class, 49
homology group, 49
    , persistent, 57
homomorphism, 48
homotopic map, 44
homotopy equivalence, 44
homotopy type, 45
homotopy, 44

image (of a function), 48
inclusion-exclusion, 96
independent collection, 20
independent simplex, 20, 103
index (of a critical point), 61
indicator function, 96
integral line, 63
interval tree, 24
isomorphism, 48

Johnson-Mehl model, 16
join, 45

kernel, 48

length scale, 36
length, 100
Lennard-Jones function, 11
linear algebra, 62
linear independence, 51
lower star, 65

manifold, 60
map, 44
matrix (of a homomorphism), 55
mean curvature, 32
mesh, 35
metamorphosis, 41
Minkowski sum, 30, 116
mixed cell, 30
mixed complex, 30
molecular mechanics, 10
molecular skin, 27
molecular surface, 15
molecule, 9
Morfi software, 39
morphing, 84
Morse complex, 64
Morse function, 61
Morse theory, 59
Morse-Smale function, 64
mouth (of a pocket), 69

neutron, 9
NMR (nuclear magnetic resonance), 23
normal curvature, 32
normal form, 55
normal form algorithm, 55, 114
normal vector, 32
nucleotide, 2

open ball, 44
open set, 44
open set (of simplices), 72
orthogonal spheres, 18
orthosphere, 18, 22

parametrization, 60
partial order, 69
pdb-file, 23
pencil (of circles), 28
persistent Betti number, 57
persistent homology group, 57
piecewise linear, 65
pocket, 68
polyhedron, 102
    , convex, 96
potential energy, 11
power diagram, 17
power distance, 17
principal curvature, 32
principal simplex, 24
principle of inclusion-exclusion, 96
protein, 5
Protein Data Bank, 23
proton, 9

quotient group, 48

Ramachandran plot, 6
rank (of a group), 51
regular point, 61
regular simplex, 24
regular triangulation, 19
replication (of DNA), 3
residue, 5
restricted Delaunay triangulation, 35
restricted Voronoi diagram, 35
ribosome, 6
RNA (ribonucleic acid), 3

signature, 25, 71
simplex, 48
simplicial complex, 48
simulated perturbation, 24
singular simplex, 24
skin, 29
Skin Meshing software, 40
smooth manifold, 60
smooth map, 60
solid angle, 103
solvent accessible surface, 15
space-filling diagram, 14
specificity, 7
speed (of a curve), 32
spherical triangle, 100
stable manifold, 63
star, 65
stereographic projection, 100
subspace topology, 44
supporting hyperplane, 96

tangent space, 60
tangent vector, 32, 60
topological equivalence, 44
topological space, 44
topological subspace, 44
topological type, 44
topology, 44
transcription (of DNA to RNA), 4
transversal, 64
triangulation, 35, 48
    , coherent, 19
    , regular, 19
    , weighted Delaunay, 18

union-find, 56, 107
unstable manifold, 63

van der Waals potential, 10
van der Waals radius, 23
van der Waals surface, 15
vector field, 63
velocity vector, 32
vertex insertion, 40
void, 69, 104
Volbl software, 106
volume, 100
Voronoi diagram, additively weighted, 15
    , restricted, 35
    , weighted, 17

x-ray crystallography, 23
Author Index

Akkiraju, N., 16
Alberts, B., 8
Alexandrov, P. S., 22
Amenta, N., 38
Ashcroft, N. W., 11
Aurenhammer, F., 16

Bader, R. F. W., 79
Bajaj, C. L., 77
Banchoff, T. F., 65
Basch, J., 83
Berman, H. M., 26
Bern, M., 38
Besl, P. J., 93
Bhat, T. N., 26
Billera, L. J., 19
Bondi, A., 11
Bourne, P. E., 26
Bray, D., 8
Bronson, H. R., 8
Bruce, J. W., 34
Bruggesser, H., 99

Capoyleas, V., 117
Casati, R., 70
Cheng, B., 109
Cheng, H.-L., 34, 38, 42, 84, 87
Cheng, S.-W., 42, 84
Chew, L. P., 38
Chothia, C., 11
Clifford, W. K., 31
Connolly, M. L., 16, 109
Corey, R. B., 8
Cormen, T. H., 54, 114
Creighton, T. E., 8
Crick, F. H. C., 4

Darboux, M. G., 31
Darby, N. J., 8
Delaunay, B. (also Delone), 19
Delfinado, C. J. A., 54
Dey, T. K., 34, 38
Dirichlet, P. G. L., 19
Dodd, L. R., 109

Edelsbrunner, H., 16, 19, 22, 26, 31, 34, 38, 42, 46, 54, 58, 65, 70, 74, 76, 77, 82, 84, 87, 99, 102, 105, 109, 113, 114, 115
Eilenberg, S., 54
Euler, L., 32, 99

Facello, M. A., 70, 74, 105, 115
Feiner, S., 42
Feng, Z., 26
Foley, J., 42
Forman, R., 70, 115
Frobenius, G., 31
Fu, P., 16, 25, 42, 84, 87, 105, 109

Gauss, C. F., 19, 32
Gelbart, W. M., 4
Gelfand, I. M., 19
Gerstein, M., 11, 26, 93
Giblin, P. J., 22, 34, 50
Gibson, K. D., 109
Gilliland, G., 26
Grünbaum, B., 105
Griffith, A. J. F., 4
Gromov, M., 117
Guibas, L. J., 83
Guillemin, V., 62

Hadwiger, H., 99, 102
Harer, J., 65, 77
Helly, E., 117
Hughes, J., 42

Johnson, A., 8
Johnson, W. A., 16
Jorgensen, W. L., 11

Kapranov, M. M., 19
Kelley, J. E., 46
Kirkpatrick, D. G., 22
Klee, V., 117
Kratky, K. W., 109
Kuntz, I. D., 70

Lam, K. P., 42, 84
Leach, A. R., 11, 16
Lee, B., 16
Leiserson, C. E., 54, 114
Leray, J., 46
Letscher, D., 58, 76, 114
Levitt, M., 93
Lewis, J., 8
Lewontin, R. C., 4
Liang, J., 70, 74, 105, 115
London, F., 11

Mücke, E. P., 22, 26
Maigret, B., 109
Maillot, P.-G., 92
Mani, P., 99
Martinetz, T., 38
McCleary, J., 58
McKay, N. D., 93
Mehl, R. F., 16
Mendel, G., 4
Mermin, N. D., 11
Miller, J. H., 4
Milnor, J., 62
Morse, M., 62
Munkres, J. R., 46, 50, 58

Naiman, D. Q., 102
Nayeem, A., 109
Nef, W., 99

O’Neill, B., 34

Palmer, A., 109
Pascucci, V., 77
Pauling, L., 8
Pedoe, D., 31
Perrot, G., 109
Poincaré, H., 54
Pollack, A., 62

Qian, J., 16

Raff, M., 8
Ramachandran, G. N., 8
Ramos, E. A., 82
Richards, F. M., 16, 26
Rivest, R. L., 54, 114
Roberts, K., 8
Rotman, J. J., 50

Sasisekharan, V., 8
Schütte, K., 113
Scheraga, H. A., 109
Schey, H. M., 66
Schikore, D. R., 77
Schläfli, L., 99
Schneider, R., 117
Schulten, K., 38
Seidel, R., 22
Seifert, H., 46, 62
Shah, N. R., 38
Sharir, M., 113
Sherwood, E. R., 4
Shindyalov, I. N., 26
Smale, S., 66
Steenrod, N., 54
Stern, C., 4
Storjohann, A., 58
Strang, G., 62
Stryer, L., 8
Sturmfels, B., 19
Sullivan, J., 34, 38

Taylor, R., 11
Theodorou, D. N., 109
Threlfall, W., 46, 62
Thurston, W. P., 102
Tirado-Rives, J., 11
Tsai, J., 11

Van Dam, A., 42
Van der Waals, 11
Van der Waerden, B. L., 113
Van Krefeld, M., 77
Van Oostrum, R., 77
Varzi, A. C., 70
Veltkamp, R., 91
Vila, J., 109
Vleugels, J., 91
Voronoi, G., 19

Wagon, S., 117
Wallace, A., 62
Walter, P., 8
Wang, Y., 77
Watson, J. D., 4
Weissig, H., 26
Westbrook, J., 26
Will, H.-M., 16
Woodward, C., 74
Wynn, H. P., 102

Zelevinsky, A. V., 19
Zhang, L., 83
Zomorodian, A., 58, 65, 74, 76, 77, 114
