
Chapter 5
Basic Concepts in Human Molecular Genetics
Kara A. Mensink and W. Edward Highsmith Jr.

INTRODUCTION

Molecular diagnostics is the branch of laboratory medicine or clinical pathology that utilizes the techniques of molecular biology to diagnose disease, predict disease course, select treatments, and monitor the effectiveness of therapies. Molecular diagnostics is associated with virtually all clinical specialties and is a vital adjunct to several areas of clinical and laboratory medicine, but is most closely aligned with infectious disease, oncology, and genetics. The subject of this chapter is molecular genetics, which is concerned with the analysis of human nucleic acids as they relate to disease.

Since the completion of the first working draft of the human genome sequence in 2000 and the completion of the polished sequence in 2003, progress in molecular genetics has been swift and shows no signs of abating. Relatively few gene tests were clinically available in the late 1990s, whereas over 1,000 are available today. Further, molecular genetic testing has proven useful and robust enough to expand into population-based screening. Molecular testing serves as the final confirmatory test for several disorders included as part of expanded newborn screening programs, and in 2003, the American College of Medical Genetics and the American College of Obstetricians and Gynecologists recommended that population-based carrier screening for cystic fibrosis using molecular testing be implemented in the United States.

Molecular genetics as a discipline and as a clinical laboratory service does not exist in a vacuum. Rather, it is intimately tied to molecular and cell biology and the central paradigm of molecular biology: that genes code for proteins. Thus, it is through the analysis of genes that insight into the genesis of protein malfunction can be achieved. Such examination specifically entails an assessment of how the DNA sequence of a gene compares with its wild-type or normal sequence. Ultimately, protein malfunctions related to gene mutations lead to organ dysfunction and disease states. This chapter will review the fundamentals of molecular genetics, and is divided into five sections that review concepts intrinsic to molecular genetics. Where possible, comments on the direct clinical application of these concepts have been incorporated. The first section focuses on the molecular structure of DNA, DNA transcription, and protein translation. The second section focuses on molecular pathology, DNA replication, and DNA repair mechanisms. The third section provides a basic overview of transmission genetics. The fourth section highlights the relationship between genes, proteins, and phenotype and includes rationale for molecular genetic testing. The final section reviews allelic heterogeneity and corresponding choice of analytical methodology.

MOLECULAR STRUCTURE OF DNA, DNA TRANSCRIPTION, AND PROTEIN TRANSLATION

The human genome is composed of 3 billion base pairs of DNA. This is not present as one continuous piece of double-stranded DNA, but is distributed among 22 pairs of autosomal chromosomes and 2 sex chromosomes. The DNA is associated with a large number of proteins (histones and others) that serve regulatory functions and package the genetic material into these large chromosomal units. Chromosomes range in size from the 33.4 Mb of chromosome 22 to the 263 Mb of chromosome 1, which is the largest chromosome. Along the length of each chromosome, DNA is organized into linear domains consisting of genes (primarily nonrepetitive DNA), repetitive elements, and apparently functionless regions, much like beads on a string (Figure 5.1).

Approximately half of the human genome consists of repetitive DNA, while the other half consists of nonrepetitive sequence. Nonrepetitive DNA includes regulatory sequences, intronic sequence, and protein coding (exon) sequence. Protein coding regions account for a relatively small fraction of genes within the human

Molecular Pathology © 2009, Elsevier, Inc. All Rights Reserved.


Part II Concepts in Molecular Biology and Genetics

[Figure 5.1 panels. Top, organization of functional domains on a chromosome: heterochromatin, gene, gene, LINE, VNTR, gene, LINE, Alu, VNTR, rDNA. Bottom, organization of a typical gene: promoter, exon, intron, exon, intron, exon; 5′ untranslated region, coding sequence (three segments), 3′ untranslated region.]

Figure 5.1 Top: The functional domains of a chromosome; Bottom: Organizational structure of a typical gene.
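
The gene layout in the bottom panel of Figure 5.1 can be modeled as an ordered list of segments. The following Python sketch is illustrative only: the `Segment` class, the helper names, and every length are hypothetical values chosen for the example, and it makes the simplifying assumption that the promoter lies entirely outside the transcribed region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    kind: str        # "promoter", "exon", or "intron"
    length_bp: int   # hypothetical length in base pairs


# Hypothetical gene arranged like the bottom panel of Figure 5.1:
# promoter, then three exons separated by two introns.
GENE = [
    Segment("promoter", 200),
    Segment("exon", 150),
    Segment("intron", 1300),
    Segment("exon", 100),
    Segment("intron", 900),
    Segment("exon", 250),
]


def transcribed_length(segments):
    """Exons plus introns (assumption: the promoter is not transcribed)."""
    return sum(s.length_bp for s in segments if s.kind in ("exon", "intron"))


def exonic_length(segments):
    """Exons only: the sequence that remains once introns are removed."""
    return sum(s.length_bp for s in segments if s.kind == "exon")


if __name__ == "__main__":
    total = transcribed_length(GENE)
    exons = exonic_length(GENE)
    print(f"transcribed: {total} bp, exonic: {exons} bp "
          f"({exons / total:.1%} of the transcribed region)")
```

With these made-up lengths, most of the transcribed region is intronic, which is the qualitative point of the figure; the actual proportions vary widely from gene to gene.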

genome. In fact, it is estimated that only 6% of the human genome consists of protein coding, nonrepetitive DNA.

Repetitive DNA can be subdivided into several different categories or families. Generally, repetitive DNA tends to occur either in clusters of tandem repeats or as repetitive elements of various lengths dispersed throughout the genome. Clusters of tandem repeats can be localized to one or many locations. Such clusters are commonly referred to as satellite DNA. For example, α-satellite DNA is a clustered repetitive DNA sequence family that is localized to the centromeres of all human chromosomes. Repetitive sequences that are not localized to a particular area or areas of the genome are referred to as dispersed repetitive elements. The Alu family and LINE families are examples of dispersed repetitive elements.

Repetitive DNA is sometimes referred to as junk DNA because it does not code for an apparently active RNA transcript or functional protein. Although much remains to be discovered about the roles of so-called junk DNA, the label has proven to be a misnomer. Appreciation for the role of junk DNA in protein folding and localization, DNA packaging and chromosome structure, and regulation of gene expression is increasing.

Genes are found among the nonrepetitive DNA in the genome. Genes code for specific protein chains, each with a specific function in cell physiology. A gene is composed of regulatory elements, which determine where, when, and how a gene is transcribed, and coding regions, which are broken into segments termed exons (expressed sequences). An example of a regulatory element is the promoter, which is the site where gene transcription is initiated. The exons are separated by noncoding regions of DNA called introns (intervening sequences). The average gene is about 2.7 kb (2700 bp) of DNA in length. The smallest gene, H1A, is located on chromosome 6 and encodes a histone protein which functions, along with several other histone proteins, to compact DNA in the cell nucleus. H1A is 0.5 kb (500 bp) long and has no introns. One of the largest genes, DMD, encodes the dystrophin protein and is located on the X chromosome (Xp21). Mutations in the DMD gene are associated with X-linked recessive Duchenne muscular dystrophy. DMD measures 2.4 Mb (2,400,000 bp), consists of 79 exons, and takes at least 16 hours to transcribe. The size of a gene may influence the molecular diagnostic laboratory’s ability to design a clinical test for a particular disorder and certainly impacts the selection of the technology used to detect mutations.

Chemically, genes are composed of 2-deoxyribonucleic acid (DNA). DNA is a linear, nonbranching polymer of nucleotides. Repeating ribose and phosphate subunits form a backbone, and attached to each of the ribose moieties is a purine (adenine or guanine) or pyrimidine (thymine or cytosine) base. Following standard nomenclature for the naming of ring-containing compounds, the nitrogenous bases have their various carbon and heteroatom components numbered 1–6 (for the pyrimidines) or 1–9 (for the purines), and the ribose positions are indicated by numbers 1′–5′. The bases are attached to the ribose subunits at the 1′ position of the sugar molecule. The ribose subunits are joined by phosphodiester linkages between the 5′ position of one ribose and the 3′ position of the next (Figure 5.2). Thus, the molecule is not symmetrical and there is directionality implicit in a DNA strand. There is a 5′ end of a DNA strand and a 3′ end. Two DNA strands bind together to form the familiar double helical structure of double-stranded DNA (Figure 5.3). In order for a double helix to be stable, there must be a complementary base on the opposite strand for every base on a strand of DNA. The complementary pairs of bases are adenine and thymine (A:T) and guanine and cytosine (G:C). The two strands join in an antiparallel fashion (one strand runs 5′ to 3′ and the other 3′ to 5′). The ribose sugars form the scaffolding for the complementary nitrogenous bases connected by hydrogen bonds on the inside of the molecule. The DNA double helix is dynamic, and the weak hydrogen bonding between complementary bases allows the DNA strands to denature and reassociate easily. In the laboratory, the process of separating (denaturing) double-stranded DNA and then allowing


[Figure 5.2 labels: 5′ end; nucleotide (PO4, CH2, O); pyrimidine base; purine base; sugar-phosphate backbone; 3′ end.]

Figure 5.2 Schematic view of nucleotide structure and how nucleotides join to form the DNA polymer.
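
The A:T and G:C pairing rule and the antiparallel orientation of the two strands, as described in this section, can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory tool: the function names are hypothetical, strands are plain strings written 5′ to 3′, and only the four unmodified bases are handled.

```python
# Watson-Crick pairing partners: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Base-by-base complement, read in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand_5_to_3):
    """Partner strand written 5' to 3'.

    Because the strands are antiparallel, the partner of a 5'->3' strand
    is its complement read in the opposite direction.
    """
    return complement(strand_5_to_3)[::-1]


def can_pair(strand_a_5_to_3, strand_b_5_to_3):
    """True if two strands (both written 5'->3') are fully complementary."""
    return strand_b_5_to_3 == reverse_complement(strand_a_5_to_3)


if __name__ == "__main__":
    top = "ATGGCG"
    print(reverse_complement(top))      # CGCCAT
    print(can_pair(top, "CGCCAT"))      # True
    print(can_pair(top, "TACCGC"))      # False: complement not reversed
```

The last call shows why directionality matters: `TACCGC` is the complement of `ATGGCG` only when written 3′ to 5′, so treating it as a 5′ to 3′ strand fails the pairing check.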

the complementary single strands of DNA to reassociate and return to a double-stranded configuration is called hybridization. Many of the laboratory techniques central to molecular diagnostics hinge on hybridization and on the remarkable specificity with which a nonrepetitive sequence of bases in a single strand of DNA binds to its complementary sequence and no other. In vivo, the denaturing and reassociation of double-stranded DNA is inherent to the process of gene transcription.

Figure 5.3 Schematic view of the double helical structure of double-stranded DNA. Blue ribbons represent the sugar-phosphate backbone. Green/yellow and pink/lavender links represent complementary purine/pyrimidine pairs.

Transcription of DNA

Table 5.1 Five Types of RNA

Type of RNA                  Summary
mRNA                         The transcript product of a structural gene that encodes an amino acid sequence.
tRNA                         Transfer RNA molecules recognize codons of mRNA and facilitate incorporation of each successive amino acid during protein synthesis.
rRNA                         Integral component of the ribosomal machinery used for translating the DNA transcript into protein.
Small RNA (example: snRNA)   Many small RNA molecules exist, and each has different functions in RNA modification. For example, snRNA assists with splicing intron transcripts out of precursor mRNA.
microRNA                     A newly described species of RNA involved in gene regulation.

Transcription is the first process in the cascade of events that lead from the genetic code contained in DNA to synthesis of a specific protein. The product of gene transcription is ribonucleic acid (RNA). The structure of RNA is similar to DNA, with three exceptions. First, the ribose sugar of RNA has two hydroxyl groups, at the 2′ and 3′ carbons. Second, the base uracil (U) replaces thymine (T). And third, most RNA molecules are single rather than double stranded. There are five general types of RNA (Table 5.1). Ultimately, the specific type of RNA that results from the transcription of a structural gene is messenger RNA (mRNA). Transcription of DNA


into RNA is catalyzed by RNA polymerase. RNA polymerase consists of multiple subunits that work together to recognize where the transcriptional complex should assemble, synthesize the single-stranded RNA transcript, and dissociate from the DNA template once synthesis is complete. Under the influence of the gene promoter, various transcription factors are attracted to the upstream (5′) end of the gene. The transcription factors recruit the RNA polymerase and initiate transcription of the coding region of a gene into RNA. The RNA polymerase complex reads the DNA template (antisense strand) in the 3′ to 5′ direction while elongating the RNA product in the 5′ to 3′ direction. Elongation ceases when the RNA polymerase complex recognizes the DNA terminator sequence and dissociates from the primary mRNA transcript and the double-stranded DNA. The mRNA transcript is complementary to the antisense strand and a replicate (with the exception of uracil replacing thymine) of the sense strand (Figure 5.4A).

Once the primary RNA sequence has been synthesized, the RNA transcript requires modification for stability and translational efficiency. The primary transcript (precursor mRNA or pre-mRNA) contains both the coding (exon) and noncoding (intron) sequences, and the intron material has to be removed prior to translation and protein synthesis. Sequences flanking the exons, the donor and acceptor splice sites, recruit a series of proteins that remove the introns from the transcript and splice the exons together to form the mature mRNA (Figure 5.4B). Additional post-transcriptional modification includes the attachment of a 7-methylguanosine CAP to the 5′ end of the mRNA and the addition of a polyA tail that consists of a variable number (usually 80–250) of adenine nucleotides at the 3′ end of the mRNA. Both the CAP and the polyA tail are thought to help stabilize the mRNA molecule and assist with its transport out of the nucleus into the cytoplasm, and they may also help to regulate translation of mRNA into protein. Once splicing has occurred and the CAP and polyA tail have been added, RNA modification is complete and the mature mRNA transcript is exported to the cytoplasm for translation into protein (Figure 5.4B).

Protein Translation

After the mature mRNA transcript is transported to the cytoplasm, it is translated into protein by the ribosomes in the endoplasmic reticulum. The ribosomal machinery comprises many units that act in concert to translate the DNA transcript (mRNA) and synthesize protein, and the process is elegantly complex. Ribosomes consist of two multiprotein subunits, each with an RNA component (rRNA) and several active centers.

Recall that mature mRNA essentially represents only the exonic or coding regions of a given gene. The base sequence within these coding regions is grouped into informational units of three bases, called codons. Each codon either codes for a specific amino acid or serves a regulatory function, such as stopping or starting protein chain synthesis. To initiate the process of translating mRNA into protein, the small ribosome subunit binds to mature mRNA at the CAP site and scans the mRNA sequence for its start codon, which is AUG. After the AUG codon is recognized, the large ribosome subunit binds a specific aminoacyl-tRNA, Met-tRNA, and the process of protein synthesis begins. An aminoacyl-tRNA (referred to as a charged tRNA) is an RNA molecule that carries a specific amino acid and has an anticodon complementary to the corresponding mRNA codon. The specific amino acid that each charged tRNA carries is determined by the mRNA codon and is associated with the tRNA anticodon. As the ribosome translocates itself along mRNA in a 5′ to 3′ direction, it catalyzes the successive binding of charged tRNAs to their associated mRNA codons. The ribosome catalyzes the chemical joining of amino acids together by creating peptide bonds between the amino and carboxyl groups of each successively added amino acid (Figure 5.4C). It is this flow of genetic information (DNA transcription to RNA and RNA translation to protein) that is termed the central dogma (or paradigm) of molecular biology.

MOLECULAR PATHOLOGY AND DNA REPAIR MECHANISMS

Mutation and Genetic Variation

There is no single sequence of the human genome. Although the entire genome sequence from any given human is approximately 99.9% identical to the genome sequence of any other individual human, there are on the order of 3 million sequence variations between any two unrelated persons. It is the similarity of the genomes between individuals that defines them as human beings, and it is the differences that distinguish individuals. Although the majority of the sequence differences between individuals likely have no biological importance and do not contribute to physiological or observable differences, many clearly do have subtle effects and give rise to the remarkable diversity of the human race.

A large number of genetic variations occur at measurable frequencies in the population. Such variations are termed polymorphisms. Although often used to denote a nonpathogenic variation, the strict definition of the term polymorphism is a variation that is present at a frequency of 1% or greater in the population. The most common type of sequence variation is a difference between single nucleotides at a particular place in the genome, or locus. For example, at a certain position, one individual may have a thymine residue, whereas another may have a cytosine. This type of variation is termed a single nucleotide polymorphism (SNP). To date, over 10 million different SNPs have been characterized, and they are emerging as extremely useful tools for understanding genetic diversity and localizing disease genes. Another type of polymorphism involves not the substitution of one nucleotide for another, but variation in the number of copies of a string of nucleotides. One of the most common of this type of variation is the variation of the number of

[Figure 5.4 panels.
Panel A. Sense strand: 5′ AGCAGTCATTATGGCGAACCTTGGCTGCTGGATGCTGGTTCTC 3′. Antisense strand: 3′ TCGTCAGTAATACCGCTTGGAACCGACGACCTACGACCAAGAG 5′. Labels: promoter, RNA polymerase, DNA, terminator, direction of transcription, mRNA (5′ end).
Panel B. Precursor mRNA with exons and introns, CAP, and poly-A tail; introns excised; exons spliced together; mature mRNA exported from the nucleus to the cytoplasm.
Panel C. mRNA: 5′ AGCAGUCAUUAUGGCGAACCUUGGCUGCUGGAUGCUGGUUCUC AAAA 3′. Labels: untranslated region, start codon, codon, messenger RNA, anticodon; growing chain Met, Ala, Asn, Leu; protein: human prion protein at pH 7.0, Entrez structure 1HJN.]
Figure 5.4 Panel A: Top: Two strands of DNA illustrating the complementary bases that link to form the double-stranded DNA molecule. Bottom: Schematic of DNA transcription with initiation at the promoter, elongation of the mRNA product in the 5′ to 3′ direction as the RNA polymerase translocates along the template, and termination at the terminator sequence. Panel B: Processing of mRNA illustrating the addition of the CAP and poly-A tail, splicing out of intron sequences, and transport of the mature mRNA molecule from the nucleus to the cytoplasm. Panel C: Using the human prion protein as an example, this schematic illustrates the translation of mature mRNA to protein. Translation begins at the start codon (AUG), with the mRNA read in the 5′ to 3′ direction. The structure of the charged tRNA molecule can be appreciated as each complementary tRNA anticodon recognizes its corresponding mRNA codon. The final result is a representation of the folded human prion protein as referenced in Entrez Structure http://www.ncbi.nim.nih.gov/sites/entrez?dp=structure (October, 2008).
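
The flow from DNA to mRNA to protein described in this section can be traced with the sequences printed in Figure 5.4. The Python sketch below is illustrative only. The function names `transcribe` and `translate` are hypothetical, the strand strings are copied by hand from the figure, and the model assumes the standard genetic code while ignoring splicing, capping, polyadenylation, and tRNA charging.

```python
# Standard genetic code, built from the conventional compact listing in
# which each codon position cycles through U, C, A, G. "*" marks a stop.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Strands as printed in Figure 5.4A.
SENSE_5_TO_3 = "AGCAGTCATTATGGCGAACCTTGGCTGCTGGATGCTGGTTCTC"
ANTISENSE_3_TO_5 = "TCGTCAGTAATACCGCTTGGAACCGACGACCTACGACCAAGAG"

# RNA base paired with each DNA template base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the mRNA (5'->3') by pairing against a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna_5_to_3):
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    start = mrna_5_to_3.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for pos in range(start, len(mrna_5_to_3) - 2, 3):
        amino_acid = CODON_TABLE[mrna_5_to_3[pos:pos + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    mrna = transcribe(ANTISENSE_3_TO_5)
    print(mrna)             # sense strand with U in place of T
    print(translate(mrna))  # MANLGCWMLVL
```

Under these assumptions the template strand yields an mRNA identical to the sense strand with U in place of T, and translation from the first AUG gives Met-Ala-Asn-Leu followed by Gly-Cys-Trp-Met-Leu-Val-Leu, which agrees with the first four residues labeled in panel C. The ten bases before the AUG are left untranslated, matching the untranslated region labeled in the figure.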


copies of a repetitive sequence at a given locus. When the length of the repetitive unit is small (1 to tens of nucleotides), this type of polymorphism is termed a short tandem repeat, or STR. When the repetitive units are longer, hundreds to thousands of nucleotides, they are termed variable number of tandem repeats, or VNTRs. STRs are a very common source of genetic differences between individuals and have been important in gene mapping studies. Currently, due to the high rate of heterozygosity, forensic laboratories utilize STR analysis extensively. Another type of polymorphism that has recently become appreciated through the use of comparative genome hybridization microarrays (CGH arrays) involves the deletions and duplications of regions of the genome. These regions can be quite large, up to several million bases in length, and may include genes. The role of these copy number variants in human variation and disease is not yet understood.

A genetic mutation is a sequence variant that has a pathogenic effect. Some mutations are relatively common in the population and meet the 1% population frequency criterion to be formally termed polymorphic; the cystic fibrosis mutation deltaF508 in the Northern European population and the sickle cell anemia mutation in African populations are examples. Some mutations are very rare in the population. Not infrequently, mutations are found that affect a single family. These very rare mutations are termed private mutations. Pathogenic (disease-causing) mutations often involve changes in the base sequence that composes a codon (or coding unit). However, mutations can occur in regulatory elements such as splice sites and promoter regions as well. A mutation occurring in the portion of DNA that codes for a protein can result in (i) a change of one amino acid to another, (ii) a change in an amino acid codon to one coding for a termination signal (stop), or (iii) no change in the amino acid at that position. These types of changes are termed missense, nonsense, and silent mutations, respectively. A missense change can result in no change in the function of the protein, a total loss of function, a partial loss of function, or a change of function. Partial or total loss of function usually results in a pathological state, as does a change in function. The pathogenic effect of a loss of function of a gene product can be direct, such as the loss of chloride channel function that causes cystic fibrosis, or indirect, such as the loss of function of regulators of gene expression that can result in cancers. Examples of mutations that cause changes in function include those that cause constitutive activation of a function that is normally under regulation by the cell.

It is important to note that not all losses (even complete losses) of protein function lead to an abnormal phenotype or disease. For example, the common (10% allele frequency in the Caucasian population) 32 base-pair deletion of the CCR5 gene (CC motif chemokine receptor 5) has been associated with reduced susceptibility to infection with HIV, but even homozygotes (approximately 1% of the Caucasian population), with no CCR5 protein, do not demonstrate any observable effects on normal physiology.

Other types of mutations that can occur include deletions and insertions of nucleotides in and surrounding coding regions. These types of mutations, often abbreviated indels, can be small, from one to a few dozen bases, or large, covering large segments of chromosomes and including multiple genes. Since codons consist of a trio of bases, if a small indel within the coding region of a protein involves a number of bases that is divisible by three, the indel is said to be in-frame, as it will not shift the reading frame of the mRNA being translated into protein in the ribosome. If, on the other hand, the number of bases is not divisible by three, the indel will alter the reading frame of the protein and will typically result in the ribosome encountering a stop codon within a few dozen bases. Larger indels can involve whole exons, multiple exons, whole genes, or even multiple genes.

Abnormal expression of genes can also result from changes to the chemical structure of genes that are not a result of a change of the DNA sequence. Methylation of the nucleotide bases is a postsynthetic modification to DNA that affects the expression of genes. Abnormal patterns of DNA methylation can cause abnormal gene expression (transcription) and disease states. Repeat base sequences that have no apparent informational content regarding protein structure exist throughout the DNA. The expansion of the number of repeats in a gene has been associated with specific diseases, and this change in the gene structure is heritable.

DNA Replication

DNA is synthesized as part of the DNA replication process that occurs during the S phase of the mitotic cell cycle and the first phase of meiosis. The DNA replication process involves multiple specialized enzymes (Table 5.2) that work together to synthesize two double-stranded daughter molecules from one double-stranded parent molecule. DNA polymerase synthesizes the daughter strands. The replication fork is the site at which

Table 5.2 DNA Replication Enzymes

Replication Enzymes                    Function
Helicase                               Breaks hydrogen bonds linking the two strands of the DNA double helix.
Topoisomerase                          Mitigates the supercoiling effect that occurs in advance of the replication fork.
Single-strand binding protein (SSBP)   Acts as a retractor, preventing the single strands of the DNA double helix from rejoining.
RNA primase                            Synthesizes the RNA primer that is required to initiate synthesis of the new daughter strands.
DNA polymerase                         Synthesizes DNA daughter strands. Certain DNA polymerases also act as part of the DNA repair machinery.
DNA ligase                             Links newly synthesized DNA fragments (Okazaki fragments).
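
The coding-region mutation classes described in this section (missense, nonsense, and silent substitutions, and in-frame versus frameshift indels) reduce to simple rules on codons. The Python sketch below is a minimal illustration under stated assumptions: it uses the standard genetic code, the function names are hypothetical, the example codons are chosen only for illustration, and the reference codon is assumed to encode an amino acid rather than a stop. It says nothing about whether a given change is pathogenic.

```python
# Standard genetic code over DNA codons, built from the conventional
# compact listing (positions cycle through T, C, A, G; "*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a substitution within one codon.

    Assumption: ref_codon encodes an amino acid (it is not a stop codon).
    """
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "silent"     # same amino acid
    if alt_aa == "*":
        return "nonsense"   # amino acid codon becomes a stop codon
    return "missense"       # one amino acid replaced by another


def indel_is_in_frame(n_bases):
    """An indel preserves the reading frame only if its length is a multiple of 3."""
    return n_bases % 3 == 0


if __name__ == "__main__":
    print(classify_substitution("GAG", "GAA"))  # silent   (Glu -> Glu)
    print(classify_substitution("GAG", "GTG"))  # missense (Glu -> Val)
    print(classify_substitution("GAG", "TAG"))  # nonsense (Glu -> stop)
    print(indel_is_in_frame(3))                 # True
    print(indel_is_in_frame(32))                # False
```

The last call uses the 32 base-pair length of the CCR5 deletion mentioned in the text: 32 is not divisible by three, so by the rule above a deletion of that size within a coding region shifts the reading frame.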


double-stranded DNA is separated and DNA polymerase synthesizes the new daughter strands. Ahead of the replication fork is the parent double-stranded DNA; behind the replication fork are the newly synthesized daughter strands. There are no DNA polymerases that synthesize in the 3′ to 5′ direction. Thus, the DNA replication process is referred to as semidiscontinuous because, while one of the two new daughter strands is able to be replicated continuously in the 5′ to 3′ direction (leading strand), the other strand (lagging strand) must be copied in short 5′ to 3′ segments (Okazaki fragments) that are 100–1000 nucleotides in length. Okazaki fragments are joined together by the action of a ligase enzyme to complete the lagging strand. The fidelity of the replication is estimated to approach 99.98%. In the rare case that DNA replication incorporates an incorrect base, DNA proofreading and repair systems work to correct the error and prevent detrimental consequences.

DNA Repair

In the broadest sense, DNA repair mechanisms work to correct, or in some way mitigate, the effects of DNA replication inaccuracy and exogenous or endogenous genetic insult. Generally, when the integrity of wild-type DNA is compromised, the error is corrected or overlooked, or programmed cell death occurs. A number of DNA repair pathways are known and can be roughly characterized into the following functional categories: (i) direct reversal, (ii) excision repair, and (iii) DNA double-strand break repair. A brief description of each follows. Although they are often studied separately, it is impossible to completely separate one from another because the various mechanisms are highly interconnected and act cooperatively as part of a large cellular arsenal with the common goal of genome integrity maintenance.

Direct Reversal of DNA Damage

Correction of DNA damage by direct reversal is a type of DNA repair that predominantly involves action by a single enzyme repair system. Consider the enzymatic photoreactivation (EPR) reaction that works to repair damage induced by ultraviolet (UV) light. The formation of pyrimidine dimers (most commonly thymine dimers) is one type of pathologic cellular response to excess UV exposure. When present, the bulky pyrimidine dimers impede the DNA replication and transcription processes. In a relatively simple light-dependent reaction, DNA photolyase acts to restore the pyrimidines to their correct monomer conformation. Direct reversal by DNA photolyase is not the only way the cell responds to UV damage. In fact, cellular response to UV damage also commonly involves one or more of the excision repair mechanisms.

Excision Repair of DNA Damage

Correction of DNA damage by excision repair involves groups of proteins that act together to excise the incorrect base(s) or nucleotide(s), replace them with the correct sequence, and ligate the corrected strand back together. There are three DNA excision repair systems: mismatch repair (MMR), base excision repair (BER), and nucleotide excision repair (NER). Generally, these excision repair mechanisms can be distinguished by considering the context within which the error occurs, whether removal involves a base(s) or nucleotide(s), and the number of base(s)/nucleotide(s) removed. Each type of excision repair system also invokes the use of unique proteins. For example, the protein complexes that work to recognize DNA mismatch in the MMR mechanism (MSH2/MSH6 and MSH2/MSH3) are different from the protein complexes (XPC-RAD23B and UV-DDB) that recognize and invoke global genome nucleotide excision repair (GG-NER).

Mismatch Repair of DNA Damage. DNA replication inaccuracy is the context within which the mismatch repair (MMR) pathway preserves genomic integrity. The primary purpose of MMR is to prevent mutations accrued during the DNA replication process from propagating and becoming the start of a mutant lineage by recognizing and excising the mismatched nucleotide, resynthesizing DNA, and ligating the broken strand back together. Germline mutations in genes coding for MMR proteins are associated with Lynch syndrome.

Base Excision Repair of DNA Damage. Base excision repair (BER) involves the excision of a single base rather than the nucleotide. It is most commonly used to repair damage caused by endogenous DNA insult and is especially important for cellular response to oxidative DNA damage. BER involves removing the base from the deoxyribose-phosphate chain by a specific glycosylase, endonuclease action, DNA polymerase beta, and either DNA ligase I or the DNA ligase III/XRCC1 complex.

Nucleotide Excision Repair of DNA Damage. Nucleotide excision repair (NER) is predominantly invoked in response to genomic damage caused by UV exposure. NER involves the excision of an oligonucleotide, rather than a single base (BER) or single nucleotide (MMR). It is also a substantially more complex process that includes at least 30 different proteins. Two subpathways of NER, termed global genome repair NER (GG-NER) and transcription-coupled NER (TC-NER), have been recognized. Typically, GG-NER is used when errors occur in nontranscribed areas of the genome, and TC-NER, as the name implies, corrects errors that occur in areas of active gene expression. Mutations in NER genes are associated with disorders such as xeroderma pigmentosum (XP) and Cockayne syndrome.

DNA Double-Strand Repair of DNA Damage

DNA double-strand repair is an important DNA repair mechanism that uses a number of proteins, many of which are similar to or the same as those used during meiotic recombination. DNA double-strand breaks (DSBs) can result from a number of exogenous and endogenous agents including ionizing radiation


exposure, chemical exposure, and somatic DNA recombination or transposition events. Nonhomologous end-joining (NHEJ) and homologous recombination (HR) are the two primary DNA double-strand repair mechanisms. In addition to its role as a DNA repair mechanism, NHEJ helps to maintain B-cell and T-cell diversity, and consequently a healthy immune system, by repairing the intentional breaks created during V(D)J recombination. Short homologous sequences (microhomologies) found on the single-stranded tails of the broken DNA are used to help rejoin the strands in NHEJ, whereas HR relies on homologous (or very close to homologous) sequence to repair the broken strands. HR is typically used when DNA replication is halted due to a single-strand break or another unrepaired lesion that causes collapse of the replication fork. Because it uses a homologous or near homologous template, HR is often thought to be more accurate than its NHEJ counterpart. However, both mechanisms show high accuracy, as well as imperfection.

MODES OF INHERITANCE

A detailed family history provides the foundation for genetic diagnosis and risk assessment. Visually recorded using standardized symbols and nomenclature [1], the pedigree provides the tool by which inheritance patterns are elucidated and subsequent risk assessment is calculated. In addition to diagnosis and risk assessment, the pedigree is a powerful research tool, aiding in the discovery of new genes and helping to better understand the phenotypic expression of genes already discovered.

Observations made from controlled monohybrid and dihybrid crosses of pea plants in the 1860s formed the foundation for Gregor Mendel’s landmark laws of heredity that still govern basic pedigree interpretation today. Since that time, much has been learned and the study of inheritance is now far more complex than Mendel himself may have imagined. This section of the chapter reviews modes of inheritance and factors that may influence pedigree interpretation.

Mendelian Inheritance
The concepts that two copies of a gene segregate from each other (law of segregation) and are transmitted unaltered (particulate theory of inheritance) from parents to their offspring help to explain the concepts of dominant and recessive traits. When the presence of one copy of a particular allele results in phenotypic expression of a particular trait, the trait is dominant. When two copies of a particular allele must be present for the phenotypic expression of a trait, the trait is recessive. Note that it is the phenotypic expression that is described as dominant or recessive, not the allele or gene itself. Thus, patterns of inheritance are distinguished by where the gene resides within the genome (autosome or sex chromosome) and whether phenotypic expression occurs in the heterozygous or homozygous state. Traditionally recognized Mendelian patterns of inheritance include autosomal dominant, autosomal recessive, X-linked recessive, X-linked dominant, and Y-linked (holandric). When each of these inheritance patterns is represented in a pedigree diagram, distinguishing features can be visually recognized (Table 5.3).

Autosomal Dominant Inheritance
Autosomal dominant inheritance is designated when no difference in phenotypic expression is observed between heterozygous and homozygous genotypes. Visually, the autosomal dominant pedigree shows multiple affected generations in a vertical pattern, an equal distribution of affected males and females, and transmission of the phenotype by both males and females (including males transmitting the phenotype to other males).

Typically, dominant disorders occur when a mutation confers an inappropriate activity on a gene product. Examples include Huntington disease and other polyglutamine disorders, where expansion of a triplet repeat within a polyglutamine tract causes cellular toxicity, and familial amyloidosis, where mutant transthyretin protein is relatively unstable and deposits as amyloid in tissues.

However, some dominant disorders, such as those involved in the majority of the inherited cancer syndromes, occur with inheritance of a single copy of a gene where the mutant copy has not acquired a novel, pathogenic function, but is inactivated. Further, the great majority of cells and tissues carrying single copies of these mutant genes are functionally normal. The resolution of this apparent paradox came when, after observing familial cases of bilateral retinoblastoma (RB) and comparing those cases with sporadic (nonfamilial) cases of unilateral RB, Knudson proposed that two hits or mutational events were needed for the initiation of tumor growth. In the case of familial RB, a germline mutation in one tumor suppressor allele was postulated, and there was a much higher probability of tumor initiation because the individual was born with one hit (mutation). Therefore, somatic mutations that hit or render nonfunctional the remaining normal allele would be tumorigenic. In contrast, in a normal individual (one not carrying a mutant RB gene in the germline), that same somatic event would not lead to the initiation of a tumor because one functional allele would remain. Tumor initiation takes place in a normal individual only if two somatic events occur at the same locus. As tumor initiation is not observed with a single abnormal allele, it is said that tumor suppressor genes act as recessive alleles at the cellular level, but as dominant disorders at the organism level. Additional examples of dominant disorders associated with inactivating mutations in tumor suppressor genes include Lynch syndrome (inactivation of one of the mismatch repair genes MLH1, MSH2, or MSH6) and familial breast cancer (inactivation of BRCA1 or BRCA2).

Autosomal Recessive Inheritance
Autosomal recessive inheritance is designated when phenotypic expression is observed only when both copies of a gene are inactivated or mutated. Visually,

Table 5.3 Mendelian Inheritance Patterns
(The Example Pedigree column of the original table contains pedigree diagrams that are not reproduced here.)

Inheritance Pattern     Clinical Examples
Autosomal dominant      Huntington disease; myotonic dystrophy; retinoblastoma; Lynch syndrome; neurofibromatosis I; TTR-associated amyloidosis; and many others
Autosomal recessive     Cystic fibrosis; galactosemia; autosomal recessive (AR) deafness; AR epidermolysis bullosa; Tay-Sachs disease; Klippel-Feil syndrome; and many others
X-linked recessive      Duchenne muscular dystrophy; hemophilia A; X-linked ichthyosis; X-linked mental retardation; Opitz syndrome; Emery-Dreifuss muscular dystrophy; and many others
X-linked dominant       Vitamin D-resistant rickets; Coffin-Lowry syndrome; and others
Y-linked (holandric)    Hairy ears; Y-linked deafness; very few others
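
The autosomal patterns summarized in Table 5.3 follow directly from the law of segregation, and the expected offspring proportions can be worked out by enumerating the four equally likely allele pairings of a cross. The following Python sketch is illustrative only: the allele labels "N" (normal) and "m" (mutant), the function names, and the assumption of full penetrance are conventions introduced here, not part of the chapter.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return {genotype: probability} for offspring of a single-locus cross.

    Each parent is a two-character string of alleles, for example "Nm".
    Hypothetical labels: "N" = normal allele, "m" = mutant allele.
    """
    # Each parent transmits one of its two alleles with equal probability
    # (law of segregation), so the four allele pairings are equally likely.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def affected_fraction(offspring, dominant):
    """Probability that a child is affected, assuming full penetrance."""
    # Dominant trait: one mutant copy suffices. Recessive trait: two are needed.
    affected = {"Nm", "mm"} if dominant else {"mm"}
    return sum((p for g, p in offspring.items() if g in affected), Fraction(0))


# Autosomal recessive: two heterozygous (carrier) parents.
print(affected_fraction(cross("Nm", "Nm"), dominant=False))  # 1/4

# Autosomal dominant: one affected heterozygous parent, one unaffected parent.
print(affected_fraction(cross("Nm", "NN"), dominant=True))   # 1/2
```

Under the same assumptions, `affected_fraction(cross("Nm", "Nm"), dominant=True)` gives 3/4, the risk when both parents are heterozygous for the same autosomal dominant condition. The sketch does not model X-linked or Y-linked patterns, reduced penetrance, or new mutations.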

the autosomal recessive pedigree typically shows a horizontal pattern where multiple affected individuals can be observed within the same sibship, and an equal number of males and females are affected. In instances of autosomal recessive inheritance, each parent of an affected individual has a heterozygous genotype composed of one copy of the mutated gene and one copy of the normal/functional gene. When a pedigree is analyzed, individuals who must be genetic carriers of the disorder in question, such as parents of an affected child, are termed obligate carriers. Other individuals in the pedigree may be at risk for being carriers. The risk to be a carrier is defined by each individual’s position in the pedigree relative to affected individuals or known carriers. For example, a sibling of an obligate carrier of sickle cell anemia has a 50% probability of being a sickle cell carrier, while a first cousin of a cystic fibrosis patient has a 25% chance of being a carrier.

Typically in recessive disorders, having only a single copy of a mutant gene is insufficient for manifestation of disease. Alternatively stated, one normal copy is enough for normal cellular and tissue function. Examples of recessive disorders include enzyme deficiencies, such as galactosemia (galactose-1-phosphate uridyl transferase) or phenylketonuria (phenylalanine hydroxylase), or deficiency of transport proteins, such as cystic fibrosis (CFTR). Consanguineous mating, or mating between related individuals, increases the risk of autosomal recessive phenotypic expression for certain genes because the proportion of genes shared between the parents is increased.

X-Linked Recessive Inheritance
X-linked recessive inheritance is designated when phenotypic expression is observed predominantly in males of unaffected, heterozygous mothers. All female offspring of affected males are obligate carriers. Visually, the pedigree typically shows a horizontal pattern of affected individuals with no instance of direct male-to-male transmission. However, males may transmit the disorder to a grandson through carrier daughters.

It is not uncommon for X-linked recessive disorders to appear in a family such that before a certain generation the disease is not apparent, but is observed to be segregating in the family after that generation. This phenomenon is due to new mutations appearing de novo in an individual. This was explained by the British geneticist Haldane, and his theory is referred to as the Haldane hypothesis. If the reproductive fitness of a male affected with an X-linked recessive disorder is low or nil, then in a population one-third of all affected X chromosomes (the share carried by males) will be removed from the gene pool every generation. An example of decreased reproductive fitness among males is Duchenne muscular dystrophy. If the incidence of the disease is constant, then one-third of cases must be due to mutations arising de novo in a family.

No doubt the most famous family to be afflicted with an X-linked recessive condition is the House of Saxe-Coburg and Gotha, the British Royal family. Queen Victoria, apparently a carrier of a new hemophilia A mutation, had one affected son, Prince Leopold, and two daughters who were carriers of the disease. The daughter of Princess Alice, Princess Alix, was married to Tsar Nicholas II of Russia and was the mother of the affected Tsarevich Alexei. The current royal family, the House of Windsor, is descended from Queen Victoria through an unaffected male, King Edward VII, so that branch of the family does not carry the hemophilia A mutation.

X-Linked Dominant Inheritance
X-linked dominant inheritance is designated when phenotypic expression is observed predominantly in females (ratio of about 2:1), all daughters of affected males are affected, and none of the sons of affected males are affected. Visually, the pedigree typically shows a vertical pattern of affected individuals, with no instance of direct male-to-male transmission. X-linked dominant conditions are substantially less common than X-linked recessive disorders. An example is X-linked, vitamin D-resistant rickets, which is caused by mutations in the PHEX (phosphate-regulating endopeptidase homolog, X-linked) gene located at Xp22.

X-Linked Dominant Male Lethal Inheritance
X-linked dominant male lethal inheritance is designated when phenotypic expression is observed only in females. Visually, the pedigree typically shows a vertical pattern with an increased rate of spontaneous abortion, in which approximately 50% of the daughters of affected mothers are also affected. Although the great majority of cases of Rett syndrome are sporadic rather than familial, familial cases have been described and would be classified as an X-linked dominant male lethal disorder. Rett syndrome is caused by mutations in the MECP2 gene (methyl CpG binding protein 2). Rett syndrome is a neurodevelopmental disorder characterized by arrested development between 6 and 18 months of age, regression of acquired skills, loss of speech, stereotypical hand movements, microcephaly, seizures, and mental retardation. Affected males rarely survive to term, and the majority of affected females do not reproduce.

Y-Linked or Holandric Inheritance
Y-linked or holandric inheritance is designated when phenotypic expression is observed only in males with a Y chromosome. Visually, the pedigree shows only male-to-male transmission. Hairy ears are an example of a Y-linked trait. Few disease states have been shown to be Y-linked. However, there is one report of a multigenerational Chinese family with Y-linked deafness.

Non-Mendelian Inheritance

Epigenetic Inheritance: Imprinting
When the phenotypic expression of a gene is essentially silenced depending on the gender of the transmitting parent, the gene is referred to as imprinted. The phenomenon of imprinting renders the affected genes functionally haploid. In the case of imprinted
genes, the functional haploid state disadvantages the imprinted gene because the gene is more susceptible to adverse effects of uniparental disomy, recessive mutations, and epigenetic defects (such as DNA methylation-dependent gene silencing). Visually, pedigrees that represent imprinting may appear similar to autosomal recessive or sporadic pedigrees and show a horizontal pattern. Imprinting disorders may also appear autosomal dominant and show a grandparental effect in the case of imprinting center mutations. Males and females are equally affected, and transmission is dependent on the gender of a parent. The two most well-known imprinting disorders are Prader-Willi and Angelman syndromes. Prader-Willi syndrome (PWS) is caused by an absence of the paternally contributed 15q11–13 (PWS/AS) region, whereas Angelman syndrome is caused by an absence of maternal contribution at the same locus. In the case of PWS, lack of paternally contributed genes at 15q11–13 (regardless of mechanism) results in loss of expression of the genes in this region that are normally active only on the paternal chromosome, because the maternal copies are methylated and silenced. An analogous mechanism applies to Angelman syndrome; however, it is a lack of maternally contributed genes that causes the phenotype in this instance. The clinical phenotype of each disorder is distinct, but both are associated with mental retardation.

Inheritance Through Mitochondrial DNA
The inheritance of mitochondrial disease is complicated by the fact that mitochondrial disease can be either the result of mutations in nuclear DNA (nDNA), and thereby subject to the Mendelian forms of inheritance described previously, or the result of mutations in organelle-specific mitochondrial DNA (mtDNA). Since the mitochondrial genome is maternally inherited, pedigrees demonstrating mitochondrial inheritance show an affected mother with all of her offspring (male and female) affected. The common phenomenon of heteroplasmy, where mtDNA mutations are present in only a portion of the mitochondria within a cell, can make laboratory analysis and clinical assessment difficult. It is estimated that only 10–25% of all mitochondrial disease is the result of maternally inherited mutations in the mitochondrial genome. Therefore, mitochondrial disease should not always be equated with mitochondrial inheritance.

Multifactorial Inheritance
Sorting out whether a particular phenotype is predominantly the result of inherited genetic variation, environmental influence, or some combination thereof can be difficult. When the combined effects of both inherited and environmental factors cause disease, the disorder is said to exhibit multifactorial inheritance. Multifactorial inheritance is associated with most, if not all, cases of complex, common disease (cancer, heart disease, asthma, autism, mental illness, and others). Typically, multiple loci or multiple genes are associated with the same complex disease phenotype. Such genetic heterogeneity works additively, such that the net effect of multiple mutations in multiple genes exacerbates and/or detracts from a particular clinical phenotype.

Sporadic Inheritance
Sporadic inheritance, where only one isolated case occurs within a family, is the most common pedigree pattern observed in clinical practice. Chromosomal abnormalities and new dominant mutations typically demonstrate sporadic inheritance. It is easy to imagine how autosomal recessive and X-linked recessive disorders can often appear sporadic, especially in situations where family size is small or clinical knowledge about extended family is limited. Thus, both Mendelian and non-Mendelian explanations for sporadic inheritance, each with its own recurrence risks, can apply. As a result, clinicians tend to refer to isolated cases as apparently sporadic rather than absolutely sporadic. Since noninherited disorders are associated with virtually negligible recurrence risk as compared to those exhibiting Mendelian inheritance, those associated with chromosomal abnormalities, and those associated with new dominant mutations, it is important to make every effort to distinguish apparently sporadic cases from truly sporadic ones. However, it is often not possible to make this determination, and recurrence risk can be narrowed only to a broad range encompassing all possibilities.

Differences in Phenotypic Expression Can Complicate Pedigree Analysis
The occurrence of reduced penetrance, variable expressivity, anticipation, and gender influence or limitation can confound pedigree analysis. The clinical subtlety and nuance associated with each phenomenon can impact recognition of the correct inheritance pattern (usually autosomal dominant, but not always) and result in overlooked or even incorrect diagnoses. Further, accurate recurrence risk is dependent on correct diagnosis and pedigree assessment.

Genetic Penetrance
The penetrance of a genetic disorder is measured by evaluating how often a particular phenotype occurs given a particular genotype or vice versa. Some disorders show 100% penetrance, where all individuals with a particular genotype express disease, while others show reduced penetrance, such that a proportion of individuals with a particular genotype never develop any features (even mild) of the associated clinical phenotype. Thus, penetrance is the probability that any phenotypic effects resulting from a particular genotype will occur. Certain factors are known to influence the gene penetrance for specific disorders. For example, expression of a particular phenotype may be modified by age, termed age-related penetrance. Sometimes, as age increases, penetrance increases. For example, only 25% of individuals with a specific Huntington disease genotype (41 repeats) exhibit symptoms at age 50, while 75% exhibit symptoms at age 65. Although less common,
penetrance can also decrease with age. Gender-related penetrance has been observed in cases of hereditary hemochromatosis (HH), where some females with a particular HH genotype show no evidence of iron accumulation, in contrast with their affected male siblings who are known to have the identical genotype. Reduced penetrance can sometimes obscure an autosomal dominant inheritance pattern because, while some family members may have affected offspring, they themselves are not affected due to reduced penetrance of the disorder.

Sex-Influenced Disorders
Sex-influenced disorders are disorders that demonstrate gender-related penetrance. When phenotypic expression is more likely in one gender than the other, the disorder is said to be sex-influenced. BRCA2-related hereditary breast/ovarian cancer (HBOC) and APOE4-associated late-onset familial Alzheimer disease are sex-influenced disorders. BRCA2-related HBOC is an autosomal dominant disorder associated predominantly with increased risk for breast and/or ovarian cancer. Although these are less common than breast or ovarian cancer, BRCA2 carriers may also be at increased risk for several other cancers, including neoplasms of the skin, prostate, pancreas, larynx, esophagus, colon, stomach, gallbladder, bile duct, and hematopoietic system. In cases of HBOC caused by BRCA2 mutation, about 6% of males as opposed to 86% of females are expected to develop breast cancer by age 70. With respect to APOE4-associated late-onset familial Alzheimer disease, women who are heterozygous for APOE4 alleles are at 2-fold increased risk to develop the disease as compared to males with the same genotype.

Sex-Limited Disorders
Sex-limited disorders refer to autosomal disorders that are nonpenetrant for a particular gender. Male-limited precocious puberty is one example. Males heterozygous for mutations in the LCGR gene located on chromosome 2 exhibit this phenotype, but females with the same genotype do not. Very few sex-limited disorders have been documented.

Variable Expressivity
Variable expressivity refers to the difference in severity of disease among affected individuals, both related and unrelated. It is important to note that variable expressivity occurs even between related individuals with the same genotype. Variable expressivity is distinct from penetrance because it describes the degree to which an individual is affected, not whether the individual is affected at all. The majority of inherited disease demonstrates some degree of variable expressivity. Variable expressivity can complicate pedigree analysis because individuals with subtle clinical manifestations can be mistaken for unaffected individuals. Neurofibromatosis type I is an autosomal dominant neurocutaneous disorder that affects 1/3000 individuals. There is a high degree of variable expressivity observed among family members who carry the same NF1 mutation. Some affected individuals with the same mutation may show only a few café-au-lait macules of the skin, while others may be more severely affected with large invasive plexiform neurofibromas and hundreds of cutaneous and subcutaneous neurofibromas.

Pleiotropy
Pleiotropy refers to disorders where multiple, seemingly unrelated organ systems are affected. For example, one individual in a pedigree may exhibit cardiac arrhythmia, whereas another individual with the same disorder, in either the same or a different pedigree, shows muscle weakness and deafness. Since the manifestations of disease are so vastly and usually inexplicably different, disorders that show a high degree of pleiotropy are often difficult to diagnose. As a group, mitochondrial disorders typically show a high degree of pleiotropy, as any organ system can be affected, to almost any degree, with any age of onset.

Anticipation
A disorder shows anticipation when an earlier age of onset or increased disease severity occurs in successive generations. Anticipation is predominantly associated with neurodegenerative trinucleotide repeat disorders (spinocerebellar ataxias, Huntington disease, myotonic dystrophy, etc.). In such cases, the number of trinucleotide repeats expands through generations and is correlated with severity of disease and age of onset. However, not all disorders that exhibit anticipation are trinucleotide repeat disorders. Dyskeratosis congenita-Scoggins type, characterized by nail dystrophy, skin hyperpigmentation, and mucosal leukoplakia, shows anticipation via a mechanism of progressive telomere shortening in successive generations. Anticipation is also observed in families with a specific TTR gene mutation (V30M) associated with amyloidosis, although the mechanism remains unclear.

Other Factors That Complicate Pedigree Analysis

Genetic Mosaicism
Mosaicism occurs when two or more genetically distinct cell lines are derived from a single zygote. The timing of the post-zygotic event(s) and the tissues involved determine the clinical consequence and help to distinguish one type of mosaicism from another. Gonosomal mosaicism occurs early in embryonic development and is more likely to involve gonadal tissue and result in phenotypic expression. The clinical effects are often milder for mosaic individuals, where only a proportion of cells carry a particular mutation, as compared to those who inherit germline mutations, where all cells are affected. When mosaicism is confined to gonadal tissue, there are usually no clinical consequences to the gonadal mosaic individual. However, such individuals are at higher risk for
having affected offspring. Thus, since gonosomal mosaic parents have some proportion of mutant germ cells, they can (and do) have nonmosaic, affected offspring. There is no practical way to exclude the possibility of gonadal mosaicism or effectively test for it. This can cause a dilemma with respect to providing accurate recurrence risks to families. Gonadal mosaicism has been found to be more common for certain disorders, and some empiric risk estimates have been determined. For example, Duchenne muscular dystrophy, an X-linked disorder, has an empiric risk for gonadal mosaicism of 10–30%. This means that even when a mother of an affected boy tests negative for a DMD gene mutation, a male who inherits the same X chromosome as an affected sibling has a 10–30% chance of being affected.

Consanguinity
Consanguinity is both a social and genetic concept. Generally, it refers to marriage or a reproductive relationship between two closely related individuals. The degree of relatedness between two individuals defines the proportion of genes shared between them. The offspring of consanguineous couples are at increased risk for autosomal recessive disorders due to their increased risk for homozygosity by descent. Consanguinity frequently complicates pedigree analysis when a provider is unaware of it at the time of evaluating the pedigree, and what appears to be an autosomal dominant inheritance pattern is associated with an autosomal recessive disease phenotype.

Preferential Marriage Between Affected Individuals
Increased reproductive risk can be the result of preferential marriage between affected individuals. It is not uncommon for similarly affected individuals to attend the same school (for example, high schools for the deaf) or make connections at support groups (for example, Little People of America). A proportion of such relationships may develop such that an affected couple may decide to start a family. Such selective mating can increase the likelihood of pseudodominance within a pedigree because the mating environment is selected such that an autosomal recessive disorder appears more frequently than expected. An increased risk for autosomal dominant disorders is also present. In cases where both reproductive partners are affected with the same autosomal dominant condition, recurrence risk ranges from about 67%, or two-thirds (when homozygous dominant inheritance is not compatible with life), to 75%.

Other Considerations for Pedigree Construction and Interpretation
Pedigree construction requires equal amounts of skill, science, and art. Those constructing pedigrees must have a strong base of medical genetic knowledge, so as to know the important questions to ask and construct the pedigree correctly. In addition, careful attention must be paid to establishing trust, navigating social relationships, educating, and communicating effectively. Family histories are deeply personal, and the psychosocial impact of the required information gathering can be significant. Further, inaccurate information can result in misinterpretation and ultimately misdiagnosis. Whenever possible, reported diagnoses must be confirmed with medical records. The effort this requires should not be minimized, as privacy and confidentiality must be upheld for all family members throughout the process.

Clinical molecular genetics seeks to identify genetic variation and to determine whether the observed genetic variation has a phenotypic effect. Certainly, the latter cannot be accomplished without astute and thorough clinical evaluation and family history. Even an apparently negative family history is an important one that can guide test selection and result interpretation. In addition, the impact of pedigree analysis on genomic research is formidable. As a result of detailed pedigree assessment, numerous genes have been discovered, genotype:phenotype correlations elucidated, natural history knowledge obtained, and certainly inheritance patterns revealed.

CENTRAL DOGMA AND RATIONALE FOR GENETIC TESTING

The clinical relevance of molecular genetics is fundamentally rooted in the central paradigm of molecular biology: genes encode proteins. Genes are the blueprint for the proteins that form the macromolecules of cellular structure and function. Cells, their respective functions, and the interactions between them translate to the observable characteristics, or clinical phenotype, of an organism. Endogenous and exogenous molecular, cellular, and organismal environments also play an important role in influencing clinical phenotype. So, the expression of DNA at the molecular level coupled with environmental effects leads to more tangible morphological and physiological traits at the level of the organism. However, organisms do not exist in isolation. Each organism functions as part of a population within a larger species and external environment. A species, and the organisms within it, is subject to evolutionary forces, including natural selection, genetic drift, and gene flow. Such forces ultimately impose, overlook, propagate, or extinguish genetic variation. The dynamic relationships between genetic variation, proteins, cells, organisms, populations, and environment(s) connect genetic laboratories to clinical practice, as evaluating for genetic variation (molecular genetics, cytogenetics) and/or its biochemical consequence (biochemical genetics) provides an explanation and/or causative evidence for clinical phenotype and diagnosis.

Diagnostic and Predictive Molecular Testing
The clinical applications of molecular genetic testing can be generalized into two groups based on whether the clinical information sought is intended for
diagnostic or predictive purposes. Occasionally, overlap between diagnostic and predictive testing occurs. Although most commonly performed for the purpose of diagnosing a disorder in a symptomatic individual, diagnostic testing can also be informative for presymptomatic at-risk individuals. The degree of gene penetrance must be known in order for this diagnostic yet predictive testing to impart clinical value. Penetrance does not necessarily have to be 100% to be useful to the patient and/or family; for example, BRCA1 mutations are associated with a lifetime risk of approximately 60–80% for the development of breast cancer (Table 5.4).

The molecular genetic test for Huntington disease (HD) illustrates how the same test can be used to determine diagnosis for affected individuals and to predict affected status for as yet unaffected individuals. If testing is performed on a 25-year-old asymptomatic individual known to be at 50% risk for HD and the result is consistent with a repeat expansion known to be fully penetrant by age 47, the result of predictive testing is consistent with a diagnosis of HD during the presymptomatic period. Predictive diagnostic testing can be more difficult to interpret in cases where gene penetrance is not so absolute. For example, if the same 25-year-old asymptomatic individual at 50% risk for HD was found to have an expansion mutation in the reduced penetrance range (36–39 repeats), the ultimate diagnosis is not so absolute.

The second broad group of molecular genetic tests includes those performed for the purpose of revising an already known risk. Predictive molecular testing typically employs molecular screening tests to more accurately determine the individual and familial/reproductive risks for an individual who is already a member of a high-risk population. Typically, a targeted mutation analysis method is used. For example, for Caucasian individuals of Northern European ancestry and no family history of cystic fibrosis (CF), the risk for being a heterozygous carrier is 1/25. After such an individual is screened for the 23 mutations in the CFTR gene that account for approximately 90% of CF alleles in the Northern European population, a negative screen result decreases that risk slightly over 10-fold, to 1/265. It is important to recognize that while many predictive molecular screening tests are focused on evaluating at-risk individuals for autosomal recessive carrier status, other subgroups of predictive screens help to distinguish germline from somatic disease or revise prognosis or risk related to complex disease based on presence or absence of disease-associated SNPs. For example, colon cancer is typically a sporadic disease that has a genetic component but does not typically follow a simple Mendelian inheritance pattern of a single-gene disorder. However, a small fraction, approximately 5% of cases, do indeed follow a Mendelian inheritance pattern and are due to inherited mutations in single genes. It is clear that identification of these families can have enormous importance for family members because individuals who are shown to carry the familial mutation can greatly benefit from enhanced monitoring and prophylactic measures. Similarly, family members who are shown not to carry the familial mutation are freed from the need for intensive monitoring and are returned to the same risk as the general population for the development of colon cancer. Unfortunately, simple pedigree analysis is seldom sufficient to identify such families because colon cancer is not a particularly rare condition, and it is not uncommon for multiple family members, who often share many environmental risk factors, to develop sporadic colon cancer. In addition, the penetrance of the disorder may not be complete, thereby making it more difficult to recognize a specific inheritance pattern. Some inherited colon cancer syndromes, such as familial adenomatous polyposis colon cancer (FAP), have a distinctive phenotype (many thousands of colonic polyps), so ascertainment of families is usually straightforward. However, other syndromes, such as Lynch syndrome (or HNPCC), which is due to mutations in the MMR pathway, cannot be distinguished from sporadic colon cancer using clinical or pathological criteria. One might investigate all of the relevant MMR genes in cases suggestive of Lynch syndrome, early age of onset (<50 years), or familial clustering, but such testing (which could involve whole gene sequencing and deletion analysis for up to three large genes) could be very expensive. Further, selection of potential cases on clinical grounds and family history often has a relatively low yield. Because tumors from Lynch syndrome patients are defective in MMR, they exhibit a type of genomic instability known as microsatellite instability. Since

Table 5.4 Disease Associated Penetrance for Common Hereditary Cancer Syndromes

Familial Cancer Lifetime
Syndrome Gene Penetrance approximately 30% of sporadic tumors have microsat-
ellite instability, screening tumor specimens from indi-
Hereditary Breast/ BRCA1 60–80%
Ovarian Cancer
viduals at risk for Lynch syndrome for loss of
Hereditary Breast/ BRCA2 60–80% expression for mismatch repair proteins and microsat-
Ovarian Cancer ellite stability can eliminate approximately 70% of
Retinoblastoma RB1 >99% colon cancer patients from a Lynch syndrome diagnos-
Familial Adenomatous APC >99% by age 40 tic algorithm and greatly reduce cost. If microsatellite
Polyposis
Lynch syndrome MLH1, MSH2, 75% (may be instability is present in a tumor from an at-risk individ-
MSH6 slightly lower ual, germline testing can subsequently be performed.
in females) If microsatellite instability is not present, suspicion
for an underlying germline defect is low.
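
As a worked illustration of revising a known risk, the cystic fibrosis carrier-screening example from earlier in this section can be expressed with Bayes' rule. The sketch below is not taken from the chapter. It uses the prior carrier risk of 1/25 and the approximate 90% panel detection rate quoted in the text, and it additionally assumes that the panel never reports a mutation in a non-carrier. The function name `residual_carrier_risk` is hypothetical.

```python
from fractions import Fraction


def residual_carrier_risk(prior: Fraction, detection_rate: Fraction) -> Fraction:
    """Posterior carrier probability after a negative screening panel.

    Illustrative assumption: a non-carrier always tests negative,
    so P(negative | non-carrier) = 1.
    """
    # P(negative | carrier): the carrier's mutation is not on the panel
    missed = 1 - detection_rate
    # Bayes' rule: P(carrier | negative)
    carrier_and_negative = prior * missed
    noncarrier_and_negative = (1 - prior) * 1
    return carrier_and_negative / (carrier_and_negative + noncarrier_and_negative)


prior = Fraction(1, 25)       # carrier risk quoted in the text
detection = Fraction(9, 10)   # approximate panel detection rate quoted in the text
posterior = residual_carrier_risk(prior, detection)
print(posterior)                               # 1/241
print(f"about 1 in {round(1 / posterior)}")    # about 1 in 241
```

Under these assumptions, a negative panel result lowers the carrier risk from 1 in 25 to roughly 1 in 241. The risk is reduced but not eliminated, because the panel does not detect every mutation.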


Benefits of Molecular Testing

Psychosocial benefits of a confirmed molecular genetic diagnosis may include (i) reduced anxiety associated with a known versus unknown diagnosis, (ii) reduced anxiety if the diagnosis confirmed is considered by the patient to be the less severe among those being considered, and (iii) reduced anxiety associated with an end to the diagnostic odyssey that many patients with rare disorders experience (multiple medical consults, procedures, and laboratory tests associated with a continued search for a diagnosis). In addition, psychosocial benefits may accrue from implementation of a more individualized and, in some cases, preventive medical management approach. For individuals undergoing presymptomatic testing, benefits may also include a sense of empowerment, regardless of their test result, and a sense of relief if they test negative. Knowledge of one’s risk for having children with a genetic disorder also assists with family planning, with individuals and couples being able to access genetic counseling and prenatal diagnosis.

Clinical benefits of a molecular diagnosis often include the ability for the care provider to recommend a preventive medicine and treatment plan based on the known natural history of a particular disorder. Genotype-phenotype correlations have been established for some particular mutations/disorders, such that a more individualized medical approach to care with respect to severity of disease, expected age of onset for presymptomatic cases, and increased risk for certain associated complications can be determined. So, the genotype result may give care providers and their patients information that could lead to more individualized medical management. Also, when a molecular diagnosis is confirmed, predictive testing options become available for at-risk family members. Consider an individual at 50% risk for FAP undergoing presymptomatic testing. Identification of the causative mutation for this individual allows for early intervention by screening and prophylactic colectomy, and provides an informative presymptomatic testing option for at-risk family members. A negative test result directs implementation of a more appropriate, less aggressive screening strategy.

Risks Associated with Molecular Testing

Risks and limitations of genetic testing should always be reviewed and openly discussed with patients as part of the informed consent process prior to testing. Risks associated with molecular genetic testing are most often psychological and financial. Limitations of molecular genetic testing are usually related to confounding results, interpretive restrictions, or imperfections of the method used.

Although it can be of profound benefit, the knowledge of a molecular genetic test result can also be a risk, regardless of whether the result confirms the presence of disease. A positive test result can be devastating, and a true negative result can sometimes invoke survivor guilt. Consider two siblings who undergo presymptomatic testing where one sibling tests positive for a life-threatening disorder and the other tests negative. Both siblings will likely experience psychological repercussions of testing. Such psychological risks must be discussed before sample collection and may influence an individual’s decision to undergo testing.

Financial risks may include inability to obtain life insurance or certain types of health insurance should an individual test positive. However, the newly enacted Genetic Information Nondiscrimination Act (GINA) should reduce these risks. Many insurance plans do not cover the costs of genetic testing or will cover only part of the expense. Given that many molecular genetic tests are expensive due to the high costs of the technology used and the highly skilled personnel required to process and interpret the sample, personal financial cost to the patient can be substantial.

Limitations of molecular genetic testing should also be discussed with the patient as part of the informed consent process prior to sample collection. Molecular genetic testing is often misunderstood as perfectly decisive. While new technologies and detection rates are continuously improving, not all mutations are identified, so a false negative result is always a possibility. An even more common problem involves the identification of alterations whose medical or functional significance is not clear. These genetic alterations are termed variants of uncertain significance and can be especially complicated to interpret. Although rare, laboratory errors such as performing the wrong test or mislabeling samples can also occur. Disease and test-specific limitations of molecular testing are truly method, disease, and case specific, and it would be impossible to address each of them here.

Considerations for Selection of a Molecular Test

Selecting an appropriate molecular genetic test is dependent on the purpose for testing, the clinical information known, the sample(s) and testing methods available, and the clinical information sought. Molecular screening tests usually involve methods that investigate for common mutations (for example, targeted mutation detection by RFLP), whereas diagnostic testing methods are typically more comprehensive (for example, DNA sequencing). The molecular methods used for the purpose of revising a known risk can sometimes be the same, but are often different from those used for diagnostic purposes. When evaluating the method to be used, the expected detection rate for individuals who are classically affected with the disorder in question and the clinical context of the patient being tested should be considered. To maximize the informative value of presymptomatic testing, in most cases, one must know the familial mutation(s). Practically, this translates into the necessity that an affected individual should be tested before presymptomatic testing is performed on at-risk family members. Preferred testing algorithms developed by expert clinicians and laboratorians are especially useful, though ultimately each clinical situation is different and should be considered within its own unique
context. An increasing number of molecular genetics laboratory directors are employing genetic counselors who act as a liaison between the ordering provider and the laboratory to serve as a resource for the identification of case-specific benefits, risks, and limitations to testing, as well as to assist with test selection, case coordination, and interpretation of results.

ALLELIC HETEROGENEITY AND CHOICE OF ANALYTICAL METHODOLOGY

The great majority of analyses performed in the clinical molecular genetics laboratory are based on the polymerase chain reaction (PCR). PCR is a technique, developed by Kary Mullis in 1984 (then at Cetus Corp.), for the rapid, in vitro amplification of specific DNA sequences. The rapid introduction of PCR into research and later into clinical laboratory practice has revolutionized the practice of molecular biology. In 1993, Dr. Mullis was awarded the Nobel Prize in Chemistry for his achievement.

Knowledge of the sequence of the region of DNA flanking the area of interest is required for PCR. Two synthetic oligodeoxynucleotides (primers), typically 20 to 30 bases in length, are prepared (or purchased) such that one of the primers is complementary to an area on one strand of the target DNA 5′ to the sequences to be amplified, and the other primer is complementary to the opposite strand of the target DNA, again 5′ to the region to be amplified. To perform the amplification, one places the sample DNA in a tube along with a large molar excess of the two primers, all four deoxynucleotide triphosphates (dNTPs), buffer, magnesium ion, and a thermostable DNA polymerase. Successive rounds of heating to 93–95 °C to denature the DNA, cooling to 50–60 °C to allow annealing of the oligonucleotides, and heating to 72 °C (the temperature optimum for the DNA polymerase isolated from Thermus aquaticus) result in synthesis of the DNA that lies between the two primers. The amount of amplified DNA being synthesized doubles (approximately) with every temperature cycle. The amount of DNA produced is exponential with respect to cycle number. After 30 cycles of denaturation, annealing, and extension, 2^30, or approximately 10^9, copies of the DNA sequences lying between the two primers will have been generated. In a typical experiment starting with 20–100 ng of human DNA, 30 cycles of amplification will produce enough DNA from a single copy gene to be visualized on an ethidium bromide stained gel. As each cycle takes 2–5 minutes, amplification of a specific sequence can easily be accomplished in several hours. After amplification, the DNA can be analyzed by one of several techniques, depending on the specific problem.

Specific Versus Scanning Methods

Analytical methods in molecular genetics can be grouped into two broad categories: mutation detection techniques, which are used to investigate the actual base sequence at a particular locus, and quantitative methods, in which PCR-based techniques are used to quantify specific nucleic acid sequences. Mutation detection strategies can be further grouped into specific or scanning techniques.

Specific Mutation Detection

Specific mutation detection entails straightforward, and largely routine, procedures that can be used to analyze DNA samples for previously identified mutations using an assay designed for maximum specificity. This approach targets known mutations in potentially large cohorts of patients or small panels of specific mutations in disorders characterized by one or a few common alleles. Results from these types of analyses may confirm or establish clinical diagnoses. Furthermore, in families at risk for a particular genetic disease, specific or targeted mutation detection allows for rapid screening of an entire family for the mutation identified in the proband (the first member of a family to be diagnosed with a genetic disorder), thereby permitting accurate carrier determinations that may aid reproductive decisions. Rapid testing of large numbers of patients permits an assessment of the frequency of a mutation among disease-causing alleles, thereby determining which mutations are most prevalent in different patient populations and guiding the creation of effective clinical mutation testing panels. Examples of genetic disorders that are characterized by low allelic heterogeneity and are most often investigated using specific mutation detection methods include hypercoagulable states due to Factor V Leiden or prothrombin mutations, hemochromatosis, galactosemia, and alpha-1-antitrypsin deficiency. Although cystic fibrosis has a high degree of allelic heterogeneity (over 1500 mutations identified), carrier screening is typically done with a panel of 23–100 mutations, which detects approximately 90% of mutations in the target population of Northern European Caucasians.

Table 5.5 Examples of Electrophoretic- and Hybridization-based Specific Mutation Detection Methods

Electrophoretic Methods
Restriction enzyme digestion        Typically lab developed
Allele-specific PCR                 Elucigene, Tepnel
Allele-specific primer extension    ABI SNaPshot
PCR-oligonucleotide ligation        Cystic fibrosis V 3.0, Abbott/Celera

Hybridization Methods
Allele-specific hybridization       Resequencing arrays, Affymetrix
Allele-specific primer extension    Tag-It, Luminex
Ligation-PCR                        Golden Gate, Illumina

The specific mutation detection methods can themselves be divided into those that utilize electrophoretic- or hybridization-based methods (Table 5.5). Both types
of platforms are robust, and in experienced hands yield reproducible results. Both types of systems are in widespread use in clinical and research laboratories. One criterion for choice between these general platforms is the cost incurred per sample analyzed. In the authors’ experience, when the number of samples to be analyzed at one time (samples per batch) is low, electrophoretic methods are often the most cost effective to develop, validate, and implement. However, when the number of samples per batch is larger (greater than 8–12 samples), then the hybridization-based techniques, many of which can be adapted to 96 well microplate formats or real-time, are often more cost effective.

Mutation Scanning Approaches

Mutation scanning methods interrogate DNA fragments for all sequence variants present. By definition, these strategies are not predicated on specificity for specific alleles, but are designed for highly sensitive detection for all possible variants. In principle, all sequence variants present will be detected without regard to advance knowledge of their pathogenic consequences. Once evidence for a sequence variant is found, the sample must be sequenced to determine its molecular nature. The advantage of using a scanning method followed by sequencing of only positive PCR products is that the scanning methods are typically less costly to perform than DNA sequencing.

Although a number of mutation scanning methods have been developed, including single-strand conformation polymorphism (SSCP), heteroduplex analysis (HA), conformation-specific gel electrophoresis (CSGE), thermal gradient gel electrophoresis (TGGE), and melt curve analysis, they have been almost completely replaced by what is considered the gold standard mutation scanning method: DNA sequencing. There are a number of disease-associated genes that have high allelic heterogeneity, or very few recurrent mutations in the population, that are typically addressed for diagnostic purposes by whole gene sequencing, including BRCA1 and BRCA2; the mismatch repair genes MSH2, MLH1, and MSH6; CFTR (for diagnostic, nonscreening applications); biotinidase (BTD); and medium chain acyl-CoA dehydrogenase (ACADM). Only when they are combined with appropriate genetic data and in vitro functional studies can investigators distinguish disease-causing mutations from polymorphisms without clinical consequence. In the research laboratory, mutation screening is a critical and obligatory final step toward identifying genes that underlie genetic disease. In the clinical laboratory, these methods are applied toward the detection of mutations in diseases marked by significant allelic heterogeneity. As the number of laboratories offering whole-gene sequencing assays for an increasing number of genes grows, the amount of variation in coding regions is beginning to be understood to be significantly greater than previously thought. This has the consequence that obtaining a previously unknown sequence variation in a patient sample is not uncommon. The interpretation of such results is challenging, and is not a solved problem.

Interpretation of Molecular Testing Results

Of the three types of coding region mutation caused by single nucleotide changes, two are often relatively straightforward to interpret. It is generally assumed that nonsense mutations (or indels giving rise to an in-frame stop codon) are deleterious and are likely to be associated with a disease phenotype. Similarly, silent mutations are most often assumed to be benign. Exceptions exist, of course; silent mutations occurring at the first or last bases of an exon may influence RNA splicing. In addition, silent mutations may interrupt an exonic splice enhancer, again leading to altered splicing. An example of the disruption of an exonic splice enhancer is found in spinal muscular atrophy (SMA).

The great majority of SMA is caused by deletion of exon 7 of the telomeric copy of the survival motor neuron gene (SMNt). There exists a very highly homologous gene, the centromeric copy (SMNc), that has only 5 nucleotide changes relative to SMNt. Why is the presence of this gene, which is structurally normal in almost all cases of SMA, not sufficient to prevent neuronal death even if the telomeric copies are mutated? One of the nucleotide differences between SMNc and SMNt is a C to T change in the centromeric copy. Although a silent mutation from an amino acid standpoint (both sequences code for valine), the T allele is not recognized as an exonic splice enhancer. Thus, the SMNc gene transcript also lacks exon 7, and is unable to compensate for the lack of the SMNt gene.

The interpretation of missense changes is challenging. Many examples (affecting many different genes) exist in which missense changes are either pathogenic or benign. The distinction typically requires the examination of multiple families carrying a given missense mutation, and/or functional studies of recombinant, mutant protein. When a novel missense change is encountered in a clinical laboratory setting, these studies are not available. Thus, novel missense changes are typically referred to as variants of uncertain significance (VUS).

There are two schools of thought with respect to how VUS should be reported. One school holds that unless the laboratory can give a clean interpretation and offer documentation as to whether a given variant is known to be pathogenic or benign, the report should simply indicate that a VUS was detected. Thus, the contribution of the genetic test to the management of the patient is nil; it is as if the test were not performed (and cannot be performed). Clearly, the advantage in this approach is that one is not tempted to overinterpret the results, potentially leading to an incorrect medical decision. The disadvantage is the frustration on the part of the patient (and healthcare provider) that a rather expensive test (typically) has been performed and no useful information was obtained. The other school of thought holds that the laboratory should use all the tools available and when possible make a probabilistic statement as to the potential effect of the variant. The advantage to this approach is that the final decision as to how the result will be used in guiding patient care remains with the
patient and his/her healthcare provider. The clear disadvantage is the possibility that the result provided may lead to incorrect medical management. Because of this, interpretation of VUS should be done very carefully.

A number of tools to aid in the interpretation of missense changes have been developed. As an increasing number of species have had their complete genome sequence determined, it is possible to use a variety of sequence alignment tools to compare the amino acid found at a particular location in the human gene to that found in multiple other species. The rationale for this is the notion that if an amino acid is invariant across species, it is more likely to be important for protein function, and a missense change at a highly conserved residue is more likely to be pathogenic. On the other hand, if a given amino acid position is poorly conserved, a missense change may be more likely to be benign.

Several groups have developed algorithms quantifying the probability based on sequence conservation that a given missense change is pathogenic or benign. Two of these tools, SIFT and PolyPhen, are freely available online at http://blocks.fhcrc.org/sift/SIFT_related_seqs_submit.html and http://genetics.bwh.harvard.edu/pph/, respectively. These tools are useful but are far from perfect. For example, one study evaluated both of these online tools against sequence variations that were known to be either benign or pathogenic in two well-characterized genes, beta-globin and G6PD, as well as two others, TNFRSF1A (tumor necrosis factor receptor-associated periodic syndrome) and the MEFV gene (familial Mediterranean fever). The two programs were found to be between 70% and 80% sensitive and specific [2].

One mechanism by which sequence conservation strategies can be foiled occurs when a pathogenic change results in the substitution for an amino acid that is the normal sequence in another species. For example, the most common medium chain acyl-CoA dehydrogenase (MCAD) deficiency mutation (accounting for approximately 70–80% of mutant alleles) is the lysine to glutamic acid change at codon 329 (K329E). Sequence alignment-based tools such as SIFT or PolyPhen do not identify this as a pathogenic change because the corresponding codon in the mouse is normally glutamic acid.

Another widely used tool is the BLOSUM62 matrix. This matrix is derived from the BLOCKS database (http://bioinformatics.weizmann.ac.il/blocks/blocks_release.html). This database consists of alignments of peptides from many proteins from many different species. One use of the database is to understand how certain motifs (nucleotide binding clefts, leucine zippers, helix-turn-helix, and others) are conserved and frequently utilized in many different proteins. One can select groups of peptides based on the amino acid sequence alignment. For example, one may wish to study only very well conserved motifs and use only groups of peptides that are 90% identical. Or, one may wish to study more distantly related sequences and choose groups that are only 40% identical. One use of this database is to assist in making new alignments. For any given percentage of sequence homology selected, it is possible to calculate a probability that any given amino acid will be substituted for another in normal, wild-type proteins. For example, leucine to isoleucine changes are more common in nature than aspartic acid to tryptophan changes. It has been empirically determined that at a cutoff of 62% identity, the BLOSUM62 matrix is the most useful in creating new alignments by calculating the probability that an amino acid in a new protein aligns with a given position in one or more other proteins. BLOSUM62 has also been used as a tool for classifying VUS as either deleterious or benign. Amino acid changes that are frequently found are judged to be more likely to be benign, whereas those that are infrequent are more likely to be pathogenic. Note that this method relies only on the global probability of one amino acid being substituted for another and does not utilize any gene-specific information. The BLOSUM62 matrix is freely available at http://www.ncbi.nlm.nih.gov/Class/BLAST/BLOSUM62.txt.

In addition to sequence conservation and global substitution probabilities, the chemical characteristics of the amino acids have been used to characterize missense changes. A composite of three quantifiable chemical properties (composition, polarity, and molecular volume) has been defined and is termed the Grantham score. When a substitution results in a large change in the Grantham score, reflecting a large change in the chemical nature of the amino acid, the change is more likely to be deleterious. In contrast, a small change in the Grantham score indicates that the chemical nature of the residue has not changed appreciably, is less likely to alter protein function, and is more likely benign. Similar to the BLOSUM62 method, this scoring system refers to a global standard and does not take the sequence of the particular gene into account. However, it is possible to combine the Grantham score method with sequence alignment tools. In this approach, the difference in the Grantham score across the normal protein alignment is calculated. That is, after the construction of an alignment of normal orthologs (homologous proteins from different species), the difference between the minimum and maximum Grantham score for the residue under investigation is calculated. This number is termed the Grantham distance (the distance in Grantham space between the amino acids at the codon in question across species). Then, the difference in the Grantham score for that codon in the wild-type and mutant protein is calculated. This number is termed the Grantham variation. If the Grantham variation is larger than the Grantham distance, then the change is more likely to be pathogenic. However, if the VUS results in an amino acid substitution that gives a change in Grantham score that is smaller than that seen across species, the change is more likely to be benign. One online tool for these calculations (Align-GVGD) is freely available at http://agvgd.iarc.fr/agvgd_input.php.

All of the strategies discussed in the preceding paragraphs are helpful in attempting to understand the clinical significance of VUSs, but none of them is perfect; far from it. The sensitivity and specificity for all of them
seem to be in the 70–80% range. However, since they query different properties of the gene and of amino acid substitution, it is possible that, when used together, the quality of the results may be improved. One study investigated the use of all of the preceding methods as applied to five different genes. The authors found that when the results of all 4 methods agreed, the final calls had a predictive value of approximately 88%. Further, they noted that mutations at residues that were completely conserved across species had a 92–97% probability of being deleterious [3]. Another group of investigators has incorporated the results of these characterization tools and combined them with classical pedigree and linkage analysis in a comprehensive Bayesian approach. Using this method, this group has been able to classify several missense mutations in the BRCA1 gene as either pathogenic or benign [4].

The in-silico characterization of missense VUS changes is still in its infancy, and much more work needs to be done in this area. As the era of whole genome sequencing rapidly approaches, the urgency of the need to characterize novel changes is increasing.

CONCLUSION

Molecular genetics utilizes the laboratory tools of molecular biology to relate changes in the structure and sequence of human genes to changes in protein function, and ultimately to health and disease. New technology, such as is being developed for the $1,000 genome project, promises to greatly increase the reach and scope of molecular genetics. Indeed, some subspecialties, such as biochemical genetics and cytogenetics, may ultimately merge with molecular genetics and offer the medical community a more comprehensive and integrated approach to understanding the role of our genomic variation in health and disease. However, the interpretations of results from the clinical molecular genetics laboratory will always be rooted in the fundamentals of molecular and cell biology and in the central paradigm that genes encode proteins. It will be from these roots that modern, personalized medicine will grow.

REFERENCES

1. Bennett RL, Steinhaus KA, Uhrich SB, et al. Recommendations for standardized human pedigree nomenclature. Pedigree Standardization Task Force of the National Society of Genetic Counselors. Am J Hum Genetics. 1995;56:745–752.
2. Tchernitchko D, Goossens M, Wajcman H, et al. In silico prediction of the deleterious effect of a mutation: Proceed with caution in clinical genetics. Clin Chem. 2004;50:1974–1978.
3. Chan PA, Duraisamy S, Miller PJ, et al. Interpreting missense variants: Comparing computational methods in human disease genes CDKN2A, MLH1, MSH2, MECP2, and tyrosinase (TYR). Hum Mutat. 2007;28:683–693.
4. Goldgar DE, Easton DF, Deffenbaugh AM, et al. Breast Cancer Information Core (BIC) Steering Committee. Integrated evaluation of DNA sequence variants of unknown clinical significance: Application to BRCA1 and BRCA2. Am J Hum Genetics. 2004;75:535–544.
