and interpretation
Handbook for Clinicians
1st Edition
________
Vinod Scaria
Sridhar Sivasubbu
Like us on Facebook
https://www.facebook.com/clinicalexome
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Acknowledgements
A number of individuals have contributed to this book
in personal as well as professional capacities. These include
graduate students from our groups, especially Mr.
Shamsudheen Karuthedath Vellarikkal, Mr. Rijith Jayarajan,
Mr. Ankit Verma, Ms. Saakshi Jalali, Ms. Heena Dhiman and
Mr. Kandarp Joshi, who helped in collating the content and
figures that enrich the manuscript. The authors also thank and
acknowledge the critical comments, editorial help and support
of our colleagues, Dr. Vamsi Krishna, Dr. Adita Joshi, Dr.
Srinivasan Ramachandran, Dr. Jameel Ahmad Khan and Dr.
Abhay Sharma.
The authors thank the Genomics for Understanding Rare
Diseases - India Alliance network (GUaRDiAN) and
collaborators for critical insights, which significantly enriched
the outlook and content of this book. The authors thank the
innumerable patients and families who have
interacted with us through the network, without whom our
insights and knowledge would have been limited.
The authors acknowledge financial support from
the Council of Scientific and Industrial Research (CSIR), India
through grant BSC0212 (Wellness Genomics Project). The
funding agencies had no role in the preparation of the
content or the decision to publish this book. The authors
declare no competing financial interests.
Dedication
Dedicated to the innumerable patients and families who
enriched our knowledge and insight through their close
interactions, shared their distress like family members, and
contributed samples to research selflessly. Without them we
would not be what we are, would not be doing
what we do, and would not be writing what we wrote.
Contents
Foreword
Case of the Bhai
The human genome project and how it changed everything
Genome variations and how they make us different?
A brief introduction to next generation sequencing
When you could sequence your own genomes
So what if we could sequence just the protein coding genome?
When should you do exome sequencing?
When should you probably not do exome sequencing?
First things first: putting insights before data
Educating the patient and getting an informed consent
Points to note when you outsource exome sequencing
Understanding the steps in analysis of exome sequence data
How good is the exome sequencing data?
Prioritizing, annotating and interpreting variants
Don't forget the validation
Ethical considerations in whole exome sequencing
The last word
Index
Foreword
I would easily pick next generation sequencing as one of
the techniques that has had an immediate and immense application
in research and healthcare. Within a span of five years, exome
sequencing has become something that almost no scientist or
physician can afford to be ignorant of. With newspapers and the
internet screaming "genome" everywhere, this handbook by
Dr. Scaria and Dr. Sivasubbu is timely.
The introductory chapter on Bhai is a story of exome
sequencing told lucidly enough even for the general public. It is
really important for everybody, not just clinicians, to know what
sequencing is and how the Human Genome Project has improved
our understanding of the role of genetic variants in health and
disease. The authors then introduce readers to the exome, the
clinical importance of sequencing it and the situations where this
is helpful in patient care. At the same time, they warn
physicians not to get carried away. In the next chapter they
explain the basics of medical evaluation and how these remain
evergreen even in the current era.
It is important that the patient is not taken for a ride by
new diagnostic companies that did not exist a year earlier.
Both the clinician and the patient must be aware of what they are
doing with the new test and what they can expect in the form
of results. Probably both need to be involved thoroughly in the
consenting process.
For a researcher, the authors explain how outsourcing is not
easy despite the availability of several service providers, and
detail in simple terms how the large volume of data can be analyzed.
The chapters on quality control and interpretation of variants help
readers understand the intricacies of this technique. Independent
validation of the results is vital to apply this technique in clinical
Girisha KM
Professor and Head
Department of Medical Genetics
Kasturba Medical College, Manipal
Manipal University
Chapter 1
Chapter 2
The first chromosome to be sequenced was chromosome 22,
one of the smallest chromosomes in the human genome. The
chromosome sequence was published in the year 1999.
In June 2000, the draft human genome was announced by
the then US President Bill Clinton jointly with the British
Prime Minister Tony Blair. The papers corresponding to the
publicly funded genome and the Celera assembly were
published in the journals Nature and Science
respectively. Further improvements of the drafts were
announced in the year 2003.
The Human Genome Project was unique in many
ways. For one, it was a mega-project that involved a
large number of researchers, not only from the United
States of America, which led the project, but also from
other countries across the globe, chiefly from Britain,
Chapter 3
15. Welter, Danielle, et al. "The NHGRI GWAS Catalog, a curated resource
of SNP-trait associations." Nucleic Acids Research 42.D1 (2014): D1001-D1006.
Chapter 4
Chapter 5
The revolution in technological advancements and the
resultant scale and throughput was phenomenal, so much so
that at one point the speed at which sequencing technology
improved in terms of throughput and cost reduction was
comparable to Moore's law in the case of
Chapter 6
Chapter 7
Figure 1. The quadrant where the optimum use of whole exome and
genome sequencing is recommended.
Chapter 8
Chapter 9
Chapter 10
a. The availability of the sequence could put one in
precarious situations including identification of an
individual, inference of paternity, inference of specific
features of the genealogy and possible prediction of
risks to self and children, and sometimes to other
close relatives in the family.

The paper describing the dataset was published with
the following citation:
Source Code for Biology and Medicine
2013, 8:13 doi:10.1186/1751-0473-8-13
http://www.scfbm.org/content/8/1/13
21. "Personal genomes, participatory genomics and the anonymity-privacy conundrum." Journal of Genetics (in press), available at URL:
http://link.springer.com/article/10.1007/s12041-014-0451-3
Son/daughter/wife of ........ aged ........
Residing at
(Date)
Certified that the above consent has been signed in my presence. The
purpose for which the sample will be used has been explained to the above
volunteer. The individual is free to withdraw from the study as and when
he/she feels so inclined.
(Date)
Exclusion Form
I choose to exclude the following information from the questionnaire with
respect to analysis or public disclosure (please indicate the relevant
question numbers from the attached questionnaire)
1. Analysis:
2. Public disclosure:
INFORMATION FOR THE VOLUNTEERS
1. Purpose of study
The principal scientific goal of this study is to explore avenues to study
genetic variability between individuals and to correlate the variability to the
phenotypes. The data generated (i.e., human DNA sequence, medical
information and physical traits) may be used for scientific and clinical
research such as development of computational tools and interfaces for
scientists, clinicians and individuals, in addition to developing general public
awareness of the potential benefits and risks of having whole genome level
information available to the public.
2. Enrolment procedures
A. Collection of baseline trait data:
You are required to provide baseline trait data about yourself,
including: date of birth, medications, allergies, vaccines, personal
and family medical history, race/ethnicity/ancestry and vital signs
(e.g. height, weight, blood pressure) in the attached
questionnaire.
B. Monozygotic twin:
If you have any identical twin(s), such sibling(s) will need to
provide consent for your participation in this research.
3. Tissue (Blood/Saliva) collection
A. A blood sample will be collected from the upper arm by
venipuncture. Twenty-five ml of blood will be drawn by an
authorized medical doctor, or by an authorized technician under the
supervision of an authorized medical doctor, in the presence of
the principal investigator. The fresh blood sample will be collected in
designated containers (which will be provided by CSIR/IGIB).
Serum will be isolated from the collected blood sample for
biochemical analysis.
B. Saliva sample will be collected by voluntary spitting. Two to
time.
ii) Anyone with sufficient knowledge and resources could take
your DNA sequence data and/or your personal trait information
and utilize the data, with or without modification, to (1) infer
paternity or other features of your genealogy, or (2) reveal the
possibility of a disease or risk for a disease. Such information could
lead to social and financial consequences including but not limited
to employment and insurance.
iii) Your family members could also be subject to discrimination for
employment, insurance or financial service on the basis of the
public disclosure of your genetic and trait information.
iv) If you have previously made or plan to make available genetic
information in a confidential setting, the data provided by you as
part of this study may reveal your identity.
v) Any conclusions derived from the publicly available information
may be speculative with respect to you and even less predictive
with respect to your family members. The complete set of risks
posed to you and your family members due to the public release
of the DNA sequence and trait data is not known at this time. We
encourage you to discuss this aspect with your family members.
7. Benefits
(i). At present there are no proven benefits to you for your participation in
this study.
(ii). This study may benefit the medical and research community in
particular, and humanity in general and may help in establishing genetic
causes and predisposition for common diseases.
(iii). You may experience satisfaction from participating in research that
may benefit medical science.
Chapter 11
22. VCF stands for Variant Call Format. This format came into existence
with the 1000 Genomes Project and is widely used in the community. A
number of bioinformatics tools and resources for analyzing variant data
take VCF files as input.
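
As a rough illustration of what a VCF file contains, the sketch below reads the eight fixed tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) of each data line. It assumes the standard plain-text VCF layout; the record shown is invented for illustration, and the names `Variant`, `parse_vcf_line` and `read_vcf` are hypothetical helpers, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    var_id: str
    ref: str
    alts: list
    qual: str
    filt: str
    info: dict


def parse_vcf_line(line):
    """Parse one VCF data line, using only the first eight fixed columns."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id, ref, alt, qual, filt, info = fields[:8]
    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            # INFO entries are either KEY=VALUE pairs or bare flags
            key, sep, value = item.partition("=")
            info_dict[key] = value if sep else True
    # ALT may list several alternate alleles separated by commas
    return Variant(chrom, int(pos), var_id, ref, alt.split(","), qual, filt, info_dict)


def read_vcf(lines):
    """Yield a Variant for every data line, skipping '#' header lines and blanks."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_line(line)


# Invented example content, for illustration only
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "12\t1000\t.\tG\tA\t50\tPASS\tDP=35;DB\n",
]
variants = list(read_vcf(example))
```

In practice an established parser would normally be preferred over hand-written code; the sketch is only meant to show how little structure a basic VCF record has.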
Chapter 12
Figure 2. The FASTQ file format with sequences of the reads and
qualities of bases in the sequence read.
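
The layout described in the caption can be sketched in a few lines of code. The example below assumes the common four-line FASTQ record (header starting with `@`, sequence, a `+` separator line, quality string) and the Phred+33 quality encoding; both are assumptions about the data, since older instruments used other offsets. The read shown is invented, and `read_fastq` and `phred_scores` are hypothetical helper names.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError("FASTQ header line must start with '@'")
        seq = handle.readline().strip()
        handle.readline()  # the '+' separator line, ignored here
        qual = handle.readline().strip()
        if len(seq) != len(qual):
            raise ValueError("sequence and quality strings differ in length")
        yield header[1:].strip(), seq, qual


def phred_scores(quality_string, offset=33):
    """Convert a quality string into integer Phred scores (Phred+33 assumed by default)."""
    return [ord(ch) - offset for ch in quality_string]


# Invented single-read example: six high-quality bases followed by one poor base
example = "@read1\nGATTACA\n+\nIIIIII#\n"
records = list(read_fastq(io.StringIO(example)))
scores = phred_scores(records[0][2])
```

Each character of the quality string encodes the confidence of the base at the same position in the read, which is why the two strings must be of equal length.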
Chapter 13
Figure 2. Quality plots for poor-quality reads. Note the low quality of
bases towards the end of the reads.
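
A plot of this kind is essentially the mean base quality at each position across all reads. The sketch below shows one way such a summary could be computed, assuming Phred+33 encoded quality strings; the three quality strings are invented, the threshold of 20 is an arbitrary illustrative cut-off, and the function names are hypothetical.

```python
def mean_quality_by_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position (Phred+33 assumed)."""
    totals, counts = [], []
    for qs in quality_strings:
        for i, ch in enumerate(qs):
            if i == len(totals):
                # first read that reaches this position
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def first_low_position(means, threshold=20):
    """Return the first 0-based position whose mean quality is below threshold, or None."""
    for i, value in enumerate(means):
        if value < threshold:
            return i
    return None


# Invented quality strings whose scores fall off towards the 3' end
quals = ["IIII5#", "IIIH+#", "III?#"]
means = mean_quality_by_position(quals)
drop_at = first_low_position(means)
```

A position such as `drop_at` could then inform where reads are trimmed, although dedicated quality-control tools apply more careful rules than a single mean threshold.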
Chapter 14
Chapter 15
Chapter 16
specific regions or loci variations that might have nontrivial implications. The consent should include a section
where the patient or family members could explicitly
state this.
Anonymity and privacy
Utmost care regarding anonymity and privacy is another
important component of ethical conduct towards the patient
and family. It should be emphasized that anonymity and
privacy are not two sides of the same coin, but are
separate entities. A detailed discussion with the patient
and family members on this aspect is essential. In many
cases, the impact of genetic testing is not limited
to the index case or family, but might have implications
for genetic predisposition and disease manifestation
in other family members too. Similarly, the
identification of a mutation might not be relevant to the
specific individual or family, but could be of relevance in
terms of screening and carrier detection in other
members of the family. As in the case of Bhai, the
identification of a novel mutation in the KRT5 gene would
have implications for genetic screening, and in some cases
prenatal screening, for the other
members of the family. In some cases the validation of
the genetic variant would require the participation of other
members of the family, including people who might not
be affected by the disease.
With the advent of Internet support groups and
patient groups, in many cases the patient or the family
members do not wish to remain anonymous, since this might
benefit the larger community and society. In other cases,
the patient and family would like to remain anonymous
given the social stigma associated with the disease and
118
119
120
121
122
123
Let us know what you have to say about this book on our
Facebook page:
https://www.facebook.com/clinicalexome