Indels coming soon! (M2)
+ some post-processing to rescue TiN variants and eliminate artifacts

Reminder: MuTect itself applies several internal filters

[Figure: MuTect detection overview. Tumor and normal reads pass read filters, and candidate sites are scored with the variant detection statistic to give the STD callset. Candidates are then passed through six site-based variant filters (proximal gap, strand bias, poor mapping, triallelic site, clustered position, observed in control) to give the HC callset (HC, high confidence). Next, a panel of normal samples filter screens out false positives caused by rare error modes, and the matched normal is used for the final classification.]

Variant detection statistic:

$$\log_{10}\left(\frac{L[M_f^m]\,P(m,f)}{L[M_0]\,\bigl(1 - P(m,f)\bigr)}\right) \ge \log_{10}\delta_T$$
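
To make the variant detection statistic concrete, here is a minimal sketch of the likelihood-ratio part of the score. It assumes the per-read model usually attributed to the MuTect publication (a read shows the reference base, the alternate base, or some other base, with error rate derived from the base quality), and it leaves the prior term P(m,f) to the threshold. The function names and the example pileup are hypothetical, not taken from the slides or from the MuTect code.

```python
import math

def read_likelihood(base, ref, alt, err, f):
    """P(observed base | allele fraction f) under the assumed per-read model."""
    if base == ref:
        return f * err / 3 + (1 - f) * (1 - err)
    if base == alt:
        return f * (1 - err) + (1 - f) * err / 3
    return err / 3  # any other base is treated as a sequencing error

def tumor_lod(bases, quals, ref, alt):
    """log10 L[M_f^m] - log10 L[M_0], with f set to the observed alt fraction.

    Assumes Phred qualities > 0 so that no likelihood is exactly zero.
    """
    errs = [10 ** (-q / 10) for q in quals]
    f_hat = sum(b == alt for b in bases) / len(bases)
    log_l_variant = sum(
        math.log10(read_likelihood(b, ref, alt, e, f_hat))
        for b, e in zip(bases, errs)
    )
    log_l_reference = sum(
        math.log10(read_likelihood(b, ref, alt, e, 0.0))
        for b, e in zip(bases, errs)
    )
    return log_l_variant - log_l_reference

# Hypothetical pileup: 30 reads at Q30, 4 of them supporting the alt allele.
lod = tumor_lod("A" * 26 + "T" * 4, [30] * 30, ref="A", alt="T")
print(f"LOD = {lod:.2f}")
```

A site would be called when this score exceeds the chosen threshold; more alt-supporting reads or higher base qualities push the score up.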

Liquid tumors: blood-borne cancer (e.g. leukemia)
Tissue-adjacent normal: especially tumors that are spread thin (unlike clearly separated, below)

Prevalence of TiN variants by tumor type

[Figure: bar chart for BRCA, LAML, CLL and HNSC, axis labeled "Tumor in Normal" (ticks 0, 0.5, 1). Samples are split into greater than 2% tumor in normal and less than 2% tumor in normal. Counts printed on the bars: 47, 6, 155, 59, 77, 254, 298, 49. * Tissue-adjacent normal. ^ Skin used as normal.]

1. Comprehensive molecular portraits of human breast tumors. Nature. 490(7418):61-70
2. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. NEJM. 368:2059-2074
3. Comprehensive Genomic Characterization of Head and Neck Squamous Cell Carcinomas. Nature. 517(7536):576-582
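
MuTect evaluates AF of tumor vs. normal

[Figure: normal AF plotted against tumor AF. For a matched tumor/normal pair, MuTect keeps mutations that are detected in the tumor and undetected in the normal, and rejects mutations that are detected in the normal. A second panel shows the same axes for a contaminated normal.]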

AF evaluation leads to rejection if Normal is contaminated

[Figure: normal AF plotted against tumor AF for a matched tumor/normal pair and for a contaminated normal, with kept/rejected and detected/undetected regions marked. With a contaminated normal, tumor mutations are detected in the normal and MuTect rejects them.]
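
A small worked example may help show why even light contamination costs calls. The sketch below assumes the alt allele fraction expected in the normal is TiN × tumor AF, and asks how often at least `min_alt_reads` alt reads would then appear in the normal. The threshold of 2 reads, the depth and the fractions are hypothetical illustration values, not MuTect's documented settings.

```python
from math import comb

def prob_alt_reads_at_least(k, depth, af):
    """P(X >= k) for X ~ Binomial(depth, af)."""
    below = sum(comb(depth, i) * af**i * (1 - af) ** (depth - i) for i in range(k))
    return 1.0 - below

# Hypothetical values for illustration only.
tin = 0.05           # 5% tumor in normal
tumor_af = 0.4       # allele fraction of the mutation in the tumor
normal_depth = 100   # reads covering the site in the normal
min_alt_reads = 2    # assumed evidence level at which the normal counts as "detected"

expected_normal_af = tin * tumor_af
p_detected = prob_alt_reads_at_least(min_alt_reads, normal_depth, expected_normal_af)
print(f"expected normal AF = {expected_normal_af:.3f}")
print(f"P(at least {min_alt_reads} alt reads in normal) = {p_detected:.2f}")
```

Under these assumptions the normal shows the mutation in roughly 60% of cases, so a true somatic call would often be rejected.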

Recover mutations if at least 100x more likely to be somatic than germline given estimated TiN

[Figure: sensitivity (y-axis, 0 to 0.9) against tumor in normal (x-axis, 0.005 to 0.2), with and without deTiN, for all mutations and for mutations with allele fraction in the top quartile (af>Q3).]
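
The 100x rule can be sketched as a likelihood ratio on the reads seen in the normal. The version below is a simplified assumption, not the deTiN implementation: the somatic model expects a normal allele fraction of TiN × tumor AF, the germline model expects a heterozygous 0.5, both use plain binomial likelihoods, and sequencing error is ignored. All names and numbers are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def rescue_as_somatic(normal_alt, normal_depth, tumor_af, tin,
                      min_ratio=100.0, germline_af=0.5):
    """True if the normal read counts are at least min_ratio times more likely
    under the somatic-with-contamination model than under the germline model."""
    somatic_af = tin * tumor_af
    l_somatic = binom_pmf(normal_alt, normal_depth, somatic_af)
    l_germline = binom_pmf(normal_alt, normal_depth, germline_af)
    return l_somatic >= min_ratio * l_germline

# 2 alt reads out of 100 in the normal fit 5% contamination far better than a het site.
print(rescue_as_somatic(normal_alt=2, normal_depth=100, tumor_af=0.4, tin=0.05))
# 48 alt reads out of 100 look germline and are not rescued.
print(rescue_as_somatic(normal_alt=48, normal_depth=100, tumor_af=0.4, tin=0.05))
```

The ratio depends on the estimated TiN, which is why the contamination level has to be estimated per sample before any rescue is attempted.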

With TiN rescue, 90% sensitivity is recovered

[Figure: sensitivity (y-axis, 0 to 0.9) against tumor in normal (x-axis: 0.005, 0.01, 0.02, 0.05, 0.07, 0.1, 0.2), comparing deTiN and no deTiN for all mutations and for af>Q3.]

How does this impact recovery of driver mutations?

TiN rescue recovers ~40% more putative driver mutations

Validation experiments show recovery of expected distributions

[Figure: drivers per sample across samples, annotated p=.0007.]

After deTiN mutation recovery, contaminated and uncontaminated samples had the same distribution of driver mutations per sample

PART 2: REMOVING ARTIFACTS

Types of FP artifacts

[Figure: mutation rate (per million sites) shown as a 3-base context histogram in 96 bins (reverse complement combined).]
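
The 96 bins come from 6 substitution classes times 16 flanking-base combinations once each mutation and its reverse complement are merged. A minimal sketch of that binning follows. It assumes the common convention of reporting the pyrimidine (C or T) as the reference base; the slides do not say which strand is treated as canonical, and the function name and label format are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def canonical_context(context, alt):
    """Map a 3-base reference context (mutated base in the middle) and an
    alternate base to one of 96 bins, folding in the reverse complement."""
    ref = context[1]
    if ref in "AG":  # purine reference: switch to the opposite strand
        context = "".join(COMPLEMENT[b] for b in reversed(context))
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"

# Enumerate every bin: 2 pyrimidine refs x 3 alts x 4 five-prime x 4 three-prime bases.
ALL_BINS = sorted(
    f"{five}[{ref}>{alt}]{three}"
    for ref in "CT"
    for alt in "ACGT" if alt != ref
    for five in "ACGT"
    for three in "ACGT"
)

print(len(ALL_BINS))
print(canonical_context("TGA", "T"))  # G>T on one strand is C>A on the other
print(canonical_context("ACG", "T"))
```

Under this convention G>T and C>A land in the same bins, as do T>A and A>T, which is why the artifact types discussed here appear as single signatures in the histogram.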

Manifestation of artifacts in mutation signatures (Lego plots)

[Figure: Lego plots showing a typical C>T aging signature and an interesting A>T signature.]

Example of site where a T>A artifact appears to be a real variant

Example T>A mutation: splice-site CHD8

[Figure: read pileups for tumor and normal at the site.]
Tumor: 7 reads support T>A splice-site mutation in CHD8
Normal: no support for T>A mutation from normal sample
7 reads from tumor, 0 normal, consistent with somatic mutation at 15% allele fraction

[Figure: tumor and normal pileups colored by read orientation (F1R2 marked in red, F2R1 marked in blue).]
Mutation is entirely supported by reads from one strand
Tumor: All 7 reads are on F1R2 strand (red). p~0.008
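
The quoted p~0.008 is consistent with a simple calculation: if each alt-supporting read were equally likely to be F1R2 or F2R1, the chance that all 7 are F1R2 is 0.5 to the power 7. The sketch below reproduces that arithmetic; the equal-probability assumption and the function name are mine, not stated on the slide.

```python
def all_one_orientation_probability(n_alt_reads, p_f1r2=0.5):
    """Probability that every one of n_alt_reads is F1R2, assuming reads are
    independent and each is F1R2 with probability p_f1r2."""
    return p_f1r2 ** n_alt_reads

p = all_one_orientation_probability(7)
print(round(p, 4))  # 0.0078
```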

Use strand orientation to filter out the likely artifacts

[Figure: alternate allele count against F_orientation_bias in two panels: T>A & A>T mutations in problematic sample, and non-T>A / A>T mutations (null model). The filter cut line is drawn for FDR < 1%.]
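
One way to picture an orientation-bias filter with a 1% FDR target is sketched below. This is a stand-in, not the filter shown on the slide: it takes F_orientation_bias to be the fraction of alt reads in the F1R2 orientation, scores each site with a one-sided binomial test against 0.5, and applies the Benjamini-Hochberg procedure instead of a cut line derived from the null-model mutations. Function names and read counts are hypothetical.

```python
from math import comb

def orientation_fraction(f1r2_alt, f2r1_alt):
    """Fraction of alt-supporting reads in the F1R2 orientation (assumes >= 1 alt read)."""
    return f1r2_alt / (f1r2_alt + f2r1_alt)

def one_sided_p(f1r2_alt, f2r1_alt):
    """P(at least f1r2_alt of the alt reads are F1R2) if both orientations are equally likely."""
    n = f1r2_alt + f2r1_alt
    return sum(comb(n, i) for i in range(f1r2_alt, n + 1)) / 2**n

def flag_artifacts(sites, fdr=0.01):
    """sites: list of (f1r2_alt, f2r1_alt). Returns one bool per site, True = likely artifact.
    Uses the Benjamini-Hochberg step-up rule at the given FDR."""
    pvals = [one_sided_p(a, b) for a, b in sites]
    order = sorted(range(len(sites)), key=lambda i: pvals[i])
    m = len(sites)
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= fdr * rank / m:
            cutoff_rank = rank  # largest rank that still passes
    flagged = [False] * m
    for i in order[:cutoff_rank]:
        flagged[i] = True
    return flagged

# Hypothetical candidate sites: two one-sided pileups and three balanced ones.
sites = [(12, 0), (10, 0), (5, 5), (4, 6), (6, 4)]
print([orientation_fraction(a, b) for a, b in sites])
print(flag_artifacts(sites))
```

In practice such a filter would be restricted to the substitution type suspected of being an artifact in that sample (here T>A and A>T), with the remaining mutations serving as the null model.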

OxoG oxidation causes G>T artifacts and C>A artifacts

[Figure: schematic of the top and bottom strands through step 3 (PCR denaturation), with the resulting base change observed in read 2.]

[Figure: G>T and C>A candidate mutations compared with other candidate mutations (null model), with the cut line. THCA data (n=402 tumors), Lego plots.]

Lee Lichtenstein, Chip Stewart, Trevor Pugh, Dan-avi Landau, Tim Fennel, George Grant

Filter also works on artifacts caused by FFPE oxidation

[Figure: F_strand_bias for C>T & G>A artifact mutations compared with mutations other than C>T & G>A, with the result AFTER filtering.]

Somatic Variant Discovery Workflow

Indels coming soon! (M2)
+ some post-processing to rescue TiN variants and eliminate artifacts
talks

Further reading

Documentation coming soon to the GATK website
In the meantime, see http://www.broadinstitute.org/cancer/cga/Home