Beruflich Dokumente
Kultur Dokumente
Outline
I
Parallel computing
I
I
Parallel computing
GPUs
HMM Parallelisation
I
I
I
I
I
Human genomes
Impact on Statistics
I
I
I
I
such that the future St+1 , ..., ST does not depend on the past
S1 , ..., St1 , given the present St
s1
s2
s3
sT 1
sT
y1
y2
y3
yT 1
yT
T
Y
t=2
the Viterbi algorithm finds the most probable (MAP) state sequence,
maximising s = arg maxs Pr (s1..T |y1..T )
y2
y3
s1=1
s2=1
s3=1
s1=2
s2=2
s3=2
s1=3
s2=3
s3=3
Parallel computing
Modern transistor manufacturing has recently reached a power ceiling
that limits the computational power of individual processing units.
The future of processing power increase relies on the ability to use
multiple computational units in parallel.
Current technology permits parallel computation using
I
GPUs are low cost, low energy, highly parallel, local devices designed for
Single Instruction, Multiple Data (SIMD) processing. They have a highly
specialized architecture (with a bespoke language - CUDA) that allows
for great performance in certain situations.
With linear algorithm flow, reasonable suitable space requirements and
efficient memory use, implementations of suitable algorithms may achieve
as high as 30 acceleration compared to multi processors and even more
than 100 speedup compared to sequential implementations (Lee et al
2010).
the standard HMM algorithms all repeat the same operations for
each state st . Moreover, the calculation of the (st ) (or beta, etc.)
values are independent and hence the calculations are suitable for
parallelisation on GPUs.
y2
y2y2
y3
y3y3
s1=1
s2=1
s3=1
s2=2
s2=2
s3=2
s3=3
s2=3
s3=3
the starting (e.g. (skTb )) values are not available for each block at
the beginning of each recursion
hence run the algorithm N-times, from each possible state, each
time conditioning on a different starting state. Naturally these sets
of N conditional runs may be also run in parallel.
when the computation is done for all blocks, they can be merged
sequentially updating each conditional run with its corresponding
starting value
y1
y2
y3
y4
y5
y6
y7
y8
y9
s1=1
s2=1
s3=1
s4=1
s5=1
s6=1
s7=1
s8=1
s9=1
s1=2
s2=2
s3=2
s4=2
s5=2
s6=2
s7=2
s8=2
s9=2
y1
y2
y3
y4
y5
y6
y7
y8
y9
s1=1
s2=1
s3=1
s4=1
s5=1
s6=1
s7=1
s8=1
s9=1
s1=2
s2=2
s3=2
s4=2
s5=2
s6=2
s7=2
s8=2
s9=2
y1
y2
y3
y4
y5
y6
y7
y8
y9
s1=1
s2=1
s3=1
s4=1
s5=1
s6=1
s7=1
s8=1
s9=1
s1=2
s2=2
s3=2
s4=2
s5=2
s6=2
s7=2
s8=2
s9=2
y1
y2
y3
y4
y5
y6
y7
y8
y9
s1=1
s2=1
s3=1
s4=1
s5=1
s6=1
s7=1
s8=1
s9=1
s1=2
s2=2
s3=2
s4=2
s5=2
s6=2
s7=2
s8=2
s9=2
y1
y2
y3
y4
y5
y6
y7
y8
y9
s1=1
s2=1
s3=1
s4=1
s5=1
s6=1
s7=1
s8=1
s9=1
s1=2
s2=2
s3=2
s4=2
s5=2
s6=2
s7=2
s8=2
s9=2
y1
y2
y3
y4
y5
y6
y7
y8
y9
s1=1
s2=1
s3=1
s4=1
s5=1
s6=1
s7=1
s8=1
s9=1
s1=2
s2=2
s3=2
s4=2
s5=2
s6=2
s7=2
s8=2
s9=2
Speed Up
Just for comparison, state of the art GPUs have 2688 cores and the
supercomputer Titan (National Oak Ridge) has 18,688 Nvidia Tesla
K20X GPUs with a total of 50million cores. Emerald has 372 Tesla
cards. Present datasets in population genetics typically have K and
N in the thousands and T in the millions.
Example
GPU-LSM
Indeed one would expect that across the groups the posterior probability is very
low. Hence
weHence
would we
expect
to see
a block
likeastructure
the pairwise
posterior
low.
would
expect
to see
block likeinstructure
in the
pairwise posterior
Adaptive Mutation
Neutral
mutationmutation
and very low
probability
everywhereeverywhere
else.
and
very low probability
else.
Neutrally Neutrally
Evolving Population
Naturally Selected
Population
Evolving Population
Naturally
Selected Population
Phylogenetic Tree
LSM Posterior
(a)
(a)
(b)
(b)
Figure
2.6:
Comparison
between Kingmans
Figure 2.6:
Comparison
between Kingmans
n-Coalescentn-Coalescent
and Li and and Li and
Stephens Distributional
Representation
Stephens Distributional
Representation
Thus it
would
seem
to detect
signals ofweselection
we should
Thus it would
seem
trivial;
to trivial;
detect signals
of selection
should simply
run simply run
Li and Stephens
and find
regions
that
look
like There
the above.
There are two
Li and Stephens
HMM andHMM
find regions
that
look like
the
above.
are two
this approach,
we could
observe
these by
structures
by chance and
problems problems
with this with
approach,
we could observe
these
structures
chance and
some demographic
eects
the same coalescent
two theretwo
are there
some are
demographic
eects which
give which
rise togive
the rise
sametocoalescent
structure.structure.
be definition
results of
in chromosome
regions of chromosome
that
are more
Selection Selection
be definition
results in regions
that are more
similar
than similar than
be expected
by chance.
Theaspect
chance
aspect
of the
previous statement
would bewould
expected
by chance.
The chance
of the
previous
statement
depends
on the recombination
and the
by LD
virtue
the LD There
structure.
There
may be a
depends on
the recombination
and by virtue
structure.
may be
a
region
hasselection
a strongpressure,
selectionbut
pressure,
the recombination
rate is
region that
has athat
strong
becausebut
thebecause
recombination
rate is
tMRCA
t24 = t14
t12
t34
1
Applications
Conclusions
Medical genetics and genomics will produce vast data sets over the
next few years
We need statistical methods that can scale to handle
I
I
I
dimension
heterogeneity
model misspecification