Tools For Intensive Numerical Computation

GPU – Graphics Processing Unit

Jogo, Pedro; Oliveira, Ana; Silva, Rafaela;

Departamento de Engenharia Electrotécnica e Computadores/Instituto Superior Técnico

Introduction
In recent decades the incidence of some diseases in which diagnosis plays a
crucial role, such as cancer, has been increasing at an alarming rate. As a
consequence, the development of quick and accurate imaging techniques is not
only in the interest of manufacturers but also a need of society. Advances have
been made in this field, and using a GPU to assist image processing has been
shown to make it considerably faster and may also make it more accurate.
Figure 1 – GeForce 6600 GT (NV43) GPU
Figure 2 – Incidence of cancer in the UK
How does a GPU work?
Figure 3 – Compared to the CPU, the GPU devotes more transistors to data processing.
The GPU is especially well suited to problems that can be expressed as
data-parallel computations (the same program is executed on many data elements
in parallel) with high arithmetic intensity (the ratio of arithmetic operations
to memory operations).
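The two notions above can be illustrated with a minimal Python sketch. It is not taken from the poster: the function names and the SAXPY-style operation (a * x + y) are assumptions chosen for illustration. Every element is handled by the same small function, which is the data-parallel pattern, and the arithmetic intensity is the ratio of arithmetic operations to memory operations per element.

```python
def saxpy_element(a, x, y):
    """The 'same program' applied to every data element: one multiply, one add."""
    return a * x + y


def arithmetic_intensity(arithmetic_ops, memory_ops):
    """Ratio of arithmetic operations to memory operations."""
    return arithmetic_ops / memory_ops


xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]

# Each element is independent of the others, so on a GPU every element
# could be handled by its own thread; here a plain loop stands in for that.
result = [saxpy_element(2.0, x, y) for x, y in zip(xs, ys)]

# Per element: 2 arithmetic operations (multiply, add) and
# 3 memory operations (load x, load y, store the result).
saxpy_intensity = arithmetic_intensity(2, 3)
```

An intensity below 1, as in this example, suggests a computation limited by memory traffic rather than arithmetic. Workloads that perform many operations per loaded value are the ones that benefit most from the GPU.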
Figure 4 – Serial code executes on the host while parallel code executes on the device.
Figure 5 – Floating-point operations per second for the CPU and GPU
Figure 6 – General structure of a GPU
Literature Cited
Ansorge, Richard E.: AIRWC: Accelerated Image Registration With CUDA. Cavendish Laboratory, University of Cambridge, August 2008.
Blas, Andrea Di; Lakdewey, Tim: Data Monster. September 2009.
Luebke, David; Humphreys, Greg: How GPUs Work. NVIDIA Research and University of Virginia, 2007.
NVIDIA: CUDA Programming Guide, Version 2.3.1. August 2009.
Cancer Research UK: Trends In UK Cancer Incidence Statistics. June 2009.
Materials and Method
To accelerate image registration, which is normally done on a CPU, the same
task was implemented on a GPU. Image registration consists of aligning data
from different measurements so that they can be compared or integrated. The
acceleration was implemented with CUDA (Compute Unified Device Architecture).
The 3D data sets are “sliced” and the resulting images are then registered.
The algorithm was ported to run on the GPU.
The algorithm seeks to find an affine transformation that maps a “source”
volume onto a “target” volume.
Figure 7 – NVIDIA 9600 GX
Figure 8 – “Brain extraction” from an MRI from two different subjects (A and B)
Figure 9 – Parameter affine registration of A to B
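As an illustration of what such a transformation looks like, the sketch below applies a 12-parameter affine transformation (a 3x3 matrix plus a 3-component translation) to a single 3D point. The parameter layout, the function name and the example values are assumptions made for this sketch. The poster does not specify how the parameters are stored.

```python
def apply_affine(params, point):
    """Map a 3D point with a 12-parameter affine transformation.

    params is assumed to hold three rows of (a, b, c, t): the rows of a
    3x3 matrix, each followed by its translation component.
    """
    x, y, z = point
    return tuple(
        params[4 * row] * x
        + params[4 * row + 1] * y
        + params[4 * row + 2] * z
        + params[4 * row + 3]
        for row in range(3)
    )


# The identity transformation leaves every point where it is.
IDENTITY = (1.0, 0.0, 0.0, 0.0,
            0.0, 1.0, 0.0, 0.0,
            0.0, 0.0, 1.0, 0.0)

# Hypothetical example: stretch x by a factor of 2 and shift z by 5.
example = (2.0, 0.0, 0.0, 0.0,
           0.0, 1.0, 0.0, 0.0,
           0.0, 0.0, 1.0, 5.0)

moved = apply_affine(example, (1.0, 2.0, 3.0))
```

Registration then amounts to searching for the 12 values that make the transformed source volume agree best with the target volume.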
The CUDA kernel (a function that is launched from the host and executed on the
GPU by many threads) runs asynchronously on the GPU and executes many
data-parallel threads simultaneously. Here each thread processes all the z
values for a single x-y position in the source volume. The target dataset is
stored in device texture memory and the affine transformation is stored in
device constant memory.
The host code is responsible for loading the source and target volumes to the
device, which is needed only once, at the beginning.
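The thread layout described above can be mimicked in plain Python. In this minimal sketch the function `thread_body` plays the role of one GPU thread, and a sequential loop over every x-y position stands in for the parallel kernel launch. Several details are assumptions and do not come from the poster: the sum-of-squared-differences cost, the nearest-neighbour sampling, the parameter layout, and all function names.

```python
import math


def thread_body(x, y, source, target, affine):
    """Work of one hypothetical thread: every z value at a fixed (x, y).

    Volumes are nested lists indexed as volume[z][y][x]. The affine
    parameters are three rows of (a, b, c, t).
    """
    nz, ny, nx = len(source), len(source[0]), len(source[0][0])
    cost = 0.0
    for z in range(nz):
        # Map the source voxel position into the target volume.
        tx = affine[0] * x + affine[1] * y + affine[2] * z + affine[3]
        ty = affine[4] * x + affine[5] * y + affine[6] * z + affine[7]
        tz = affine[8] * x + affine[9] * y + affine[10] * z + affine[11]
        # Nearest-neighbour lookup (assumption made for this sketch).
        ix = math.floor(tx + 0.5)
        iy = math.floor(ty + 0.5)
        iz = math.floor(tz + 0.5)
        if 0 <= ix < nx and 0 <= iy < ny and 0 <= iz < nz:
            diff = source[z][y][x] - target[iz][iy][ix]
            cost += diff * diff  # assumed cost: sum of squared differences
    return cost


def launch(source, target, affine):
    """Sequential stand-in for the kernel launch: one call per (x, y)."""
    ny, nx = len(source[0]), len(source[0][0])
    return sum(
        thread_body(x, y, source, target, affine)
        for y in range(ny)
        for x in range(nx)
    )


# Tiny 2x2x2 test volume whose voxel value encodes its position.
volume = [[[4 * z + 2 * y + x for x in range(2)] for y in range(2)]
          for z in range(2)]
identity = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
shift_x = [1, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0]

cost_identity = launch(volume, volume, identity)
cost_shifted = launch(volume, volume, shift_x)
```

Because the calls to `thread_body` do not depend on one another, they can all run at the same time on a GPU. The target volume and the affine parameters are only read, which is consistent with keeping them in texture and constant memory.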
Figure 10 – Illustration of the kernel function and its interaction with diverse hardware
Figure 11 – Slice from Source Image
Figure 12 – Slice from Target Image
Figure 13 – Slice from Source Image after registration on GPU
Figure 14 – Slice from Source Image after registration on CPU
Results
The results are summarized in the following table:

Experiment                          CPU            GPU             Speed Up (reduction in run time)
12 Parameter Affine Registration    8.5 minutes    6 seconds       98%
6 Parameter Registration            270 seconds    2.39 seconds    99%
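The percentages in the table correspond to the reduction in run time rather than to a speed-up factor. The short sketch below recomputes both quantities from the reported timings. The function and variable names are illustrative only.

```python
def time_reduction_percent(cpu_seconds, gpu_seconds):
    """Share of the CPU run time that is saved by running on the GPU."""
    return 100.0 * (1.0 - gpu_seconds / cpu_seconds)


def speedup_factor(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run is than the CPU run."""
    return cpu_seconds / gpu_seconds


# Timings from the table, converted to seconds (8.5 minutes = 510 s).
twelve_param = (8.5 * 60.0, 6.0)
six_param = (270.0, 2.39)

twelve_param_reduction = time_reduction_percent(*twelve_param)
twelve_param_factor = speedup_factor(*twelve_param)
six_param_reduction = time_reduction_percent(*six_param)
six_param_factor = speedup_factor(*six_param)
```

With these timings the 12-parameter registration runs 85 times faster on the GPU (a reduction of about 98.8%), and the 6-parameter registration about 113 times faster (a reduction of about 99.1%).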
Conclusion
If this technique were applied to every diagnostic method in which image
registration plays a significant role, those methods could become considerably
faster, and possibly more accurate. This could, in turn, lead to earlier
diagnosis and help slow the rise of diseases in which diagnosis has a crucial
role.
Acknowledgements
We would like to thank Rodrigo Ventura for all the information provided.
Further Information
For further information, please contact pedrojogo@ist.utl.pt,
ana.r.de.oliveira@ist.utl.pt or rafaela.sepulveda@ist.utl.pt. An online PDF
version is available at http://www.scribd.com/doc/24094085/.