
KWARA STATE UNIVERSITY, MALETE

ADVANCE TOPICS IN COMPUTER SCIENCE (CSC814)

TERM PAPER

ON

APPLICATION OF RESIDUE NUMBER SYSTEM

TO BIOINFORMATICS

BY

AFEEZ ADESHINA OKE

18/27/MCS015

LECTURER IN CHARGE: Dr. (Mrs.) R.O. Babatunde

Table of Contents

1. Title Page
2. Table of Contents
3. Abstract
4. Introduction
5. Literature Review
6. Residue Number System
6.1. Computations in RNS
6.2. Data Conversion in RNS
6.3. Smith Waterman Algorithm
7. Proposed Approach using RNS-SWA
8. Conclusion
9. References

ABSTRACT
In this paper, we focus on speeding up the Smith-Waterman algorithm for local
sequence alignment by combining parallelization with a Residue Number System
(RNS) implementation of the algorithm. We propose an approach for running the
Smith-Waterman algorithm on GPU platforms. The approach aims to exploit the
computational resources made available by parallelization and to use the fast
arithmetic operations of RNS to further improve the overall performance of the
algorithm.
1. INTRODUCTION

Local sequence alignment is an important task in bioinformatics. It encompasses
the analysis of two or more sequences to identify evolutionary associations
between them. The Smith-Waterman algorithm (SWA) is one of the most widely
used tools for searching biological sequence databases due to its high sensitivity.
Unfortunately, the Smith-Waterman algorithm is computationally demanding,
a problem that is compounded by the exponential growth of sequence databases
[1] [2]. Because of this computational cost, heuristic alternatives such as
FASTA [3] and BLAST [4] have been used to improve the speed of execution at
the expense of accuracy.
Biological sequence scanning enables the exploration of the biological
relationship between two organisms by finding the extent of similarity between
two sequences [5]. The SWA is based on local alignment, which makes it
suitable for finding regions of similarity. The analysis of Deoxyribonucleic Acid
(DNA) alignment provides the following benefits:
• Bio-archaeology and evolution (allows the tracing of evolutionary trends
by finding similarities between any two sequences) [2] [6].
• Molecular medicine (provides a basis for studying the relationship between
diseases and inheritance) [2][6].
• DNA forensics (proof of identity, identification of crime or catastrophe
victims, establishment of paternity) [6].
• Agriculture or bio-processing (drought- and disease-resistant crops, bio-
pesticides, edible vaccines to integrate into agricultural products) [6].

One possible way of speeding up, and increasing the sensitivity of, sequence
alignment is to optimize the SWA. Various approaches have been introduced to
accelerate the algorithm in both software and hardware implementations [7][8].
Several implementations of the SWA on parallel GPUs have shown a significant
improvement in the efficiency of the algorithm. GPUs are no longer designed
and used exclusively for displaying graphics [6]; they are increasingly used
to develop and run applications that involve large amounts of computation.
A GPU has the advantage over a CPU of having hundreds or thousands of cores
dedicated to performing simultaneous operations [6][9][10]. GPU architectures
are geared toward parallel computation, while CPUs are optimized for high
performance on sequential code [11].
In this paper we propose an approach that is expected to significantly
improve the implementation of the SW algorithm on GPUs by using the
inherently fast arithmetic of the Residue Number System.
The rest of the paper is divided into four sections. The first gives a brief
review of related literature on the use of GPUs in sequence analysis. The
second describes the Residue Number System (RNS). The third discusses the
new approach using the Smith-Waterman algorithm. Finally, the fourth
concludes the paper.

2. LITERATURE REVIEW
The aim of this review is to provide an overview of recent GPU-based
sequence analysis methods using the Smith-Waterman algorithm, emphasizing
their advantages (i.e. computational speed-up) as well as their drawbacks
(e.g. the necessity of algorithm redesign and tailored implementation to
fully leverage the GPU architecture and its peak performance).
Munekawa et al. [12] implemented the SWA using the Compute Unified
Device Architecture (CUDA). Their method uses shared memory efficiently to
reduce the amount of data transferred between the GPU and off-chip memory.
The performance of their implementation is 3 times faster than previous
CUDA implementations.
Liu et al. [13] present a CUDA-based implementation of the SW
algorithm. The implementation takes advantage of both CPU and GPU SIMD
instructions and executes concurrently on CPUs and GPUs. They presented
CUDASW++ 3.0, which improves on CUDASW++ 2.0. It provides a peak
performance of 119.0 and 185.6 GCUPS (giga cell updates per second) on a
single GPU and a dual GPU, respectively.
Manavski et al. [14] also presented a CUDA-based implementation,
whose performance reached a peak of 3.6 GCUPS. They implemented the
algorithm partly using local memory, which is very slow on the GPU card.
The algorithm can be further improved to make better use of the resources
available on the GPU.
Ligowski and Rudnicki [15] proposed an implementation of the Smith-
Waterman algorithm that uses global memory and shared memory together
with more efficient code. The program is processed concurrently on the CPU
and GPU. It has a peak performance of 14.5 GCUPS on a dual-core Nvidia
9800 GX2 card.

In this paper, we propose a parallel implementation of the Smith-Waterman
algorithm on a GPU using the fast arithmetic operations of the Residue
Number System.

3. RESIDUE NUMBER SYSTEM


The Residue Number System is a non-weighted number system: it does not
propagate carries between digits in arithmetic operations. RNS supports
parallel, carry-free addition and borrow-free subtraction, as well as
digit-by-digit multiplication without partial products. These properties
allow the computation for every digit to proceed independently, which makes
RNS suitable for parallel computation [16] [2][17].
Because of the properties mentioned above, RNS is suitable for solving
problems in digital signal processing [2][16], Fast Fourier Transforms [2],
digital filtering, image processing, digital communications [18], ad hoc
networks, information storage and retrieval, error detection and correction,
fault-tolerant systems [19], cryptography and bioinformatics [2][17].
The speed of an RNS is largely determined by the choice of moduli set.
Several factors must be considered when choosing the moduli set for a
given application. According to Deryabin et al. [16], it is necessary to
consider the following:
• Redundancy: different moduli sets have different degrees of
redundancy, and the redundancy should be kept low.
• Balance: an unbalanced moduli set could result in delays and
waiting time for some threads.
• Cost: the moduli set with the least computation cost should be chosen.

The parallel properties of RNS present a new challenge: developing
algorithms that fully utilize the parallel structure of modern computers
with GPUs and FPGAs in order to accelerate computations.

3.1 COMPUTATIONS IN RNS


Addition, subtraction and multiplication are simple operations in RNS. An
RNS is defined by a set of pairwise relatively prime moduli
$\{m_1, m_2, \ldots, m_n\}$ such that $\gcd(m_i, m_j) = 1$ for $i \neq j$,
where $\gcd$ denotes the greatest common divisor of $m_i$ and $m_j$, and
$M = m_1 \times m_2 \times \cdots \times m_n$ is the dynamic range. An
integer $X$ is represented by its residues $X = (x_1, x_2, \ldots, x_n)$,
where $x_i = |X|_{m_i}$ and $0 \le x_i < m_i$. This representation is unique
for any integer $X$ in the range $[0, M-1]$.
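
The representation and the digit-wise operations can be illustrated with a
minimal Python sketch. The moduli (7, 8, 9) and the helper names `to_rns`,
`rns_add` and `rns_mul` are illustrative assumptions and are not taken from
the cited works.

```python
from math import gcd, prod

def to_rns(x, moduli):
    """Return the residues of x with respect to each modulus."""
    return tuple(x % m for m in moduli)

def rns_add(a, b, moduli):
    """Add two RNS numbers channel by channel; no carry crosses channels."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))

def rns_mul(a, b, moduli):
    """Multiply two RNS numbers channel by channel."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))

# Illustrative (assumed) moduli set; it must be pairwise relatively prime.
moduli = (7, 8, 9)
assert all(gcd(p, q) == 1 for i, p in enumerate(moduli) for q in moduli[i + 1:])
M = prod(moduli)  # dynamic range: results are exact only within [0, M - 1]

x, y = 25, 13
rx, ry = to_rns(x, moduli), to_rns(y, moduli)
print(rx, ry)                    # (4, 1, 7) (6, 5, 4)
print(rns_add(rx, ry, moduli))   # (3, 6, 2), the residues of 38
print(rns_mul(rx, ry, moduli))   # (3, 5, 1), the residues of 325
```

Each channel touches only its own modulus, which is why the channels can be
assigned to independent threads.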
3.2 DATA CONVERSION IN RESIDUE NUMBER SYSTEM
Data must be converted from binary/decimal to the residue number system
before RNS operations can be performed on them, and the results must be
converted back afterwards. The conversions are classified as forward
conversion (from a Weighted Number System to the Residue Number System) and
reverse conversion (from the Residue Number System to a Weighted Number
System).

3.2.1 FORWARD CONVERSION


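This involves the conversion from binary/decimal to a Residue Number
System. For an $N$-bit integer $X$ with bits $b_j$, the residue with respect
to a modulus $m$ is given by
$$|X|_m = \left| \sum_{j=0}^{N-1} b_j 2^j \right|_m$$
for any non-negative integer $X$ in the range $0 \le X \le 2^N - 1$.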
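
As a hedged illustration of the forward conversion formula, the sketch below
accumulates the residue bit by bit, reducing each power of two modulo m. The
function name `forward_convert` is an assumption made for this example.

```python
def forward_convert(x, m):
    """Residue of a non-negative integer x modulo m, built as |sum b_j * 2^j|_m."""
    if x < 0:
        raise ValueError("x must be non-negative")
    residue = 0
    power = 1 % m              # |2^0|_m
    while x:
        if x & 1:              # bit b_j is set
            residue = (residue + power) % m
        power = (power * 2) % m   # advance to |2^(j+1)|_m
        x >>= 1
    return residue

print([forward_convert(25, m) for m in (7, 8, 9)])  # [4, 1, 7]
```

In hardware the values |2^j|_m are usually precomputed; here they are
generated on the fly to keep the sketch short.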

3.2.2 REVERSE CONVERSION
Reverse conversion is the conversion from a residue number system to a
weighted number system. The success of the reverse conversion depends
directly on the forward conversion [2].
The Chinese Remainder Theorem (CRT) is one of the approaches to reverse
conversion. The CRT uses the formula
$$X = \left| \sum_{i=1}^{n} M_i \left| x_i M_i^{-1} \right|_{m_i} \right|_M$$
where $X$ is represented as $\{x_1, x_2, \ldots, x_n\}$ with moduli set
$\{m_1, m_2, \ldots, m_n\}$, $M = m_1 m_2 \cdots m_n$, $M_i = M / m_i$, and
$M_i^{-1}$ is the multiplicative inverse of $M_i$ modulo $m_i$.
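
The CRT formula can be sketched in Python as follows. The function name
`crt_reverse` and the sample moduli (7, 8, 9) are assumptions made for
illustration; the modular inverse relies on the three-argument `pow` with a
negative exponent, available from Python 3.8.

```python
from math import prod

def crt_reverse(residues, moduli):
    """Recover X in [0, M - 1] from its residues using the CRT."""
    M = prod(moduli)
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        inv = pow(M_i, -1, m_i)          # multiplicative inverse of M_i modulo m_i
        total += M_i * ((x_i * inv) % m_i)
    return total % M                     # final reduction modulo M

print(crt_reverse((4, 1, 7), (7, 8, 9)))  # 25
```

The final reduction modulo the large value M is the costly step of the CRT,
which is one reason alternative methods are also used.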
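Another method is Mixed Radix Conversion (MRC). Let the moduli set
$(m_1, m_2, m_3, \ldots, m_n)$ have the corresponding residues
$(x_1, x_2, x_3, \ldots, x_n)$, and let $(a_1, a_2, a_3, \ldots, a_n)$ be the
mixed radix digits. The decimal equivalent of the residues is then
$$X = a_1 + a_2 m_1 + a_3 m_1 m_2 + \cdots + a_n m_1 m_2 \cdots m_{n-1}$$
where the mixed radix digits are given by
$$a_1 = x_1$$
$$a_2 = \left| (x_2 - a_1) \, |m_1^{-1}|_{m_2} \right|_{m_2}$$
$$a_3 = \left| \left( (x_3 - a_1) \, |m_1^{-1}|_{m_3} - a_2 \right) |m_2^{-1}|_{m_3} \right|_{m_3}$$
$$a_k = \left| \left( \cdots \left( (x_k - a_1) \, |m_1^{-1}|_{m_k} - a_2 \right) |m_2^{-1}|_{m_k} - \cdots - a_{k-1} \right) |m_{k-1}^{-1}|_{m_k} \right|_{m_k}$$
and $|m_j^{-1}|_{m_k}$ denotes the multiplicative inverse of $m_j$ modulo $m_k$.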
3.3 SMITH WATERMAN ALGORITHM


The Smith-Waterman algorithm (SWA) is designed to find the optimal
(highest-scoring) local alignment between two sequences. It was proposed by
Smith and Waterman in 1981. Given two sequences $A = a_1 a_2 \ldots a_n$ and
$B = b_1 b_2 \ldots b_m$, the SWA computes an alignment matrix $M$ from which
the optimal local alignment between the two sequences is obtained. Each cell
in the alignment matrix depends on its neighbors and on the similarity
between the current symbol of sequence A and the current symbol of
sequence B.
Let
• $M(i, j)$ represent the similarity score of the two sequences A and B,
terminating at positions i and j;
• $S(a_i, b_j)$ be the score of comparing symbol $a_i$ of sequence A with
symbol $b_j$ of sequence B;
• $d$ be the gap penalty.
The algorithm is given by:
$$M(i,j) = \max \begin{cases} 0 \\ M(i-1,j-1) + S(a_i, b_j) \\ M(i-1,j) + d \\ M(i,j-1) + d \end{cases}$$
where $M(i,0) = 0$ and $M(0,j) = 0$.

The recurrence above yields the optimal local alignment of sequence A and
sequence B; the largest entry of M is the score of that alignment.
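
The recurrence can be sketched as a plain, sequential matrix fill. The scoring
values (match = 2, mismatch = -1, d = -1) and the function name
`smith_waterman_matrix` are assumptions chosen for illustration; the sketch
computes only the scores and omits the traceback step.

```python
def smith_waterman_matrix(a, b, match=2, mismatch=-1, d=-1):
    """Fill the alignment matrix M and return it with the best local score."""
    n, m = len(a), len(b)
    # Row 0 and column 0 stay zero: M(i, 0) = M(0, j) = 0.
    M = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch   # S(a_i, b_j)
            M[i][j] = max(0,
                          M[i - 1][j - 1] + s,   # align a_i with b_j
                          M[i - 1][j] + d,       # gap in B
                          M[i][j - 1] + d)       # gap in A
            best = max(best, M[i][j])
    return M, best

M, best = smith_waterman_matrix("TACGT", "ACG")
print(best)  # 6
```

Cells on the same anti-diagonal depend only on earlier anti-diagonals, which
is the property GPU implementations typically exploit.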

4. PROPOSED APPROACH USING RNS-SWA ARCHITECTURE ON GPU
One of the most important characteristics of RNS is its ability to
divide a large integer into smaller integers, which enables parallel and
very high-speed computation.

The use of parallel processors has greatly improved the performance
of sequence analysis [8]. Parallelizing the implementation of the Smith-
Waterman algorithm while leveraging the inherent advantages of the Residue
Number System is expected to significantly increase its overall performance
on a GPU. RNS methods take advantage of RNS arithmetic and improve the
conversion speed, depending on the moduli set chosen [2]. Nobile et al. [8]
highlighted that GPUs provide a significant speed-up by reducing running
time, and that additional optimization may still be possible.
Using the moduli set $\{2^n - 1, 2^n, 2^n + 1\}$, the RNS-SWA from [2] is
used. The architecture is shown in Fig 1 below.

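[Fig 1 (block diagram): the inputs M(i-1,j), M(i,j-1), M(i-1,j-1), d and
S(a,b) enter the Binary to RNS Converter (Converter 1), which feeds RNS
Processor 1 (mod 2^n - 1), RNS Processor 2 (mod 2^n) and RNS Processor 3
(mod 2^n + 1) in parallel; their outputs enter the RNS to Binary Converter
(Converter 2), which outputs M.]

Fig 1: Architecture of RNS-SWA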
On a GPU, the realization of the hardware is based on the following:
• Converter 1 accepts the inputs M(i-1,j), M(i,j-1), M(i-1,j-1), d and
S(a_i, b_j), converts them to RNS and sends them to the RNS processors.
• The RNS processors work in parallel using the cores of the GPU and send
their results to Converter 2. The absence of carry propagation in RNS
enables high-speed, low-power operation.
• Converter 2 then converts the result back to a binary/decimal number,
M(i,j); this is also done in parallel.
• Independent parallelism in the RNS algorithms of Converter 1 and
Converter 2 is maximized to enable easy partitioning into threads and
blocks.
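
The data flow described above can be sketched for a single cell update in
plain Python. This is a sequential sketch under stated assumptions, not the
GPU implementation. It assumes n = 5, so the moduli are (31, 32, 33). It
assumes signed values are encoded in the upper half of the dynamic range. It
also assumes the maximum is taken after reverse conversion, since the source
does not specify where the comparison happens. All function names are
hypothetical.

```python
from math import prod

def moduli_set(n):
    """Moduli set {2^n - 1, 2^n, 2^n + 1}; its members are pairwise relatively prime."""
    return (2**n - 1, 2**n, 2**n + 1)

def to_rns(v, moduli):
    """Converter 1: Python's % is non-negative, so a negative v maps to M - |v|."""
    return tuple(v % m for m in moduli)

def from_rns_signed(residues, moduli):
    """Converter 2: CRT reverse conversion, then decode the upper half as negative."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        M_i = M // m
        x += M_i * ((r * pow(M_i, -1, m)) % m)   # pow(..., -1, m) needs Python 3.8+
    x %= M
    return x - M if x >= M // 2 else x

def channel_add(a, b, moduli):
    """RNS processors: one independent addition per modulus."""
    return tuple((p + q) % m for p, q, m in zip(a, b, moduli))

def rns_swa_cell(diag, up, left, s, d, moduli):
    """Compute M(i,j) from M(i-1,j-1), M(i-1,j), M(i,j-1), S(a_i,b_j) and d."""
    r_diag, r_up, r_left, r_s, r_d = (to_rns(v, moduli) for v in (diag, up, left, s, d))
    candidates = (channel_add(r_diag, r_s, moduli),
                  channel_add(r_up, r_d, moduli),
                  channel_add(r_left, r_d, moduli))
    # Assumption: the comparison is done in binary after reverse conversion.
    return max(0, *(from_rns_signed(c, moduli) for c in candidates))

moduli = moduli_set(5)
print(rns_swa_cell(4, 2, 1, -1, -1, moduli))   # 3
```

Scores must stay within half the dynamic range for the signed decoding to be
valid, so n would have to be chosen according to the longest expected
alignment score.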

CONCLUSIONS
There has been a great deal of research on, and many implementations of, the
Smith-Waterman algorithm for sequence analysis on GPUs, but none of them has
explored the fast arithmetic properties of the Residue Number System (RNS).
In future research we intend to implement the RNS-SWA architecture using
CUDA on a GPU and compare the speed-up with previous implementations of the
SWA on GPUs.

REFERENCES
[1] F. H. Humed, R. Jidin, R. Othman, M. G. Goorbandi, and S. Noraima,
“Implementing Smith Waterman’s Similarity Matrix Computations on
Reconfigurable Logic Hardware,” no. November, pp. 1–5, 2008.

[2] H. Kehinde and K. Alagbe, “Residue Number System: An Important
Application in Bioinformatics,” Int. J. Comput. Appl., vol. 179, no. 10, pp.
28–33, 2018.

[3] D. J. Lipman and W. R. Pearson, “Rapid and Sensitive Protein
Similarity Searches,” Science, vol. 227, no. March, pp. 1435–1441, 1985.

[4] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman,
“Basic local alignment search tool,” J. Mol. Biol., vol. 215, no. 3, pp. 403–
410, 1990.

[5] K. Rajalakshmi and R. Nivedita, “VLSI implementation of Smith–
Waterman algorithm for biological sequence scanning,” Lect. Notes
Electr. Eng., vol. 453, pp. 231–245, 2018.

[6] N. M. Trindade, M. Instituto, and S. Técnico, “Efficient GPU
Implementation of Bioinformatics Applications,” no. November, 2014.

[7] N. U. R. Farah et al., “Software Implementation of Smith-Waterman
Algorithm in FPGA,” pp. 173–178.

[8] M. S. Nobile, P. Cazzaniga, A. Tangherloni, and D. Besozzi, “Graphics
processing units in bioinformatics, computational biology and systems
biology,” Brief. Bioinform., vol. 18, no. 5, pp. 870–885, 2017.

[9] A. Khalafallah and O. Mahmoud, “Optimizing Smith-Waterman Algorithm
on Graphics processing unit,” no. Icctd, pp. 650–654, 2010.

[10] D. Razmyslovich, G. Marcus, M. Gipp, M. Zapatka, and A. Szillus,
“Implementation of Smith-Waterman algorithm in OpenCL for GPUs,”
Proc. 9th Int. Work. Parallel Distrib. Methods Verif. PDMC 2010 - Jt.
with 2nd Int. Work. High Perform. Comput. Syst. Biol. HiBi 2010, pp. 48–
56, 2010.

[11] J. D. Owens et al., “A Survey of General-Purpose Computation on
Graphics Hardware,” no. August, pp. 21–51, 2005.

[12] Y. Munekawa, F. Ino, and K. Hagihara, “Design and Implementation of
the Smith-Waterman Algorithm on the CUDA-Compatible GPU,” no. 2,
2008.

[13] Y. Liu, A. Wirawan, and B. Schmidt, “CUDASW++ 3.0: accelerating
Smith-Waterman protein database search by coupling CPU and GPU SIMD
instructions,” BMC Bioinformatics, vol. 14, no. 1, p. 117, 2013.

[14] S. A. Manavski and G. Valle, “CUDA compatible GPU cards as efficient
hardware accelerators for Smith-Waterman sequence alignment,” BMC
Bioinformatics, vol. 9, no. SUPPL. 2, pp. 1–9, 2008.

[15] L. Ligowski and W. Rudnicki, “An efficient implementation of Smith
Waterman algorithm on GPU using CUDA, for massively parallel
scanning of sequence databases,” IPDPS 2009 - Proc. 2009 IEEE Int.
Parallel Distrib. Process. Symp., no. May 2009, 2009.

[16] M. Deryabin, N. Chervyakov, and A. Tchernykh, “High Performance
Parallel Computing in Residue Number System,” vol. 9, no. February,
pp. 62–67, 2018.

[17] K. A. Gbolagade and S. D. Cotofana, “An O(n) residue number system to
mixed radix conversion technique,” Proc. - IEEE Int. Symp. Circuits Syst.,
vol. 1, no. 1, pp. 521–524, 2009.

[18] J. Ramírez, A. García, U. Meyer-Baese, and A. Lloris, “Fast RNS FPL-
based Communications Receiver Design and Implementation,” no. 1, pp.
472–481, 2002.

[19] A. Azizifard, M. Qermezkon, T. Postizadeh, and H. Barati, “Data
Steganography on VoIP through Combination of Residue Number System
and DNA Sequences,” vol. 5, no. 2, pp. 7–22, 2014.
