RNA methylation and diseases: experimental results, databases, Web servers and computational models

Article in Briefings in Bioinformatics · November 2017

DOI: 10.1093/bib/bbx142

Briefings in Bioinformatics, 2017, 1–22

doi: 10.1093/bib/bbx142
Paper

RNA methylation and diseases: experimental results, databases, Web servers and computational models
Xing Chen, Ya-Zhou Sun, Hui Liu, Lin Zhang, Jian-Qiang Li and Jia Meng
Corresponding authors. Xing Chen, School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
E-mail: xingchen@amss.ac.cn; Lin Zhang, School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116,
China. E-mail: lin.zhang@cumt.edu.cn; Jian-Qiang Li, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060,
China. E-mail: lijq@szu.edu.cn

Abstract
Ribonucleic acid (RNA) methylation is a type of posttranscriptional modification occurring in all kingdoms of life. It is strongly related to important biological processes and is therefore linked to a number of human diseases. Owing to the development of high-throughput sequencing technology, substantial progress has recently been made in RNA methylation research. Meanwhile, various computational models have been developed to analyze and mine the growing body of RNA methylation data. In this review, we first briefly introduced the eight most common types of RNA methylation, the biological functions of RNA methylation, the relationship between RNA methylation and disease and five important RNA methylation-related diseases. Research on RNA methylation is based on sequencing data processing, and effective bioinformatics techniques can contribute to a better understanding of RNA methylation. We further introduced seven publicly available RNA methylation-related databases, and some important publicly available RNA methylation-related Web servers and software for RNA methylation site identification, differential analysis and so on. Furthermore, we provided detailed analysis of the state-of-the-art computational models used in these Web servers and software. We also analyzed the limitations of these models and discussed the future directions of developing computational models for RNA methylation research.

Key words: RNA methylation; biological function; disease; database; Web server and software; computational model

Xing Chen, PhD, is a professor at the School of Information and Control Engineering, China University of Mining and Technology. He is also the founding director of the Institute of Bioinformatics, China University of Mining and Technology. His research interests include disease, noncoding RNAs, network pharmacology, complex networks and machine learning.
Ya-Zhou Sun, PhD, is a postdoctoral researcher at the College of Computer Science and Software Engineering, Shenzhen University. Her research interests include disease, noncoding RNAs, genomics, DNA damage repair and precision medicine.
Hui Liu, PhD, is an associate professor at the School of Information and Control Engineering, China University of Mining and Technology. His interests include noncoding RNAs, computational biology and machine learning.
Lin Zhang, PhD, is an associate professor at the School of Information and Control Engineering, China University of Mining and Technology. Her interests include computational biology, statistical signal processing and Bayesian methods.
Jian-Qiang Li, PhD, is an associate professor at the College of Computer Science and Software Engineering, Shenzhen University. He is also the executive director of the Institute of Network and Information Security, Shenzhen University. His research interests include bioinformatics, artificial intelligence, mobile medicine and complex systems.
Jia Meng, PhD, is an associate professor at the Department of Biological Sciences, Xi’an Jiaotong-Liverpool University. His interests include epitranscriptome bioinformatics, statistical modeling and NGS data mining.
Submitted: 4 August 2017; Received (in revised form): 12 September 2017
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Downloaded from https://academic.oup.com/bib/advance-article-abstract/doi/10.1093/bib/bbx142/4641716 by guest on 19 November 2017

RNA methylation
Methylation is a form of alkylation in chemistry, which adds a methyl group to a substrate or substitutes it for an existing atom or group. In biological systems, methylation reactions are catalyzed by a set of methyltransferases [1]. Methylation contributes to epigenetic alterations as a structural modification that does not affect the gene sequence but regulates its expression. Methylation can occur in a variety of biomolecules including deoxyribonucleic acid (DNA), ribonucleic acid (RNA) and proteins.
DNA methylation occurs on carbon 5 of the pyrimidine ring of cytosines. It is established by the methyltransferases DNMT3A and DNMT3B, and is maintained by DNMT1. Global and gene-specific patterns of DNA methylation are often dynamic in many important biological processes, which are closely related to disease. Thus, it has been an intensive area of research for the past 30 years [2].
Protein methylation has been widely studied in histones. Methylation of histones occurs on lysines, arginines and histidines, of which lysine methylations are by far the best characterized. The patterns of histone methyl marks are altered in disease development, especially in malignancies. Mutations in specific histone methyltransferases, demethylases and associated factors have also been reported in many cancers [3].
RNA is the intermediate molecule that links the genetic information contained in genes to its expression in functional proteins. In the past decade, noncoding RNAs have been added as new players, a list that is likely to be further extended by the improvement of sequencing technologies. Methylation is involved in many steps of RNA biology and occurs in diverse RNA species such as transfer RNA (tRNA), ribosomal RNA (rRNA), messenger RNA (mRNA), transfer–messenger RNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), microRNA (miRNA) and viral RNA. It is thought that RNA methylation may have existed before DNA methylation in the early stages of life evolving on earth [4]. In most cases, the function and biological consequences of these methylations remain elusive. However, thanks to the development of new analysis tools, the field of RNA nucleotide methylation is emerging. In recent years, the idea that dynamic RNA methylation plays active roles in gene regulation has been intensively studied. The molecular function of enzymes involved in RNA methylation has also been uncovered. These observations point to an important role of RNA methylation in cellular processes and call for this area to be further studied in both the theory and the application of biomedicine.

RNA methylation classification
RNA methylation occurs in all species of organisms. Methylated RNA nucleotides are ubiquitous in life, and roughly two-thirds of the >100 chemically distinct RNA modifications involve the addition of methyl groups [5, 6]. However, the distribution of the different types of methylated nucleotides in different species is not uniform. For example, the methylated nucleotides Am, m1I, m2G and m2,2G are shared among eukaryotes and archaea, m5Um and m3C have not yet been detected in archaea, while m6,6A, N6-methyladenosine (m6A), N1-methyladenosine (m1A), Cm, 5-methylcytosine (m5C), Gm, m1G, N7-methylguanosine (m7G), m5U and Um form a pool common to archaea, eukaryotes and prokaryotes [7].

N6-Methyladenosine
m6A was initially discovered in 1974. It was the first internal mRNA modification discovered and is the most prevalent in eukaryotic mRNA [8]. Early studies showed that every mammalian mRNA contains on average three to five m6A sites within a consensus sequence. The m6A is installed by a methyltransferase complex [9]. The identification of METTL3, a subunit of the complex, allowed scientists to examine m6A in model organisms [10]. The m6A methylation and demethylation activities have been shown to affect the stability of transcriptional regulators and provide a dynamic and rapid response to cellular signals, environmental stimuli or programmed biological transformations [11–13].

5-Methylcytosine
m5C is a well-known epigenetic modification in rRNA and tRNA. Recent transcriptome-wide mapping of m5C in human RNA has uncovered >10 000 candidate m5C sites in mRNA and other noncoding RNAs [14]. The m5C is involved in metabolic processes including energy and lipid metabolism. Several m5C methyltransferases that were thought to work on rRNA and tRNA also have binding sites on mRNA, suggesting that they take on additional roles that impact mRNA [15, 16].

2’-O-methylation
Ribose 2’-O-methylation occurs in rRNA, tRNA, mRNA, snoRNA and small interfering RNA, etc., at adenosine (A), guanosine (G), cytidine (C) and uridine (U) nucleotides [17] and is ubiquitous in viruses, archaebacteria, eubacteria, yeasts, protists, fungi and higher eukaryotes [18]. 2’-O-methylation is involved in discrimination of mRNA [19]. 2’-O-methylation is also suggested to protect the 3’ end of miRNA against polyuridylation, preventing miRNA from poly(U)-mediated degradation [20].

N7-methylguanine
The N7-methylated G cap structure is found at the 5’ ends of mature eukaryotic mRNAs. It is linked by an inverted 5’-5’ triphosphate bridge to the first nucleotide of the nascent transcript. The 5’-m7G cap structure plays a critical role in the life cycle of eukaryotic mRNA and is necessary for efficient gene expression and cell viability from yeast to human. It serves as both a positive and negative element in mRNA recruitment to stimulate canonical translation initiation while preventing binding to the ribosome via an alternative pathway [21].

N1-Methyladenosine
The methylation on the N1 atom of A to form 1-methyladenosine has been found in tRNA. In cytosolic tRNA, the m1A modification occurs at five different positions: 9, 14, 22, 57 and 58. The most well-studied m1A modifications are those occurring at nucleotide positions 9 and 58. The mechanism for formation of m1A has not yet been determined but is known to rely on a number of residues such as aspartate and glutamine in all families. The m1A modification plays a number of biological roles, for example, enhancing structural stability and inducing correct folding of the tRNA [22].

Pseudouridine
Pseudouridine (Ψ) was discovered over 60 years ago [23]. Ψ modification provides an additional hydrogen-bonding donor that can significantly affect the secondary structure of RNA. More recently, transcriptome-wide mapping has uncovered hundreds of naturally occurring Ψ sites in human mRNA [24]. These sites are responsive to nutrient starvation and heat shock, suggesting pseudouridylation as a potential mechanism to rapidly adapt the translation landscape to environmental stress [25, 26].

5-Hydroxymethylcytosine
In 2009, Rao and colleagues found that human ten-eleven translocation (TET) proteins can oxidize 5mC to generate 5-hydroxymethylcytosine (5hmC). Every mammalian cell seems to contain 5hmC, but the levels vary significantly depending on the cell type [27]. Though the exact function of 5hmC is not fully elucidated, it is thought that it may regulate gene expression. The 5hmC may be especially important in the central nervous system, as it is found in high levels there. Reduction in 5hmC levels has been found to be associated with impaired self-renewal in embryonic stem cells (ESCs). It is also associated with unstable nucleosomes, which are frequently repositioned during cell differentiation [28].
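
The N6-Methyladenosine subsection above notes that m6A sites fall within a consensus sequence, without naming it. As a minimal sketch, the Python snippet below scans an RNA string for the commonly cited DRACH motif (D = A/G/U, R = A/G, H = A/C/U). The motif choice, the function name `find_drach_sites` and the example sequence are illustrative assumptions and are not taken from this review. A motif match only marks a candidate position and says nothing about whether the site is actually methylated.

```python
import re

# Assumed consensus (not stated in the text): DRACH,
# where D = A/G/U, R = A/G, H = A/C/U and the central A is the candidate m6A.
# The lookahead lets overlapping motif occurrences be reported as well.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach_sites(rna: str) -> list[int]:
    """Return 0-based positions of the central A of every DRACH match."""
    # Accept DNA-style input by converting T to U before scanning.
    seq = rna.upper().replace("T", "U")
    return [m.start() + 2 for m in DRACH.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    print(find_drach_sites("AUGGACUCCGAACAUUGACG"))  # -> [4, 11]
```

In practice, such candidate positions would be intersected with experimentally derived peaks rather than interpreted on their own.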


Table 1. Functions of the various types of RNA methylation

RNA methylation type    Main functions
m6A                     Stability of transcriptional regulators [12], RNA splicing [38], translation [11], cell differentiation [39], circadian clock [40], DNA repair [41], sex determination [42], viral infection regulation [43], response to cellular signals, environmental stimuli [44] or programmed biological transformations [13]
m5C                     Metabolic processes [16], mRNA process [15], stress response [45]
2’-O-methylation        Discrimination of mRNA [19], miRNA stability [20], viral infection regulation
m7G                     Life cycle of eukaryotic mRNA, translation initiation, mRNA transport, splicing and degradation [21]
m1A                     Structural stability and correct folding of the tRNA [22], translation initiation [46]
Ψ                       Translation [25], response to nutrition starvation and heat shock [47]
5hmC                    Self-renewal in ESCs, cell differentiation [28]
A-to-I editing          RNA destabilization, changes in the folding of RNA [34], immune responses [35]

Adenosine to inosine editing
Adenosine to inosine editing (A-to-I editing) is a cotranscriptional process that contributes to transcriptome complexity by deamination of adenosines to inosines [29]. It is accomplished by adenosine deaminases acting on RNAs (ADARs) [30]. The most recent deep-sequencing study suggests that >100 million sites in the human transcriptome might be subjected to A-to-I editing [31]. A few hundred A-to-I editing events can recode mRNAs, thus resulting in translation of proteins that differ from their genomically encoded versions [32]. The other millions of editing events are largely located in noncoding RNAs [33]. The biological consequences of these editing events are only partly understood and may include RNA destabilization, changes in the folding of RNA or inosine-dependent suppression of immune responses [34, 35].

RNA methylation function
The functions of methyl groups in RNA include biophysical, biochemical and metabolic stabilization of RNA; quality control; resistance to antibiotics; mRNA reading frame maintenance; deciphering of normal and altered genetic code; selenocysteine incorporation; tRNA aminoacylation; ribotoxins; splicing; intracellular trafficking; immune response; gene regulation; DNA repair; stress response; and possibly histone acetylation [36, 37]. In what follows, we review the most important aspects of RNA methylation together with what is known of their function (Table 1).

Transcription and RNA splicing
m6A modification exists in the mRNAs of various kinds of viruses. Occurrence of m6A in viral mRNA was shown to enhance the priming efficiency of mRNA [48–50]. Besides transcription efficiency, transcription kinetics are also likely affected by m6A modification. In humans, antibiotic-induced deafness is caused by the pathogenic mutation A1555G in the mitochondrial genome, which is located in close proximity to the m6A modification site; this establishes a link between human disease, mitochondrial transcription and 12S rRNA methylation [51]. Pre-mRNA splicing is an essential step in gene expression. It involves precise excision of introns and joining of exons from primary transcripts in the nucleus to generate mature mRNA [52]. Emerging evidence supports the correlation of m6A with RNA splicing. The regulatory role of m6A in mRNA splicing was reported in a study of fat mass and obesity-associated protein (FTO)-depleted 3T3-L1 preadipocytes. The researchers found that the enhanced m6A level in response to FTO depletion promotes the RNA-binding ability of the splicing regulatory protein SRSF2, leading to increased inclusion of target exons [53]. A recent study also demonstrated that m6A-mediated mRNA structure remodeling affected binding to HNRNPC, an abundant nuclear RNA-binding protein responsible for pre-mRNA processing and alternative splicing [38]. These data provide strong evidence for a mechanistic relationship between the presence of m6A and splicing events.

mRNA translation
The enrichment of m6A in exons and around stop codon regions makes it conceivable that m6A may regulate translation. In a recent study performed in mouse ESCs and embryonic bodies, ablation of the m6A writer METTL3 significantly increased translation efficiency, indicating a regulatory role of m6A in translation [39]. The m6A reader YTHDF1 was reported to interact with initiation factors and ribosomes to increase translational output, presenting direct evidence for translational regulation functions of m6A [54]. One of the translation factors, eIF3, was also reported to directly bind 5’ untranslated region (UTR) m6A, which was sufficient to recruit the 43S complex to initiate translation in the absence of the cap-binding factor eIF4E [55]. m1A is also a widespread and conserved posttranscriptional modification that is associated with translation initiation in thousands of mammalian transcripts characterized by a structured 5’ UTR [46, 56].

Extensive translation of circular RNAs
Extensive pre-mRNA back-splicing generates numerous circular RNAs (circRNAs) in the human transcriptome. Recently, Yang et al. reported that m6A promotes efficient initiation of protein translation from circRNAs in human cells. Further analyses through polysome profiling, computational prediction and mass spectrometry revealed that m6A-driven translation of circRNAs is widespread, with hundreds of endogenous circRNAs having translation potential. This expands the coding landscape of the human transcriptome, and suggests a role of circRNA-derived proteins in cellular responses to environmental stress [57].

Cell fate transition
ESCs are pluripotent stem cells derived from the inner cell mass of a preimplantation embryo, exhibiting prolonged undifferentiated proliferation and stable developmental potential to form derivatives of all three embryonic germ layers [58]. The transition from naïve pluripotency to differentiation is tightly regulated by a plethora of pluripotency markers and developmental factors. Transcriptome-wide m6A profiling in mouse embryonic stem cells (mESCs) showed that the majority of these core pluripotent genes and developmental regulators have m6A modifications on their transcripts [59]. Recently, Geula et al. [39] demonstrated that the m6A modification plays a key role in facilitating transition of human embryonic stem cells (hESCs) from the naïve state to the primed state on differentiation. The maternal-to-zygotic transition (MZT) is one of the most profound and tightly orchestrated processes during the early life of embryos. Over one-third of zebrafish maternal mRNAs can be m6A modified. Removal of YTHDF2 in zebrafish embryos decelerates the decay of m6A-modified maternal mRNAs and impedes zygotic genome activation. These embryos fail to initiate timely MZT, undergo cell cycle pause and remain developmentally delayed throughout larval life [60].

Circadian clock
The mechanism of the mammalian circadian clock involves a negative transcription–translation feedback loop in which the transcription of the clock genes is suppressed by their own encoded proteins. Recent work showed that inhibition of transmethylation reactions elongates the circadian period. RNA sequencing (RNA-seq) revealed that methylation inhibition causes widespread changes in the transcription of the RNA processing machinery, associated with RNA m6A methylation. Specific inhibition of m6A methylation by silencing of METTL3 is sufficient to elicit circadian period elongation and RNA processing delay [40].

DNA damage response
Cell proliferation and survival require the faithful maintenance and propagation of genetic information, which are threatened by the ubiquitous sources of DNA damage present intracellularly and in the external environment. The DNA damage response detects and repairs damaged DNA and prevents cell division until the repair is complete [61]. A recent study uncovered that m6A in RNA is rapidly and transiently induced at DNA damage sites in response to ultraviolet irradiation. This modification occurs on numerous poly(A)+ transcripts and is regulated by METTL3 and FTO. m6A RNA serves as a beacon for the selective, rapid recruitment of Pol κ to DNA damage sites to facilitate repair and cell survival [41]. The recruitment of methyl-CpG-binding domain protein 2 (MBD2) to DNA damage sites after laser microirradiation also suggests that RNA methylation is related to the laser-induced DNA damage response [62].

Heat shock response
Researchers also found that diverse cellular stresses induced a transcriptome-wide redistribution of m6A, resulting in increased numbers of mRNAs with 5’ UTR m6A, which introduced the concept of dynamic m6A events in response to stress. A connection between tRNA methylation and stress response has been shown for Dnmt2-mediated formation of m5C38 in tRNAs in Drosophila melanogaster [45]. Recent findings show that a few Ψ sites in yeast U2 snRNA are induced by nutrient deprivation or heat shock. Hundreds of mRNA Ψ sites are also induced by heat shock in yeast, possibly affecting transcript stability [63]. In mammalian cells, m6A is preferentially deposited in the 5’ UTR of newly transcribed mRNAs in response to heat shock stress. The increased 5’ UTR methylation in the form of m6A promotes cap-independent translation initiation, providing a mechanism for selective mRNA translation under heat shock stress [44].

Neuronal functions
Humans with a nonsynonymous mutation in the FTO enzymatic domain exhibit brain malformation and impaired brain function, and intronic FTO single-nucleotide polymorphisms have been associated with abnormal brain volumes in both adolescents and healthy elderly subjects. Analysis of mRNA methylation in dopaminergic neurons following FTO loss of function identified a subset of mRNAs whose m6A levels were influenced by FTO [64]. Many of these transcripts encode proteins involved in the response to dopamine, suggesting that FTO-mediated dynamic methylation of neuronal mRNAs is necessary for proper dopaminergic signaling. Loss of DNMT2-mediated m5C methylation increases stress-induced tRNA cleavage in flies; cleavage of tRNAs and repression of protein translation are a conserved response to several stress stimuli in eukaryotes. Neurodevelopmental disorders are commonly associated with oxidative stress, and increased tRNA cleavage has recently been directly linked to neurodevelopmental and neurodegenerative conditions [65].

Sex determination
In Drosophila, fl(2)d and vir are required for sex-dependent regulation of alternative splicing of the sex determination factor sex lethal (Sxl). m6A is required for female-specific alternative splicing of Sxl, which determines female physiognomy, but also translationally represses male-specific lethal 2 (msl-2) to prevent dosage compensation in females [42, 66].

Virus infection
Viral life cycles are usually regulated by precise mechanisms that act on their RNA [67]. m6A was found on the RNA of several viruses in the 1970s and was hypothesized to provide a new layer of RNA regulatory control of viral infection [68]. Recently, a proviral role for m6A in HIV-1 infection has been found. The function of individual m6A sites in HIV-1 RNA can vary from regulating HIV-1 RNA nuclear export to enhancing viral gene expression [43, 69, 70]. Using m6A-seq, m6A modifications were also mapped in several regions across the RNA of the Flaviviridae members hepatitis C virus (HCV), Zika virus (ZIKV), dengue virus, yellow fever virus and West Nile virus [71, 72]. In addition, Kaposi’s sarcoma-associated herpesvirus (KSHV) mRNAs also undergo m6A modification. Blockage of m6A inhibited splicing of the pre-mRNA encoding replication transcription activator, a key KSHV lytic switch protein, and halted viral lytic replication [73]. The m6A on viral RNAs may prevent detection by host pattern recognition receptors that trigger antiviral innate immunity. It may serve as a shield on viral RNA to prevent induction of antiviral signaling pathways, which is important for the therapy of pathogen-associated diseases.

RNA methylation and disease
The study of RNA methylation has emerged as an exciting new research area over the past few years. It might represent an additional layer of gene regulation, leading to the coining of the terms ‘RNA epigenetics’ and ‘epitranscriptomics’. Direct studies of the role of RNA methylation in disease have been rare; however, the methylases, demethylases and other related factors have been shown to have disease correlations. Here, we summarize our current knowledge about the genes directing these modifications in human disease (Table 2). Although RNA methylation research is still in its early stages, disruption of RNA methylation has been linked to a number of disease conditions. This suggests that RNA methylation is an important and independent factor in pathogenesis and disease progression. Considering that the biological effects of RNA methylation vary across diseases, we also summarize the RNA methylation regulating genes in different diseases (Table 3). Here, we give examples for five important


Table 2. Human diseases associated with RNA methylation factors

RNA methylation type    Factors                               Human disease
m6A                     WTAP (m6A writer)                     Hypospadias [126]
                                                              Acute myelogenous leukemia [127]
                                                              Cholangiocarcinoma [128]
                        FTO (m6A eraser)                      Obesity [75]
                                                              Coronary heart disease [129]
                                                              Type 2 diabetes [130]
                                                              Cancer [131]
                        ALKBH5 (m6A eraser)                   Infertility [132]
                                                              Major depressive disorder [133]
m5C                     NSUN2 (m5C RNA methyltransferase)     Breast cancer [134]
                                                              Autosomal recessive intellectual disability [135]
                                                              Amyotrophic lateral sclerosis [61]
                                                              Parkinson’s disease [78]
Ψ                       DKC1 (RNA Ψ synthase)                 Dyskeratosis congenita [136]
                                                              Pituitary tumorigenesis [137]
                                                              Prostate cancer [138]
                        PUS1 (RNA Ψ synthase)                 Mitochondrial myopathy, lactic acidosis and sideroblastic anemia [139]
A-to-I editing          ADAR1 (adenosine deaminase)           Chronic myeloid leukemia [140]
                                                              Metastatic melanoma [141]
                                                              Human hepatocellular carcinoma [142]
                                                              Esophageal squamous cell carcinoma [143]
                        ADAR2 (adenosine deaminase)           Glioblastoma multiforme [144]
                                                              Alzheimer’s disease [145]

Table 3. The role of RNA methylation in different genes with heterogeneity of disease

RNA methylation type    Related genes    Relative level in disease    Role in disease        Human disease
m6A                     RUNX1T1          Hypo                         Disease progression    Obesity [74]
                        miR-125b         Hyper                        Tumor progression      Cancer [75]
                        IDH1/2           Hyper                        Tumor progression      Acute myeloid leukemia [76]
m5C                     tRNA             Hypo                         Disease progression    Amyotrophic lateral sclerosis [64]
                                                                                             Parkinson’s disease [77]
Ψ                       TERC             Hypo                         Disease diagnosis      Dyskeratosis congenita [25]
2’-O-methylation        FTSJ1            Hypo                         Disease progression    X-linked intellectual disability [78]
A-to-I editing          PU.1             Hyper                        Tumor progression      Chronic myeloid leukemia [79]
                        AZIN1            Hyper                        Tumor progression      Human hepatocellular carcinoma [80]
                        GluA2            Hypo                         Disease diagnosis      Alzheimer’s disease [81]

human diseases including obesity, neurodevelopmental disorders, cancer, dyskeratosis congenita and X-linked intellectual disability [82].

Obesity
In 2007, genome-wide association studies linked common variants of the FTO gene with childhood and adult obesity [74, 83, 84]. The finding that FTO-mediated m6A demethylation controls exonic splicing of the adipogenic regulatory factor RUNX1T1 emphasized the regulatory role of FTO in adipogenesis [74, 84].

Neurodevelopmental disorders
Hereditary forms of intellectual disability are neurodevelopmental disorders [85]. Loss of cytosine-5 RNA methylation increases the angiogenin-mediated endonucleolytic cleavage of tRNA, leading to an accumulation of 5’ tRNA-derived small RNA fragments. Accumulation of 5’ tRNA fragments in the absence of methyltransferase NSun2 reduces protein cell size and increases apoptosis of cortical, hippocampal and striatal neurons. Cytosine-5 methylation at the variable loop of tRNAs acts as the main upstream regulator of angiogenin-dependent tRNA binding and cleavage. tRNAs lacking cytosine-5 methylation are prone to be cleaved by angiogenin, and altered tRNA cleavage because of mutation is also linked to neurodegenerative diseases, such as amyotrophic lateral sclerosis and Parkinson’s disease [64, 77, 86].

Cancer
Proteinase-activated receptor 2 (PAR2) participates in cancer metastasis promoted by serine proteinases. PAR2 activation also represses miR-125b expression, while a miR-125b mimic successfully blocks PAR2-induced cell migration. PAR2 activation increases the level of m6A-containing pre-miR-125b in an NSun2-dependent manner. NSun2-dependent RNA methylation contributes to the downregulation of miR-125b to regulate cancer cell migration by altering miRNA expression [875].

Dyskeratosis congenita
Dyskeratosis congenita can be caused by mutations in the noncoding RNA (ncRNA) telomerase component TERC. The read distribution across TERC revealed a putative Ψ site at position 307. Ψ-seq of the hybrid-captured RNA confirmed substantial pseudouridylation of position 307, a highly conserved uridine in a region essential for telomerase activity and TERT binding, and showed that it is modified at significantly higher levels in the control sample than in the patient sample. This suggests that TERC pseudouridylation may be compromised in dyskeratosis congenita, and provides a general way to quantify Ψ in lowly expressed genes [25].

X-linked intellectual disability
Mutations in human FTSJ1 can cause nonsyndromic X-linked intellectual disability (NSXLID). The tRNAPhe from two genetically independent cell lines of NSXLID patients with loss-of-function FTSJ1 mutations nearly completely lacks 2’-O-methylated C32 (Cm32) and 2’-O-methylated G34 (Gm34), and has reduced peroxywybutosine. These results directly link defective 2’-O-methylation of the tRNA anticodon loop to FTSJ1 mutations, suggesting that the modification defects cause NSXLID, and may implicate Gm34 of tRNAPhe as the critical modification [78].

Databases

RNAMDB
The RNA Modification Database (RNAMDB) has served as a focal point for information pertaining to naturally occurring RNA modifications (http://rna-mdb.cas.albany.edu/RNAmods/) [6]. In its current state, the database provides an easy-to-use, searchable interface for obtaining detailed data on the 109 currently known RNA modifications. Each entry provides the chemical structure, common name and symbol, elemental composition and mass, CA registry numbers and index name, phylogenetic source, type of RNA species in which it is found and references to the first reported structure determination and synthesis.

methylated RNA immunoprecipitation sequencing (MeRIP-Seq), a recently developed technology for interrogating the m6A methyltranscriptome. MeT-DB includes 300k m6A methylation sites in 74 MeRIP-Seq samples from 22 different experimental conditions predicted by the exomePeak and MACS2 algorithms. To explore this rich information, MeT-DB also provides a genome browser to query and visualize context-specific m6A methylation under different conditions. MeT-DB also includes the binding site data of miRNAs, splicing factors and RNA-binding proteins in the browser window for comparison with m6A sites and for exploring the potential functions of m6A.

RMBase
RMBase (RNA Modification Base) was developed to decode the genome-wide landscape of RNA modifications identified from high-throughput modification data generated by 18 independent studies (http://mirlab.sysu.edu.cn/rmbase/) [89]. The current release of RMBase includes 9500 Ψ modifications generated from Pseudo-seq and CeU-seq sequencing data, 1000 m5C sites predicted from Aza-IP data, 124 200 m6A modifications discovered from m6A-seq and 1210 2’-O-methylations identified from RiboMeth-seq data and public resources. Moreover, RMBase provides a comprehensive listing of other experimentally supported types of RNA modifications by integrating various resources.

REDIportal
REDIportal is the largest and most comprehensive collection of RNA editing in humans, including >4.5 million A-to-I events detected in 55 body sites from thousands of RNA-seq experiments (http://srv00.recas.ba.infn.it/atlas/) [90]. REDIportal embeds the RADAR database and represents the first editing resource designed to answer functional questions, enabling the inspection and browsing of editing levels in a variety of human samples, tissues and body sites. In contrast with previous RNA editing databases, REDIportal comprises its own browser (JBrowse) that allows users to explore A-to-I changes in their genomic context, emphasizing repetitive elements in which RNA editing is prominent.

MODOMICS
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleotides, their biosynthetic pathways, RNA-modifying enzymes and location of modified residues in RNA sequences (http://modomics.genesilico.pl) [5]. It integrates information about the chemical structure of modified nucleoti-
des, their localization in RNA sequences, pathways of their bio- Web server and software
synthesis and enzymes that carry out the respective reactions. Sequence-based site prediction Web server or software
MODOMICS also provides literature information, and links to
other databases, including the available protein sequence and HAMR
structure data. HAMR is a high-throughput method to map RNA modifications
within all classes of RNAs by identifying mis-incorporation of
RADAR nucleotides by reverse transcriptase (RT) during production of
complementary DNA (cDNA) products (http://wanglab.pcbi.
RADAR includes a comprehensive collection of A-to-I RNA edit-
upenn.edu/hamr) [91]. Users may submit a link to a remote
ing sites identified in humans (Homo sapiens), mice (Mus muscu-
indexed BAM (read alignment) file to the online version of
lus) and flies (D. melanogaster), together with extensive manually
HAMR. HAMR detects candidate modification sites either
curated annotations for each editing site (http://RNAedit.com)
transcriptome-wide or at selected loci specified by transcript ID
[87]. RADAR also includes an expandable listing of tissue-
or genomic coordinates. Users may also opt to filter out known
specific editing levels for each editing site, which will facilitate
dbSNP sites for human data and select various options affecting
the assignment of biological functions to specific editing sites.
the stringency of the analysis, including P-value or false discov-
ery rate (FDR) thresholds, minimum coverage and which null
MeT-DB
hypothesis to use.
The MethylTranscriptome DataBase (MeT-DB) is the first com-
prehensive resource for m6A in mammalian transcriptome M6Apred
(http://compgenomics.utsa.edu/methylation/) [88]. It includes a M6Apred is a support vector machine (SVM)-based model to
database that records publicly available data sets from identify m6A sites in the Saccharomyces cerevisiae transcriptome

Downloaded from https://academic.oup.com/bib/advance-article-abstract/doi/10.1093/bib/bbx142/4641716


by guest
on 19 November 2017
RNA methylation and diseases | 7

by using the nucleotide chemical property and nucleotide density information (http://lin.uestc.edu.cn/server/m6Apred.php) [92]. In this model, RNA sequences are encoded by their nucleotide chemical property and accumulated nucleotide frequency information.

iRNA-Methyl
iRNA-Methyl formulates RNA sequences with the ‘pseudo dinucleotide composition’ (PseDNC), which incorporates three RNA physicochemical properties (http://lin.uestc.edu.cn/server/iRNA-Methyl) [93]. In a rigorous cross-validation test on the benchmark data set, the predictor identified m6A sites with an accuracy of 65.59%. All benchmark data can be downloaded from the Data window of this Web server.

PPUS
PPUS is the first Web server to predict pseudouridine synthase (PUS)-specific Ψ sites (http://lyh.pkmu.cn/ppus/) [94]. PPUS uses an SVM as the classifier and the nucleotides around Ψ sites as the features. Currently, PPUS can accurately predict new Ψ sites for PUS1, PUS4 and PUS7 in yeast and PUS4 in human.

AthMethPre
AthMethPre is a method to predict the m6A sites for Arabidopsis thaliana mRNA sequence(s) (http://bioinfo.tsinghua.edu.cn/AthMethPre/index.html) [95]. To predict the m6A sites of an mRNA sequence, an SVM classifier was built using the features of the positional flanking nucleotide sequence and the position-independent k-mer nucleotide spectrum. The server also provides a comprehensive database of predicted transcriptome-wide m6A sites and curated m6A-seq peaks from the literature for query and visualization.

RNAMethPre
RNAMethPre integrates multiple features of mRNA (flanking sequences, local secondary structure information and relative position information) and trains an SVM classifier to predict m6A sites in mammalian mRNA sequences (http://bioinfo.tsinghua.edu.cn/RNAMethPre/index.html) [96]. Given an mRNA as well as its corresponding species information, the server returns all predicted m6A sites to users. The results are also downloadable for further analysis. The SVM model was also applied to predict transcriptome-wide m6A sites. Experimental m6A-seq peaks were collected from the literature. The Web server was built to provide both prediction and query services for m6A sites. A genome browser was also built based on JBrowse to visualize the query results.

M6ATH
M6ATH is an SVM-based method proposed to identify m6A sites in the A. thaliana transcriptome (http://lin.uestc.edu.cn/server/M6ATH) [97]. The proposed method was validated on a benchmark data set using the jackknife test and was also validated by identifying strain-specific m6A sites in A. thaliana. For the convenience of the scientific community, a freely accessible online Web server was established.

PRNAm-PC
In pRNAm-PC, RNA sequence samples are expressed by a novel mode of PseDNC whose components were derived from a physical–chemical matrix via a series of auto-covariance and cross-covariance transformations (http://www.jci-bioinfo.cn/pRNAm-PC) [98].

SRAMP
To depict the sequence context around m6A sites, SRAMP combines three Random Forest classifiers that exploit the positional nucleotide sequence pattern, the k-nearest neighbor (kNN) information and the position-independent nucleotide pair spectrum features, respectively (http://www.cuilab.cn/sramp/) [99]. SRAMP accepts either genomic sequences or cDNA sequences as its input. It only requires nucleotide sequences for prediction. Users can select either the full transcript mode or the mature mRNA mode, depending on whether they have the genomic or the cDNA sequence at hand, and whether they are interested in the intronic m6A sites. Users can also decide whether the RNA secondary structure should be considered. Analysis of RNA secondary structures provides text and graphical representation of the local structure around the predicted m6A site.

RAMPred
RAMPred was proposed as the first method to identify m1A sites in the H. sapiens, M. musculus and S. cerevisiae genomes (http://lin.uestc.edu.cn/server/RAMPred) [100]. In this method, RNA sequences are encoded by using nucleotide chemical property and nucleotide compositions.

iRNA-PseU
The Web server iRNA-PseU was developed to identify the Ψ sites in H. sapiens, M. musculus and S. cerevisiae (http://lin.uestc.edu.cn/server/iRNA-PseU) [101]. It incorporates the chemical properties of nucleotides and their occurrence frequency density distributions into the general form of pseudo K-tuple nucleotide composition (PseKNC).

MethyRNA
MethyRNA is an SVM-based model to identify m6A sites by encoding RNA sequences using nucleotide chemical property and frequency, based on the high-resolution experimental data of H. sapiens and M. musculus (http://lin.uestc.edu.cn/server/methyrna) [102]. In a rigorous cross-validation test, it identified m6A sites with accuracies of 90.38 and 88.89% in H. sapiens and M. musculus, respectively.

RAM-ESVM
RAM-ESVM was developed for detecting m6A sites in the S. cerevisiae transcriptome using ensemble SVM classifiers and novel sequence features (http://server.malab.cn/RAM-ESVM/) [103]. RAM-ESVM combines three basic classifiers, namely, SVM-PseKNC, SVM-motif and GkmSVM, which were constructed by using PseKNC, motif features and optimized k-mers as discriminative features, respectively.

RAM-NPPS
RAM-NPPS is a sequence-based predictor for identifying m6A sites within RNA sequences (http://server.malab.cn/RAM-NPPS/) [104]. Users can submit uncharacterized RNA sequences to identify the potential m6A sites. In particular, the online predictor provides m6A site identification specific for three species, namely S. cerevisiae, H. sapiens and A. thaliana.

iRNA-AI
iRNA-AI is a predictor to identify A-to-I editing sites based on the RNA sequence information alone (http://lin.uestc.edu.cn/server/iRNA-AI/) [105]. It was built by incorporating the chemical properties of nucleotides and their sliding occurrence density distribution along an RNA sequence into the general form of pseudo nucleotide composition (PseKNC).


iRNA-PseColl
iRNA-PseColl was formed by incorporating both the individual and collective features of the sequence elements into the general PseKNC of RNA via the chemicophysical properties and density distribution of its constituent nucleotides (http://lin.uestc.edu.cn/server/iRNA-PseColl) [106]. It was developed to identify RNA modifications in the H. sapiens transcriptome. At present, m1A, m6A and m5C can be identified on the current platform.

Next-generation sequencing (NGS) data-based site detection Web server or software

MeRIP-PF
MeRIP-PF is a novel high-efficiency and user-friendly analysis pipeline for the signal identification of MeRIP-Seq data in reference to controls (http://software.big.ac.cn/MeRIP-PF.html) [107]. MeRIP-PF provides a statistical P-value for each identified m6A region based on the difference of read distribution when compared with the controls and also calculates FDR as a cutoff to differentiate reliable m6A regions from the background. Furthermore, MeRIP-PF also performs gene annotation of m6A signals or peaks and produces outputs in both XLS and graphical format, which are useful for further study.

exomePeak R/Bioconductor package
‘exomePeak’ is an open-source R package for detecting RNA methylation sites under a specific experimental condition or identifying the differential RNA methylation sites in a case-control study from MeRIP-Seq data (http://www.bioconductor.org/packages/release/bioc/html/exomePeak.html) [108]. Using the exomePeak R/Bioconductor package along with other software programs for analysis of MeRIP-Seq data, users can conduct raw reads alignment, RNA methylation site detection, motif discovery, differential RNA methylation analysis and functional analysis.

meRanTK
meRanTK is the first publicly available tool kit that addresses the special demands of high-throughput RNA cytosine methylation data analysis (http://icbi.at/software/meRanTK/) [109]. It provides fast and easy-to-use splice-aware bisulfite sequencing read mapping, comprehensive methylation calling and identification of differentially methylated cytosines by statistical analysis of single- and multi-replicate experiments. Application of meRanTK to RNA-BSseq or Aza-IP data produces accurate results in standard compliant formats. meRanTK includes five multithreaded programs, which enable complete analysis and comparison of m5C transcriptome data sets. The tools meRanT and meRanG use well-established RNAseq-specific short-read mappers as core aligning engines and extend them to facilitate mapping of either single- or paired-end sequence reads from strand-specific RNA-BSseq libraries to a given reference sequence. The meRanCall methylation caller uses aligned reads to precisely identify and statistically evaluate the positions of methylated cytosines. The experimental comparison tool meRanCompare is designed to detect differentially methylated m5Cs between two experimental conditions with single- or multi-replicate RNA methylation data sets. The annotation tool meRanAnnotate helps to annotate candidate m5Cs with genomic features such as gene or transcript names and positional metrics.

DRME
DRME is designed for differential RNA methylation analysis of MeRIP-Seq data sets in the small-sample-size scenario using the negative binomial model (https://github.com/lzcyzm/DRME) [110]. The model not only captures within-group biological variability among replicates but also addresses the changes in RNA expression level and their impact on RNA methylation, and thus can be applied to MeRIP-Seq, particularly for differential RNA methylation analysis. The algorithm is also fast to execute and in theory is applicable to other RNA-related data types, such as RNA bisulfite sequencing and photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP), without read count rescaling or normalization.

MeTPeak
MeTPeak is a novel, graphical model-based peak-calling method for transcriptome-wide detection of m6A sites from MeRIP-seq data (https://github.com/compgenomics/MeTPeak) [111]. MeTPeak explicitly models the read count of an m6A site and introduces a hierarchical layer of beta variables to capture the variances and a hidden Markov model (HMM) to characterize the reads dependency across a site. In addition, a constrained Newton’s method and a log-barrier function are developed to estimate analytically intractable, positively constrained beta parameters. MeTPeak deploys a hierarchical beta-binomial model to depict the variance of reads enrichment and an HMM to account for the dependency of neighboring enrichment. MeTPeak is an open-source R package in which the computationally heavy core of the algorithm is written in C++.

txCoords
txCoords is a novel and easy-to-use Web application for transcriptomic peak remapping (http://www.bioinfo.tsinghua.edu.cn/txCoords) [112]. txCoords can be used to correct incorrectly reported transcriptomic peaks and retrieve the true sequences. It also supports visualization of the remapped peaks in a schematic figure or in the UCSC Genome Browser.

Annotation and visualization of RNA modification

CAn
Hauenschild et al. [113] developed an RNA modification visualization tool called CoverageAnalyzer (CAn), which allows intuitive visualization and assisted inspection of RNA-seq profiles for RT signatures of modifications (https://sourceforge.net/projects/coverageanalyzer/). CAn takes SAM files from N user-specified samples as input. A built-in pipeline creates Pileup-format files and converts them to Profile files, which provide position-wise information including position; reference base; coverage; mismatch rate M; number of (#) As, #Gs, #Ts and #Cs; and arrest rate A. Afterward, statistics are gathered for reference sequences, including ID, file path, length, sequence, coverage peak, number of high-arrest sites, high-mismatch sites, heterogeneous mismatch sites and mapped reads. Based on this information, users can manually sort or set thresholds to filter and visualize RNA modification RT signatures in the graphical user interface (GUI). CAn is highly conducive to the extraction of complete RT signatures, as it gives the user full control of all thresholds for visualization, identification and discrimination.

MetaPlotR
MetaPlotR is a Perl and R pipeline to easily generate metagenes for any organism for which a genome and transcript annotation
is available through the UCSC Genome Browser database (https://github.com/olarerin/metaPlotR) [114].

RCAS
The RNA Centric Annotation System (RCAS) is an R package to ease the process of creating gene-centric annotations and analysis for the genomic regions of interest obtained from various RNA-based omics technologies (http://bioconductor.org/packages/release/bioc/html/RCAS) [115]. The RCAS R package uses different R functions to perform annotation summarization, GO term and gene set enrichment analysis and de novo sequence motif discovery. The Web interface allows users to upload a single BED file, which is used as the main input to RCAS, and to select the analysis modules. Users can select one of four reference genome assemblies and one annotation database for the gene set enrichment analysis module. The intervals in the BED file can optionally be down-sampled. On submission, the job is enqueued to run RCAS in the background and generate the specified HTML report. Once RCAS has generated the report, the requester can access it online or download it in a bundle along with any produced output files.

Computational models

RNA methylation has been known for decades and occurs in different RNA types of numerous species. A growing body of evidence indicates that RNA methylation plays an important role in RNA splicing, posttranscriptional gene expression regulation, extensive translation of circRNA, neuronal functions and many different stages of the RNA life cycle [116, 117]. This reversible RNA methylation adds a new dimension to the developing picture of posttranscriptional regulation of gene expression [118]. However, experimental technologies are not cost-effective for directly identifying RNA methylation sites or analyzing RNA methylation function. As complements to experimental techniques, computational models can facilitate the analysis based on RNA sequences or RNA-seq data. Here, we summarize the well-established methods for detecting potential RNA methylation sites.

RNA methylation sites can be predicted with computational models in the following two ways. We can construct sequence-based models to predict potential RNA methylation sites based on training samples (known methylation sites versus nonmethylation sites) and unlabeled samples (genomic sequences or cDNA sequences). We can also predict RNA methylation sites based on sequencing data, such as RNA-seq, MeRIP-sequencing and BS-sequencing. Further analysis can then be made based on the predicted sites, such as differential analysis, annotation, visualization and functional analysis.

Sequence-based site prediction models

HAMR
During RT, modifications may lead to RT signatures, including RT arrest or mis-incorporation. These signatures manifest in the cDNA as abortive products or mis-incorporated nucleotides, respectively, both of which can be captured by RNA-seq.

Ryvkin et al. [91] developed a statistical method to identify RNA modification sites based on nucleotide mis-incorporation by RT. The method detects modifications by testing one of two null hypotheses. The simplest null hypothesis assumes the site is homozygous with the reference allele. Taking this as the null hypothesis results in any nonreference nucleotide above the base-calling error rate being called as a candidate modification. A more conservative null hypothesis assumes only that the genotype is biallelic. Taking this as the null hypothesis results in sites with three or more nucleotides sequenced at a rate higher than base-call errors being called as candidate modification sites. In addition, the authors developed a kNN-based classifier to predict the modification type. Using small RNA-seq data, HAMR was able to detect 92% of all known human tRNA modification sites that are predicted to affect RT activity, and the authors could distinguish two classes of A and two classes of G modifications with 98 and 79% accuracy, respectively. However, HAMR cannot distinguish single-nucleotide polymorphisms (SNPs) from RNA modifications. Moreover, HAMR is mainly built on small RNA-seq and tRNA modification data. Therefore, the performance of HAMR in RNA modification detection for other data needs to be further validated.

PPUS
Ψ formation is known to be catalyzed by PUS, and Ψ is present in different categories of noncoding RNAs such as tRNAs, rRNAs and snRNAs. Li et al. [94] proposed a new platform called PPUS to identify PUS-specific Ψ sites. A sliding window strategy is used to obtain the nucleotides around the Ψ sites as classification features. An SVM classifier is then used to predict Ψ sites. However, PPUS can only identify Ψ sites in human and S. cerevisiae.

iRNA-Methyl
Chen et al. [93] proposed PseDNC to incorporate both the local and global sequence pattern information of the queried RNA sequence, which is defined as:

$$D = [d_1 \; d_2 \; \cdots \; d_{16} \; d_{16+1} \; \cdots \; d_{16+\lambda}]^{\mathrm{T}} \tag{1}$$

with

$$d_u = \begin{cases} \dfrac{f_u}{\sum_{i=1}^{16} f_i + w \sum_{j=1}^{\lambda} \theta_j}, & 1 \le u \le 16 \\[2ex] \dfrac{w\,\theta_{u-16}}{\sum_{i=1}^{16} f_i + w \sum_{j=1}^{\lambda} \theta_j}, & 16 < u \le 16 + \lambda \end{cases} \tag{2}$$

where $f_u$ ($u = 1, 2, \ldots, 16$) is the normalized occurrence frequency of the u-th nonoverlapping dinucleotide, $\lambda$ is the number of the total counted ranks of the correlations along an RNA sequence of length $L$, and $w$ is the weight factor. The correlation factor $\theta_j$ represents the j-tier structural correlation factor between all the j-th most contiguous dinucleotides:

$$\theta_j = \frac{1}{L - j - 1} \sum_{i=1}^{L-j-1} \Theta_{i,i+j} \quad (j = 1, 2, \ldots, \lambda;\; \lambda < L), \tag{3}$$

with $\Theta_{i,i+j}$ being the coupling factor given by:

$$\Theta_{i,i+j} = \frac{1}{v} \sum_{u=1}^{v} \left[ P_u(D_i) - P_u(D_{i+j}) \right]^2, \tag{4}$$

where $v$ is the number of RNA physicochemical properties considered and $P_u(D_i)$ is the value of the u-th property for the dinucleotide $D_i$ at position $i$.

Finally, the feature vectors are fed into an SVM for site prediction. In the jackknife test, iRNA-Methyl shows better prediction performance than the traditional BLAST approach.
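
The PseDNC construction of Equations (1)–(4) can be illustrated with a short Python sketch. This is not the iRNA-Methyl implementation: the function names, the default values of λ (`lam`) and `w`, and above all the property table returned by `toy_properties` are hypothetical placeholders (two made-up values per dinucleotide), whereas the published predictor relies on three measured physicochemical properties. The sketch also assumes that dinucleotides are read with a sliding window of step 1, which is what the index range of Equation (3) implies.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
DINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]  # 16 entries


def toy_properties():
    """Hypothetical property table with v = 2 made-up values per dinucleotide."""
    return {
        d: (NUCLEOTIDES.index(d[0]) / 3.0, NUCLEOTIDES.index(d[1]) / 3.0)
        for d in DINUCLEOTIDES
    }


def coupling(p_a, p_b):
    """Equation (4): mean squared difference over the v properties."""
    return sum((a - b) ** 2 for a, b in zip(p_a, p_b)) / len(p_a)


def pse_dnc(seq, props, lam=2, w=0.5):
    """Return the (16 + lam)-dimensional vector D of Equations (1)-(2)."""
    length = len(seq)
    if not 1 <= lam <= length - 2:
        raise ValueError("lam must satisfy 1 <= lam <= len(seq) - 2")
    dinucs = [seq[i:i + 2] for i in range(length - 1)]  # D_1 ... D_{L-1}

    # f_u: normalized occurrence frequency of each of the 16 dinucleotides.
    freqs = [dinucs.count(d) / len(dinucs) for d in DINUCLEOTIDES]

    # theta_j (Equation 3): average coupling between D_i and D_{i+j}.
    thetas = []
    for j in range(1, lam + 1):
        n_terms = length - j - 1
        total = sum(
            coupling(props[dinucs[i]], props[dinucs[i + j]]) for i in range(n_terms)
        )
        thetas.append(total / n_terms)

    # Shared denominator of both cases in Equation (2).
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


vector = pse_dnc("AUGGCAUCGA", toy_properties(), lam=2, w=0.5)
print(len(vector), round(sum(vector), 6))
```

Because both cases of Equation (2) share one denominator, the components of the vector always sum to 1, which is a convenient sanity check before passing the features to a classifier.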


m6Apred
Chen et al. [92] developed m6Apred, an SVM-based computational model, to identify m6A sites in the S. cerevisiae transcriptome. To the best of our knowledge, m6Apred is the first sequence-based m6A site prediction model. m6Apred uses a sequence encoding method that depicts the nucleotide chemical properties as well as the density information of each nucleotide in RNA sequences. The four different kinds of nucleotides found in RNA, adenine (A), guanine (G), cytosine (C) and uracil (U), are grouped according to three chemical properties, namely chemical structure, chemical binding and chemical functionality, as shown in Table 4. Thus, each nucleotide $N_i$ is defined by three coordinates $(x_i, y_i, z_i)$ with:

$$x_i = \begin{cases} 1 & \text{if } N_i \in \{A, G\} \\ 0 & \text{if } N_i \in \{C, U\} \end{cases}, \quad y_i = \begin{cases} 1 & \text{if } N_i \in \{A, C\} \\ 0 & \text{if } N_i \in \{G, U\} \end{cases}, \quad z_i = \begin{cases} 1 & \text{if } N_i \in \{A, U\} \\ 0 & \text{if } N_i \in \{C, G\} \end{cases}. \tag{5}$$

Then, the density $d_i$ of any nucleotide $s_i$ at position $i$ in the RNA sequence is also included by:

$$d_i = \frac{1}{|s_i|} \sum_{j=1}^{l} f(s_j), \quad f(s_j) = \begin{cases} 1 & \text{if } s_j = q \\ 0 & \text{other cases} \end{cases}. \tag{6}$$

Finally, the encoded nucleotide chemical properties and nucleotide densities are fed into an SVM for prediction. m6Apred obtains an area under the curve (AUC) of 0.84 in the jackknife test, showing considerable accuracy in predicting m6A sites in yeast. More importantly, m6Apred is not sensitive to the selection of negative data, which are difficult to obtain in practice. However, whether m6Apred can be used to predict mammalian m6A sites has not been tested.

Table 4. Chemical property of nucleotide in RNA sequence

Chemical property    Class         Nucleotides
Ring structure       Purine        A, G
                     Pyrimidine    C, U
Functional group     Amino         A, C
                     Keto          G, U
Hydrogen bond        Strong        C, G
                     Weak          A, U

RAM-ESVM
Chen et al. [103] developed an ensemble classifier, called RAM-ESVM, for detecting m6A sites in the S. cerevisiae transcriptome. RAM-ESVM uses PseDNC together with SVM (SVM-PseKNC), motif features together with SVM (SVM-motif) and GkmSVM as basic classifiers. PseDNC represents the RNA sequences and is defined as:

$$D = [d_1 \; d_2 \; \cdots \; d_{16} \; d_{16+1} \; \cdots \; d_{16+\lambda}]^{\mathrm{T}} \tag{7}$$

with

$$d_u = \begin{cases} \dfrac{f_u}{\sum_{i=1}^{16} f_i + w \sum_{j=1}^{\lambda} \theta_j}, & 1 \le u \le 16 \\[2ex] \dfrac{w\,\theta_{u-16}}{\sum_{i=1}^{16} f_i + w \sum_{j=1}^{\lambda} \theta_j}, & 16 < u \le 16 + \lambda \end{cases}, \tag{8}$$

where $f_u$ ($u = 1, 2, \ldots, 16$) is the normalized occurrence frequency of the u-th nonoverlapping dinucleotide, $\lambda$ is the number of the total counted ranks of the correlations along an RNA sequence and $w$ is the weight factor. The correlation factor $\theta_j$ represents the j-tier structural correlation factor between all the j-th most contiguous dinucleotides $D_i = R_i R_{i+1}$,

$$\theta_j = \frac{1}{L - j - k + 1} \sum_{i=1}^{L-j-k+1} \Theta(D_i, D_{i+j}) \quad (j = 1, 2, \ldots, \lambda;\; \lambda < L), \tag{9}$$

with $\Theta(\cdot)$ being the correlation function and $k$ the tuple length ($k = 2$ for dinucleotides). For the motif features, each sequence is represented as a Boolean vector. If the substring selected as a motif feature appears in a sequence, the feature value is 1; otherwise, the value is 0. GkmSVM is then used for gapped k-mer-based classification. The classification results of the three classifiers are combined by voting to give the final prediction score, as shown in Figure 1.

Figure 1. The flowchart of RAM-ESVM, which describes the basic steps to predict m6A methylation sites in the S. cerevisiae transcriptome. It uses ensemble SVM classifiers as well as some novel sequence features.
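
The Boolean motif encoding and the voting step described for RAM-ESVM can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the motif list in the usage line is hypothetical, the three base classifiers are represented by their 0/1 output labels, and a simple unweighted majority vote is assumed because the text does not specify how the votes are weighted.

```python
def motif_features(seq, motifs):
    """Boolean motif vector: 1 if the motif occurs anywhere in seq, else 0."""
    return [1 if motif in seq else 0 for motif in motifs]


def vote(labels):
    """Combine 0/1 labels from the base classifiers.

    Returns (score, label): the fraction of positive votes and the
    majority decision (assumed unweighted voting).
    """
    labels = list(labels)
    score = sum(labels) / len(labels)
    return score, int(score > 0.5)


# Hypothetical motifs and base-classifier outputs, for illustration only.
features = motif_features("AUGGACUAA", ["GAC", "AAC", "CCC"])
score, label = vote([1, 1, 0])  # e.g. SVM-PseKNC, SVM-motif, GkmSVM
print(features, label)
```

With three base classifiers a tie cannot occur, so the majority label is always defined.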


RNAMethPre
Considering different input sequences, i.e. genomic sequences and cDNA sequences, Xiang et al. [96] developed an SVM-based model to predict m6A sites in human, mouse and mammal. The framework is shown in Figure 2. Features such as the nucleotide sequence position, the nucleotide k-mer frequency, a relative position value calculated from the absolute distance from the transcript start site and the stability of the local structure are combined and fed into an SVM classifier for predicting the m6A sites. Just like SRAMP, RNAMethPre provides a full transcript mode and a mature mRNA mode. For performance enhancement, RNAMethPre integrates all four abovementioned features in the mature mRNA mode, but only includes the first two features in the full transcript mode. The validation results show that the performance of RNAMethPre is superior to that of SRAMP.

SRAMP
Zhou et al. [99] established a mammalian m6A site predictor named SRAMP (sequence-based RNA adenosine methylation site predictor) under the Random Forest framework, which is shown in Figure 3. SRAMP considers the positional binary encoding of the nucleotide sequence, the kNN encoding and the nucleotide pair spectrum encoding. In the positional binary encoding, the four different kinds of nucleotides found in RNA, adenine (A), guanine (G), cytosine (C) and uracil (U), are translated into the binary vectors (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0) and (0, 0, 0, 1). Then, the kNN encoding depicts how much the 21 nt flanking window of one query sample resembles those of other m6A sites. The flanking window of the query sample is first compared with all reference samples to obtain pair-wise similarity scores:

$$\text{Pair-wise similarity} = \sum_{i=1}^{W} \mathrm{NUC44}(q_i, r_i), \tag{10}$$

where $q_i$ and $r_i$ are the nucleotides at the i-th position of the flanking windows of the query sample and the reference sample, respectively. For the third type of feature, the frequency of a spaced nucleotide pair $np_i$ is defined as:

$$\mathrm{Frequency}(np_i) = \frac{C(np_i)}{W - d - 1}, \tag{11}$$

where $C(np_i)$ is the count of $np_i$ inside a flanking window, $W$ is the window size and $d$ is the space between the two nucleotides. SRAMP also considers the secondary structure predicted by RNAfold as a classification feature. The secondary structures are classified into hairpin loop, multiple loop, interior loop, paired and bulged loop, each of which is encoded as a binary vector. Random Forest classifiers are then trained with each feature. Finally, the prediction scores of the Random Forest classifiers trained with different feature encodings are combined using the weighted summing formula shown below:

$$S_{\mathrm{combined}} = \sum_{i=1}^{n} a_i S_i, \tag{12}$$

where $S_i$ and $a_i$ are the prediction score and the weight for the classifier trained with the i-th encoding, respectively, and $n$ is the total number of classifiers taken into account. The overall AUROC for the full transcript mode from 5-fold cross validation (CV) is 0.891, showing that SRAMP achieves good performance in full transcript mode. However, the prediction performance in mature mRNA mode can be further improved.

RNAMethylPred
Jia et al. [119] proposed a new bioinformatics model, named RNAMethylPred, for the large-scale, rapid identification of m6A sites. It was developed by incorporating Bi-profile Bayes (BPB), dinucleotide composition (DNC) and kNN scores as selected features and deploying an SVM as the classifier to perform the predictions, as shown in Figure 4. First, with BPB, the queried sequence s is encoded into a probability vector

Figure 2. The flowchart shows the basic idea of RNAMethPre, which is used to predict m6A sites in human, mouse and mammal. RNAMethPre uses multiple features of mRNA, such as flanking sequences, local secondary structure information and relative position information, and feeds them into an SVM classifier for m6A site prediction.


Figure 3. The flowchart shows the basic steps of SRAMP. SRAMP combines three Random Forest classifiers that exploit the positional nucleotide sequence pattern, the
kNN information and the position-independent nucleotide pair spectrum features, respectively.
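
Two of the SRAMP encodings and the score combination of Equation (12) are simple enough to sketch in Python. The sketch below is an illustration rather than the SRAMP code: the function names are invented here, the example window is arbitrary, and the weights in the usage lines are hypothetical values, since the actual weights are not given in this review. The pair spectrum follows Equation (11), where a pair with spacing d consists of positions i and i + d + 1, so that a window of size W contains W − d − 1 such pairs.

```python
from itertools import product

# Positional binary encoding as listed in the text: A, G, C, U.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "G": (0, 1, 0, 0),
    "C": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}


def positional_binary(window):
    """Concatenate the 4-bit vectors of all nucleotides in the window."""
    encoded = []
    for nucleotide in window:
        encoded.extend(ONE_HOT[nucleotide])
    return encoded


def pair_spectrum(window, d):
    """Equation (11): frequency of each nucleotide pair separated by d positions."""
    n_pairs = len(window) - d - 1
    if n_pairs < 1:
        raise ValueError("window too short for this spacing")
    counts = {a + b: 0 for a, b in product("AGCU", repeat=2)}
    for i in range(n_pairs):
        counts[window[i] + window[i + d + 1]] += 1
    return {pair: count / n_pairs for pair, count in counts.items()}


def combine_scores(scores, weights):
    """Equation (12): weighted sum of the per-encoding classifier scores."""
    return sum(a * s for a, s in zip(weights, scores))


bits = positional_binary("GGACU")
spectrum = pair_spectrum("GGACU", d=1)
combined = combine_scores([0.8, 0.6, 0.4], [0.5, 0.3, 0.2])  # hypothetical values
print(len(bits), spectrum["GA"], combined)
```

The pair spectrum is position independent, which complements the positional binary encoding: the former summarizes composition inside the window, while the latter preserves the exact location of each nucleotide.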

Figure 4. The flowchart of RNAMethylPred, which describes the basic steps to identify m6A sites. RNAMethylPred adopts BPB, dinucleotide composition and kNN scores for feature extraction, followed by an SVM for classification.

$V = (p_1, p_2, \ldots, p_n, p_{n+1}, \ldots, p_{2n})$, where $p_i$ ($i = 1, 2, \ldots, n$) denotes the posterior probability of each nucleic acid at the i-th position in the positive samples, and $p_i$ ($i = n + 1, n + 2, \ldots, 2n$) denotes the posterior probability of each nucleic acid at the i-th position in the negative samples, with $n$ being the length of the queried sequences. Then, DNC is defined as:

$$P_{ab} = \frac{N_{ab}}{N_{a\bullet}}, \qquad P'_{ab} = \frac{N_{ab}}{n - 1}, \tag{13}$$

where $ab$ stands for an adjoining dinucleotide, $N_{ab}$ is the number of occurrences of that dinucleotide in an RNA segment sample, $N_{a\bullet}$ is the number of adjoining dinucleotides whose first nucleotide is $a$ ($\bullet$ stands for any nucleotide) and $n$ is the length of the RNA sample. Thus, the DNC encoding contains features from both $P_{ab}$ and $P'_{ab}$. Next, the kNNs of the queried sequence in both the positive and negative sets are picked out according to RNA local sequence similarity to obtain the kNN score. The similarity between two sequence fragments A and B is formulated as:

$$S(A, B) = \sum_{1 \le i \le n} \mathrm{Score}(A[i], B[i]), \tag{14}$$


where $A[i]$ and $B[i]$ represent the nucleotides at position i in the two RNA sequence fragments. The similarity score for two nucleotides a and b is defined as:

$$\mathrm{Score}(a, b) = \begin{cases} +2, & \text{if } a = b \\ -1, & \text{otherwise.} \end{cases} \tag{15}$$

The kNN score is obtained by calculating the percentage of positive neighbors among the kNNs of the queried sequence. As a result, RNAMethylPred achieves an accuracy of 76.51% and an MCC of 0.5302, showing considerable performance in predicting m6A sites.

iRNA-PseU
Chen et al. [101] developed a new predictor called iRNA-PseU to identify Ψ sites. iRNA-PseU follows iRNA-PseColl in encoding the nucleotide chemical property and nucleotide density, which are used as classification features in an SVM classifier to predict Ψ sites. iRNA-PseU is reported to achieve considerable accuracy in jackknife tests on the benchmark data sets of human, mouse and S. cerevisiae.

RAMPred
m1A has been found to have major influences on the structure and function of tRNA and rRNA. Chen et al. [100] further developed a new platform called RAMPred to identify the occurrence sites of m1A modifications in the human, mouse and S. cerevisiae transcriptomes. RAMPred uses almost the same structure as m6Apred, and the validation results show that RAMPred achieves satisfactory performance in predicting m1A modifications.

iRNA-PseColl
Feng et al. [106] developed a new platform called iRNA-PseColl to identify the occurrence sites of several different types of RNA modifications. iRNA-PseColl is the first method designed for multiple RNA modifications. It incorporates both individual and collective features of the sequence elements, as shown in Figure 5. The local features of the i-th nucleotide, $N_i = (x_i, y_i, z_i)$, are the same as those defined in the m6Apred method. The occurrence frequency of a nucleotide, which describes its distribution along the sequence, is defined as:

$$D_i = \frac{1}{\|L_i\|} \sum_{j=1}^{\ell} f(N_j), \tag{16}$$

where $D_i$ is the density of the nucleotide $N_i$ at site i of an RNA sequence, $\|L_i\|$ is the length of the sliding substring concerned, $\ell$ denotes each of the site locations counted in the substring and

$$f(N_j) = \begin{cases} 1, & \text{if } N_j = \text{the nucleotide concerned} \\ 0, & \text{otherwise.} \end{cases} \tag{17}$$

Then, we combine the local feature and collective feature for the i-th nucleotide, which is defined by a set of four variables:

$$N_i = (x_i, y_i, z_i, D_i). \tag{18}$$

Finally, the prediction is achieved by SVM based on the features defined in (18).

Figure 5. The flowchart of iRNA-PseColl. It aims to identify the occurrence sites of multiple RNA modifications. It incorporates both local and collective features for
classification, and SVM is adopted as the final classifier.


Figure 6. The workflow of RAM-NPPS. It is based on multi-interval NPPS for feature extraction, and SVM classifier for prediction of m6A sites within RNA sequences.

Validation results show that iRNA-PseColl can get considerable performance in predicting different types of RNA modifications.
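
The per-nucleotide encoding of Eqs 16 to 18 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the (x, y, z) chemical-property codes are not given in this section and are taken here from the convention commonly used by the m6Apred family of predictors (ring structure, functional group, hydrogen bonding), the "sliding substring" of Eq. 16 is assumed to be the prefix of the sequence ending at site i, and `encode_sequence` is a hypothetical name.

```python
# ASSUMPTION: chemical-property codes (x, y, z) as commonly used with m6Apred;
# they are not spelled out in this section.
CHEMICAL_PROPERTY = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode_sequence(seq):
    """Return one (x, y, z, D) tuple per nucleotide, as in Eq. 18."""
    counts = {}
    features = []
    for i, nt in enumerate(seq, start=1):
        counts[nt] = counts.get(nt, 0) + 1
        # ASSUMPTION: D_i is the frequency of this nucleotide in the prefix 1..i.
        density = counts[nt] / i
        x, y, z = CHEMICAL_PROPERTY[nt]
        features.append((x, y, z, density))
    return features


def flatten(features):
    """Concatenate the per-nucleotide tuples into one vector for a classifier."""
    return [value for item in features for value in item]
```

The flattened vector has four entries per nucleotide and could be passed to any SVM implementation; the choice of kernel and parameters is outside the scope of this sketch.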

RAM-NPPS
Xing et al. [104] proposed a sequence-based predictor called
RAM-NPPS for identifying m6A sites within RNA sequences of
different species. RAM-NPPS first encodes the input sequences
by nucleotide pair position specificity (NPPS) algorithm. Then,
the resulting feature vectors are joined together as the input for
the following SVM classifier to make prediction. Figure 6 shows
the NPPS feature encoding process. For the queried RNA
sequence, it can be encoded by:

$$P = P^{+} - P^{-}, \tag{19}$$

where $P^{+}$ is formulated as:

$$P^{+} = p_1^{+}\, p_2^{+} \cdots p_k^{+} \cdots p_{l-1}^{+}\, p_l^{+}, \tag{20}$$

with $p_k^{+}$ corresponding to the k-th nucleotide and l being the length of the sequence.

Two matrices $T_s^{+}$ and $T_d^{+}$ are defined to calculate $p_k^{+}$. $T_s^{+}$ has size $4 \times l$ and represents the single-nucleotide occurrence probability; its rows represent $\{A, C, G, U\}$, respectively, and its columns represent the positions along the sequence. $T_d^{+}$ has size $16 \times l$ and represents the occurrence probability of nucleotide pairs, with rows representing $\{A, C, G, U\} \times \{A, C, G, U\}$. Suppose the dinucleotide formed by the k-th nucleotide and the (k + n)-th nucleotide is 'AB', then

$$p_k^{+} = \frac{P(A \cap B)}{P(B)} = \frac{F^{+}_{ab,k}}{f^{+}_{b,k+n}},$$

where ab is the index of 'AB' in $\{A, C, G, U\} \times \{A, C, G, U\}$, and b is the index of 'B' in $\{A, C, G, U\}$. The evaluation on three data sets shows that RAM-NPPS is effective and robust for the identification of m6A sites across different species. However, running RAM-NPPS incurs a heavy computational load.

NGS data-based site detection models

exomePeak
Meng et al. [120] proposed a pipeline for the analysis of MeRIP-seq data by combining several existing tools with a novel exome-based peak-calling and differential analysis approach, shown in Figure 7. exomePeak first extracts and connects all the exons of a specific gene and then detects peaks using a sliding window with C-test to determine the methylation site. Thus, it can be considered as projecting transcriptome onto the genome, and thus avoids the transcriptome heterogeneity. The test results show that it can achieve fairly robust m6A peak detection. However, it has several major limitations. First, exomePeak does not model the reads variance within transcripts and across replicates. Second, exomePeak ignores the dependency of reads enrichment, and thus may miss true peaks with low enrichment. Finally, exomePeak calls peaks by a bin-based method, which makes it difficult to approach base resolution.

Figure 7. The basic steps of exomePeak, which provides a general pipeline for MeRIP-Seq data processing, predicts m6A sites for samples from a single condition, or identifies differentially methylated sites for samples from multiple conditions.

MeTPeak
Cui et al. [111] further developed a graphic model-based peak-calling algorithm, MeTPeak, to detect m6A sites from MeRIP-seq data. It detects m6A peaks on each gene separately by dividing the particular gene into N bins with the length of the sequencing fragment L. A mixture of beta-binomial distributions is set up to describe the reads count in each bin, while an HMM is adopted to depict the reads dependency between consecutive bins. To be more specific, the reads counts in the m-th pair of IP and input samples in bin n are denoted as $X_{mn}$ and $Y_{mn}$, which both follow Poisson distributions with parameters $S_{IP,m}$, $S_{input,m}$, $\lambda_{IP,m}$ and $\lambda_{input,m}$. $S_{IP,m}$ and $S_{input,m}$ are the total reads in the m-th IP and input samples, respectively, and $\lambda_{IP,m}$ and $\lambda_{input,m}$ are the normalized Poisson rates. With a beta prior distribution for $p_n$,


which represents the methylation percentage at the n-th bin, $X_{mn}$ follows the beta-binomial distribution:

$$P(X_{mn} \mid Z_n, \alpha, \beta) = \prod_{k=1}^{2} \left( C\, \frac{\Gamma(X_{mn} + \alpha_k)\, \Gamma(Y_{mn} + \beta_k)\, \Gamma(\alpha_k + \beta_k)}{\Gamma(T_{mn} + \alpha_k + \beta_k)\, \Gamma(\alpha_k)\, \Gamma(\beta_k)} \right), \tag{21}$$

where $Z_n \in \{1, 2\}$ denotes the unknown hidden methylation status, with 1 representing methylated and 2 otherwise, $T_{mn} = X_{mn} + Y_{mn}$, and $C = \frac{\Gamma(T_{mn} + 1)}{\Gamma(X_{mn} + 1)\, \Gamma(Y_{mn} + 1)}$ is the normalization constant. $\alpha = [\alpha_1\ \alpha_2]^T$ and $\beta = [\beta_1\ \beta_2]^T$ are the unknown parameters in the model; they are shared by all bins across replicates and can thus, to some extent, depict the variance of reads counts across replicates. Then, an iterative Expectation Maximization (EM) algorithm is conducted to predict the methylation status $Z_n$ and the model parameters. As a result, MeTPeak is shown to be more robust against data variance, small numbers of replicates and data outliers, and is more sensitive to lowly enriched peaks than exomePeak. However, limitations still exist in MeTPeak. For example, the search for m6A sites is limited to annotated genes. MeTPeak is also based on the bin method, which makes it hard to reach base resolution.

meRanTK
Rieder et al. [109] developed a tool kit for the analysis of RNA-BSseq or Aza-IP data to detect m5C methylation sites. To our knowledge, it is the first specialized software for RNA-BSseq data analysis. meRanTK includes five multithreaded programs, namely meRanT, meRanG, meRanCall, meRanCompare and meRanAnnotate, as shown in Figure 8. Among these tools, both meRanT and meRanG are RNA-BSseq alignment tools. The difference between these two alignment tools is that the former maps reads to a preassembled set of transcripts, while the latter maps to a bisulfite-converted genome. meRanCall then extracts the methylation state of individual cytosines from the previous alignment. Based on the detected m5C sites, meRanCompare can help to identify whether a site is differentially methylated between two experimental conditions, while meRanAnnotate can assign genomic annotations and distance measurements to each individual candidate m5C.

m6aViewer
Antanaviciute et al. [121] developed a cross-platform application, m6aViewer, for the detection, analysis and visualization of m6A peaks from MeRIP-seq data. The workflow is shown in Figure 9, where sorted and indexed BAM files are fed into m6aViewer as inputs. Then, sequencing adapter contamination and poor-quality bases are removed by Cutadapt software, followed by sequence alignment to reference genomes. The resulting alignments are sorted and indexed using SAMtools to obtain BAM files for the subsequent peak-calling module. In the preprocessing step, the fragment coverage distribution is counted to compensate for the bias of the reads coverage distribution; thus, it can achieve precise determination of sequenced RNA fragment coverage. Then, base-level candidate peak positions are identified with local maxima as well as Fisher's exact test. High-confidence peaks are called by a combination of P-value and FDR-based cutoffs, as well as IP enrichment and coverage filters. Because peaks arising from multiple m6A sites in close proximity are visually indistinguishable and cannot be identified accurately, m6aViewer proposes a mixture distribution-based approach to deconvolute overlapping peaks and pinpoint m6A methylation sites with increased precision. It considers the fragment coverage distribution in an enriched region as a mixture of coverage distributions. The EM algorithm is adopted to establish the combination of mixtures that best depicts the observed RNA fragment distribution. To avoid overfitting, the Bayesian information criterion is used to account for both the likelihood and the model complexity. m6aViewer combines the above sequence-based model with a feature-based approach to achieve practicable precision and recall rates. Features including transcript information, sequence composition, sequencing data features surrounding the peak or conservation information are obtained and fed into the subsequent Random Forest classifiers, whose results vote for the final classification score. Validation on multiple published m6A-seq data sets shows that m6aViewer can identify high-confidence methylated residues with more precision than other existing approaches.

Discussion and conclusion
RNA is the intermediate molecule between DNA and proteins in the chain that links genetic information contained in genes to its expression in functional proteins, by either carrying this information in the form of mRNA or participating in mRNA expression, splicing, stability and translation in the form of noncoding RNAs. RNA methylation is a reversible posttranscriptional modification to RNA that adds a new dimension to the developing picture of gene expression regulation. Thanks to the advances in RNA detection and sequencing technologies, it is now known to play critical roles in multiple biological processes. However, the sequencing protocol of RNA methylation is highly different from previous sequencing technologies. Although

Figure 8. The workflow of meRanTK, which is specialized for RNA-BSseq data mapping, alignment, m5C site detection and annotation.

Figure 9. The basic steps of m6aViewer, which can identify high-confidence methylated residues more precisely. It also provides a GUI for convenience.


some databases, Web servers and software as well as computational models have been established for RNA methylation, the functions of most RNA methylations and their alterations in biological processes and human diseases remain largely unknown for lack of effective and precise processing methods [122].

A significant proportion of m6A methylation sites are enriched in the 5′ UTR, around the stop codon and in the proximal region of the 3′ UTR of transcripts, while the location of miRNA-targeted sites at the 5′ end and 3′ end of the 3′ UTR suggests a potential link between m6A methylation and miRNA targeting sites, which may thus regulate miRNA-related pathways [123–125]. To be more specific, bioinformatics analysis has suggested that the miRNA miR-145 might target the 3′ UTR of YTHDF2 mRNA; YTHDF2 is an m6A reader protein that helps to recognize mRNA m6A sites to mediate mRNA degradation [140]. Another study showed that manipulation of miRNA expression or sequences altered m6A modification levels by modulating the binding of METTL3 methyltransferase to mRNAs containing miRNA targeting sites, thus regulating m6A formation of mRNAs [142]. Knockdown of the m6A demethylase FTO has been found to affect the steady-state levels of certain miRNAs [145]. Therefore, the prediction of associations between RNA methylation and miRNAs is of great interest in biogenesis research and other fields. It will help better understand the complex regulatory effects of both miRNA and RNA methylation. Furthermore, most studies have focused on the association between m6A methylation and miRNAs. This can be further expanded to the other types of RNA methylation as well.

Aberrant m6A modification patterns have been linked to diverse human diseases, including infertility, various forms of cancer, obesity, diabetes, depression and neurodevelopmental disorders, etc. For example, the m6A hypermethylation of IDH1/2 is found to play an important role in tumor progression, which can finally result in acute myeloid leukemia [76]. The m6A hypomethylation of RUNX1T1 is also found to participate in the disease progression of obesity. Thus, different methylation statuses in different genes may result in different phenotypes. But it is difficult and expensive to uncover the pathology of different types of RNA methylation in different diseases by experiment alone. Therefore, the prediction of RNA methylation–disease associations, which can strongly guide biological experiments, is of great significance in biological, medical and other fields. Based on network or machine learning models, the association probability between RNA methylation sites and diseases could be quantified, and RNA methylation site–disease pairs with high confidence could be selected for further experimental validation. Thus, it will help understand the biogenesis, regulation and function of RNA methylation and the molecular mechanisms of human disease at the transcriptomic level, and discover biomarkers and drugs for human disease diagnosis, treatment, prognosis and prevention with less time and cost in biological experiments.

The first step in predicting associations between RNA methylation and diseases or miRNAs is to identify the RNA methylation sites precisely. Then, new databases that annotate RNA methylation sequences, provide comprehensive information on RNA methylation sites, or collect and display experimentally confirmed RNA methylation–disease associations should be built for further analysis. To be more specific, network-based models or machine learning models can be built to predict the associations between diseases and those predicted sites based on those databases. Recently, scientists have focused on building computational models to predict RNA methylation sites based on either sequence or high-throughput sequencing data, which has proved to be more time- and cost-effective than biological methods.

In this article, we summarized the known types of RNA methylation, the biological functions of RNA methylation, five RNA methylation-related diseases, seven publicly available RNA methylation-related databases, RNA methylation annotation and visualization tools, etc. Then, we introduced some state-of-the-art Web servers and software as well as computational models for RNA methylation site prediction and differential analysis (Table 5). Most Web servers aim at identifying methylation sites based on sequence, and most are constructed following the five-step guidelines: (1) How to construct a valid benchmark data set to train and test the subsequent predictor? (2) How to represent the biological sequence with a valid mathematical formulation? (3) How to develop an effective prediction algorithm? (4) How to conduct the cross-validation to objectively estimate the performance of the predictor? (5) How to establish a user-friendly interface? These Web servers first build training data and testing data by referring to the RNA methylation motif DRACH (where D = A, G or U; R = A or G; H = A, C or U). Then, they follow different feature encoding schemes to obtain discriminable features and feed them into classifiers for RNA methylation site prediction. Most sequence-based methods use SVM or an SVM-based model as the classifier. Some models, such as SRAMP, adopt Random Forest and kNN for site prediction. With more discriminable features discovered in the future, prediction accuracy could be further improved.

Furthermore, some high-throughput sequencing data-based computational models, such as exomePeak, MeTPeak and m6aViewer, have been developed to predict RNA methylation sites, and can be further extended for differential analysis. exomePeak and MeTPeak run in the R environment and rely on a bin-based method. exomePeak predicts a methylation site by testing small, equally sized regions divided from the transcriptome. If the number of reads of a given bin in the immunoprecipitated sample is higher than that in the input sample, the bin is predicted as a significantly enriched bin. Then, consecutive significantly enriched bins are merged together to form a larger region as a peak. MeTPeak further models those consecutive significantly enriched bins by an HMM to form a peak. However, significantly enriched regions predicted by the above two methods can span a large range, and thus achieve only rough site identification.

m6aViewer is developed in Java. It aims at detecting high-confidence m6A peaks by an EM-based deconvolution method, which helps to pinpoint m6A methylation sites with better precision. m6aViewer also combines the sequencing data-based computational model with a sequence-based predictor to reduce the false-positive rate. A supervised ensemble learning classifier is built to distinguish true-positive m6A sites from false-positive peaks. It is worth noting that the integration of sequence-based computational models with sequencing data-based models could help to further improve the accuracy of site prediction.

Nowadays, a wide range of databases and Web servers for methylation site prediction have been built, providing a variety of methods for RNA methylation data processing. However, most methods focus on m6A methylation. Such resources are expected to be built for the other types of methylation. Furthermore, RNA methylation–miRNA association and RNA methylation–disease association databases, Web servers and computational models should be constructed in the near future, which would help biologists to experimentally unveil the functions and mechanisms of RNA methylation.

Table 5. Comparison list of the databases, Web servers and software

| Name | Type | Year of first version | RNA methylation type | Objective | Species | Maintained | Link |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Databases | | | | | | | |
| RNAMDB | Database | 2011 | N/A | A database for RNA modifications | N/A | No | http://rna-mdb.cas.albany.edu/RNAmods/ |
| MODOMICS | Database | 2013 | N/A | A database of RNA modifications | N/A | Yes | http://modomics.genesilico.pl |
| RADAR | Database | 2014 | A to I | Collection of A-to-I RNA editing sites with annotation | Homo sapiens, M. musculus, D. melanogaster | Yes | http://RNAedit.com |
| MeT-DB | Database | 2015 | m6A | A database for publicly available m6A data sets | N/A | Yes | http://compgenomics.utsa.edu/methylation/ |
| RMBase | Database | 2015 | m6A, m5C, 2'-O-methylation, Ψ | A database for RNA modifications | N/A | System maintaining | http://mirlab.sysu.edu.cn/rmbase/ |
| REDIportal | Database | 2016 | A to I | Collection of A-to-I RNA editing sites with annotation | N/A | Yes | http://srv00.recas.ba.infn.it/atlas/ |
| Sequence-based site prediction tools | | | | | | | |
| HAMR | Web server | 2013 | N/A | To predict RNA modification site (location and methylation class) | N/A | Yes | http://wanglab.pcbi.upenn.edu/hamr |
| M6Apred | Web server | 2015 | m6A | To predict m6A site | Saccharomyces cerevisiae | Yes | http://lin.uestc.edu.cn/server/m6Apred.php |
| iRNA-Methyl | Web server | 2015 | m6A | To predict m6A site | N/A | Yes | http://lin.uestc.edu.cn/server/iRNA-Methyl |
| PPUS | Web server | 2015 | Ψ | To predict Ψ site | N/A | Yes | http://lyh.pkmu.cn/ppus/ |
| AthMethPre | Web server | 2016 | m6A | To predict m6A site | Arabidopsis thaliana | No | http://bioinfo.tsinghua.edu.cn/AthMethPre/index.html |
| RNAMethPre | Web server | 2016 | m6A | To predict m6A site | Homo sapiens, M. musculus | No | http://bioinfo.tsinghua.edu.cn/RNAMethPre/index.html |
| m6ATH | Web server | 2016 | m6A | To predict m6A site | Arabidopsis thaliana | Yes | http://lin.uestc.edu.cn/server/M6ATH |
| PRNAm-PC | Web server | 2016 | m6A | To predict m6A site | N/A | Yes | http://www.jci-bioinfo.cn/pRNAm-PC |
| SRAMP | Web server | 2016 | m6A | To predict m6A site | N/A | Yes | http://www.cuilab.cn/sramp/ |
| RAMPred | Web server | 2016 | m1A | To predict m1A site | N/A | Yes | http://lin.uestc.edu.cn/server/RAMPred |
| iRNA-PseU | Web server | 2016 | Ψ | To predict Ψ site | N/A | Yes | http://lin.uestc.edu.cn/server/iRNA-PseU |
| MethyRNA | Web server | 2016 | m6A | To predict m6A site | N/A | Yes | http://lin.uestc.edu.cn/server/methyrna |
| RAM-ESVM | Web server | 2017 | m6A | To predict m6A site | Saccharomyces cerevisiae | Yes | http://server.malab.cn/RAM-ESVM/ |
| RAM-NPPS | Web server | 2017 | m6A | To predict m6A site | N/A | Yes | http://server.malab.cn/RAM-NPPS/ |
| iRNA-AI | Web server | 2017 | A to I | To predict A-to-I site | N/A | Yes | http://lin.uestc.edu.cn/server/iRNA-AI/ |
| iRNA-PseColl | Web server | 2017 | m1A, m6A, m5C | To predict the occurrence sites of RNA modifications | N/A | Yes | http://lin.uestc.edu.cn/server/iRNA-PseColl |
| NGS data-based site detection tools | | | | | | | |
| MeRIP-PF | Software | 2013 | m6A | To predict m6A-modified peaks | N/A | Yes | http://software.big.ac.cn/MeRIP-PF.html |
| exomePeak R/Bioconductor package | Software | 2013 | m6A | | N/A | | http://www.bioconductor.org/packages/release/bioc/html/exomePeak.html |
| MeTPeak | Software | 2016 | m6A | To predict m6A-modified peaks | N/A | | https://github.com/compgenomics/MeTPeak |
| meRanTK | Software | 2016 | | Alignment, site detection, differential analysis based on BS-seq data | N/A | | http://icbi.at/software/meRanTK/ |
| DRME | Software | 2016 | | Differential analysis | N/A | | https://github.com/lzcyzm/DRME |
| Annotation and visualization of RNA modification tools | | | | | | | |
| CAn | Software | 2016 | | Visualization and analysis of modification signature in deep-sequencing data | N/A | | https://sourceforge.net/projects/coverageanalyzer/ |
| RCAS | Software | 2017 | | | N/A | | http://bioconductor.org/packages/release/bioc/html/RCAS |
| txCoords | Web server | 2016 | | | N/A | | http://www.bioinfo.tsinghua.edu.cn/txCoords |
| MetaPlotR | Software | 2017 | | Create metagene plot | N/A | | https://github.com/olarerin/metaPlotR |

Key Points

• We made a brief introduction of the functions of RNA methylation, eight types of most popular RNA methylation, seven publicly available RNA methylation-related databases, and some important publicly available RNA methylation-related Web servers, software and computational tools for RNA methylation site identification, differential analysis and so on.
• Developing effective computational models to precisely identify methylation sites based on sequence or sequencing data could benefit better understanding of complex functions of RNA methylation.
• Making full use of different types of data sources, such as sequencing data with different technologies, sequences, etc., could benefit more effective discovery of new RNA methylation functions.
• RNA methylation–miRNA regulatory patterns could be predicted based on powerful computational models.
• Complex RNA methylation–disease associations could be predicted based on powerful computational models.

Funding
Fundamental Research Funds for the Central Universities (2017XKQY083 to X.C.).

References
1. Thauer RK. Biochemistry of methanogenesis: a tribute to Marjory Stephenson. 1998 Marjory Stephenson Prize Lecture. Microbiology 1998;144(9):2377–406.
2. Jones PA. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat Rev Genet 2012;13(7):484–92.
3. Chi P, Allis CD, Wang GG. Covalent histone modifications–miswritten, misinterpreted and mis-erased in human cancers. Nat Rev Cancer 2010;10(7):457–69.
4. Rana AK, Ankri S. Reviving the RNA world: an insight into the appearance of RNA methyltransferases. Front Genet 2016;7:99.
5. Machnicka MA, Milanowska K, Osman Oglou O, et al. MODOMICS: a database of RNA modification pathways–2013 update. Nucleic Acids Res 2013;41:D262–7.
6. Cantara WA, Crain PF, Rozenski J, et al. The RNA modification database, RNAMDB: 2011 update. Nucleic Acids Res 2011;39:D195–201.
7. Motorin Y, Helm M. RNA nucleotide methylation. Wiley Interdiscip Rev RNA 2011;2(5):611–31.
8. Desrosiers R, Friderici K, Rottman F. Identification of methylated nucleosides in messenger RNA from Novikoff hepatoma cells. Proc Natl Acad Sci USA 1974;71(10):3971–5.
9. Tuck MT. Partial purification of a 6-methyladenine mRNA methyltransferase which modifies internal adenine residues. Biochem J 1992;288(1):233–40.
10. Bokar JA, Shambaugh ME, Polayes D, et al. Purification and cDNA cloning of the AdoMet-binding subunit of the human mRNA (N6-adenosine)-methyltransferase. RNA 1997;3:1233–47.
11. Yue Y, Liu J, He C. RNA N6-methyladenosine methylation in post-transcriptional gene expression regulation. Genes Dev 2015;29(13):1343–55.


12. Chandola U, Das R, Panda B. Role of the N6-methyladenosine RNA mark in gene regulation and its implications on development and disease. Brief Funct Genomics 2015;14(3):169–79.
13. Lee M, Kim B, Kim VN. Emerging roles of RNA modification: m(6)A and U-tail. Cell 2014;158(5):980–7.
14. Squires JE, Patel HR, Nousch M, et al. Widespread occurrence of 5-methylcytosine in human coding and non-coding RNA. Nucleic Acids Res 2012;40(11):5023–33.
15. Hussain S, Sajini AA, Blanco S, et al. NSun2-mediated cytosine-5 methylation of vault noncoding RNA determines its processing into regulatory small RNAs. Cell Rep 2013;4(2):255–61.
16. Khoddami V, Cairns BR. Identification of direct targets and modified bases of RNA cytosine methyltransferases. Nat Biotechnol 2013;31(5):458–64.
17. al-Arif A, Sporn MB. 2'-O-methylation of adenosine, guanosine, uridine, and cytidine in RNA of isolated rat liver nuclei. Proc Natl Acad Sci USA 1972;69(7):1716–9.
18. Feder M, Pas J, Wyrwicz LS, et al. Molecular phylogenetics of the RrmJ/fibrillarin superfamily of ribose 2'-O-methyltransferases. Gene 2003;302(1-2):129–38.
19. Daffis S, Szretter KJ, Schriewer J, et al. 2'-O methylation of the viral mRNA cap evades host restriction by IFIT family members. Nature 2010;468(7322):452–6.
20. Li J, Yang Z, Yu B, et al. Methylation protects miRNAs and siRNAs from a 3'-end uridylation activity in Arabidopsis. Curr Biol 2005;15(16):1501–7.
21. Hu G, Tsai AL, Quiocho FA. Insertion of an N7-methylguanine mRNA cap between two coplanar aromatic residues of a cap-binding protein is fast and selective for a positively charged cap. J Biol Chem 2003;278(51):51515–20.
22. Dominissini D, Nachtergaele S, Moshitch-Moshkovitz S, et al. The dynamic N(1)-methyladenosine methylome in eukaryotic messenger RNA. Nature 2016;530(7591):441–6.
23. Davis FF, Allen FW. Ribonucleic acids from yeast which contain a fifth nucleotide. J Biol Chem 1957;227(2):907–15.
24. Karijolich J, Yu YT. Converting nonsense codons into sense codons by targeted pseudouridylation. Nature 2011;474(7351):395–8.
25. Schwartz S, Bernstein DA, Mumbach MR, et al. Transcriptome-wide mapping reveals widespread dynamic-regulated pseudouridylation of ncRNA and mRNA. Cell 2014;159:148–62.
26. Carlile TM, Rojas-Duran MF, Zinshteyn B, et al. Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells. Nature 2014;515(7525):143–6.
27. Globisch D, Munzel M, Muller M, et al. Tissue distribution of 5-hydroxymethylcytosine and search for active demethylation intermediates. PLoS One 2010;5(12):e15367.
28. Freudenberg JM, Ghosh S, Lackford BL, et al. Acute depletion of Tet1-dependent 5-hydroxymethylcytosine levels impairs LIF/Stat3 signaling and results in loss of embryonic stem cell identity. Nucleic Acids Res 2012;40(8):3364–77.
29. Tajaddod M, Jantsch MF, Licht K. The dynamic epitranscriptome: A to I editing modulates genetic information. Chromosoma 2016;125(1):51–63.
30. Melcher T, Maas S, Herb A, et al. A mammalian RNA editing enzyme. Nature 1996;379(6564):460–4.
31. Bazak L, Haviv A, Barak M, et al. A-to-I RNA editing occurs at over a hundred million genomic sites, located in a majority of human genes. Genome Res 2014;24(3):365–76.
32. Li JB, Levanon EY, Yoon JK, et al. Genome-wide identification of human RNA editing sites by parallel DNA capturing and sequencing. Science 2009;324(5931):1210–3.
33. Levanon EY, Eisenberg E, Yelin R, et al. Systematic identification of abundant A-to-I editing sites in the human transcriptome. Nat Biotechnol 2004;22(8):1001–5.
34. Vitali P, Scadden AD. Double-stranded RNAs containing multiple IU pairs are sufficient to suppress interferon induction and apoptosis. Nat Struct Mol Biol 2010;17(99):1043–50.
35. Mannion NM, Greenwood SM, Young R, et al. The RNA-editing enzyme ADAR1 controls innate immune responses to RNA. Cell Rep 2014;9(4):1482–94.
36. Liu N, Pan T. RNA epigenetics. Transl Res 2015;165(1):28–35.
37. Blanco S, Frye M. Role of RNA methyltransferases in tissue renewal and pathology. Curr Opin Cell Biol 2014;31:1–7.
38. Liu N, Dai Q, Zheng G, et al. N(6)-methyladenosine-dependent RNA structural switches regulate RNA-protein interactions. Nature 2015;518(7540):560–4.
39. Geula S, Moshitch-Moshkovitz S, Dominissini D, et al. Stem cells. m6A mRNA methylation facilitates resolution of naive pluripotency toward differentiation. Science 2015;347(6225):1002–6.
40. Fustin J-M, Doi M, Yamaguchi Y, et al. RNA-methylation-dependent RNA processing controls the speed of the circadian clock. Cell 2013;155(4):793–806.
41. Xiang Y, Laurent B, Hsu CH, et al. RNA m6A methylation regulates the ultraviolet-induced DNA damage response. Nature 2017;543(7646):573–6.
42. Haussmann IU, Bodi Z, Sanchez-Moran E, et al. m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination. Nature 2016;540(7632):301–4.
43. Kennedy EM, Bogerd HP, Kornepati AV, et al. Posttranscriptional m(6)A editing of HIV-1 mRNAs enhances viral gene expression. Cell Host Microbe 2016;19(5):675–85.
44. Zhou J, Wan J, Gao X, et al. Dynamic m(6)A mRNA methylation directs translational control of heat shock response. Nature 2015;526(7574):591–4.
45. Schaefer M, Pollex T, Hanna K, et al. RNA methylation by Dnmt2 protects transfer RNAs against stress-induced cleavage. Genes Dev 2010;24(15):1590–5.
46. Topisirovic I, Svitkin YV, Sonenberg N, et al. Cap and cap-binding proteins in the control of gene expression. Wiley Interdiscip Rev RNA 2011;2(2):277–98.
47. Cabili MN, Trapnell C, Goff L, et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev 2011;25(18):1915–27.
48. Canaani D, Kahana C, Lavi S, et al. Identification and mapping of N6-methyladenosine containing sequences in simian virus 40 RNA. Nucleic Acids Res 1979;6(8):2879–99.
49. Hashimoto SI, Green M. Multiple methylated cap sequences in adenovirus type 2 early mRNA. J Virol 1976;20(2):425–35.
50. Beemon K, Keith J. Localization of N6-methyladenosine in the Rous sarcoma virus genome. J Mol Biol 1977;113(1):165–79.
51. Cotney J, McKay SE, Shadel GS. Elucidation of separate, but collaborative functions of the rRNA methyltransferase-related human mitochondrial transcription factors B1 and B2 in mitochondrial biogenesis reveals new insight into maternally inherited deafness. Hum Mol Genet 2009;18(14):2670–82.
52. Chen M, Manley JL. Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. Nat Rev Mol Cell Biol 2009;10(11):741–54.


53. Zhao X, Yang Y, Sun BF, et al. FTO-dependent demethylation of N6-methyladenosine regulates mRNA splicing and is required for adipogenesis. Cell Res 2014;24(12):1403–19.
54. Wang X, Zhao BS, Roundtree IA, et al. N(6)-methyladenosine modulates messenger RNA translation efficiency. Cell 2015;161(6):1388–99.
55. Meyer KD, Patil DP, Zhou J, et al. 5’ UTR m(6)A promotes cap-independent translation. Cell 2015;163(4):999–1010.
56. Slobodin B, Han R, Calderone V, et al. Transcription impacts the efficiency of mRNA translation via co-transcriptional N6-adenosine methylation. Cell 2017;169(2):326–37.e312.
57. Yang Y, Fan X, Mao M, et al. Extensive translation of circular RNAs driven by N6-methyladenosine. Cell Res 2017;27(5):626–41.
58. Thomson JA, Itskovitz-Eldor J, Shapiro SS, et al. Embryonic stem cell lines derived from human blastocysts. Science 1998;282(5391):1145–7.
59. Batista PJ, Molinie B, Wang J, et al. m(6)A RNA modification controls cell fate transition in mammalian embryonic stem cells. Cell Stem Cell 2014;15(6):707–19.
60. Zhao BS, Wang X, Beadell AV, et al. m6A-dependent maternal mRNA clearance facilitates zebrafish maternal-to-zygotic transition. Nature 2017;542(7642):475–8.
61. Ciccia A, Elledge SJ. The DNA damage response: making it safe to play with knives. Mol Cell 2010;40(2):179–204.
62. Sun Y, Yang Y, Shen H, et al. iTRAQ-based chromatin proteomic screen reveals CHD4-dependent recruitment of MBD2 to sites of DNA damage. Biochem Biophys Res Commun 2016;471(1):142–8.
63. Wu G, Xiao M, Yang C, et al. U2 snRNA is inducibly pseudouridylated at novel sites by Pus7p and snR81 RNP. Embo J 2011;30(1):79–89.
64. Blanco S, Dietmann S, Flores JV, et al. Aberrant methylation of tRNAs links cellular stress to neuro-developmental disorders. Embo J 2014;33(18):2020–39.
65. Meyer KD, Saletore Y, Zumbo P, et al. Comprehensive analysis of mRNA methylation reveals enrichment in 3’ UTRs and near stop codons. Cell 2012;149(7):1635–46.
66. Lence T, Akhtar J, Bayer M, et al. m6A modulates neuronal functions and sex determination in Drosophila. Nature 2016;540(7632):242–7.
67. Gokhale NS, Horner SM. RNA modifications go viral. PLoS Pathog 2017;13(3):e1006188.
68. Lavi S, Shatkin AJ. Methylated simian virus 40-specific RNA from nuclei and cytoplasm of infected BSC-1 cells. Proc Natl Acad Sci USA 1975;72(6):2012–6.
69. Lichinchi G, Gao S, Saletore Y, et al. Dynamics of the human and viral m(6)A RNA methylomes during HIV-1 infection of T cells. Nat Microbiol 2016;1(4):16011.
70. Tirumuru N, Zhao BS, Lu W, et al. N(6)-methyladenosine of HIV-1 RNA regulates viral infection and HIV-1 Gag protein expression. Elife 2016;5:e15528.
71. Gokhale NS, McIntyre AB, McFadden MJ, et al. N6-methyladenosine in Flaviviridae viral RNA genomes regulates infection. Cell Host Microbe 2016;20(5):654–65.
72. Lichinchi G, Zhao BS, Wu Y, et al. Dynamics of human and viral RNA methylation during zika virus infection. Cell Host Microbe 2016;20(5):666–73.
73. Ye F, Chen ER, Nilsen TW, Longnecker RM. Kaposi’s sarcoma-associated herpesvirus utilizes and manipulates RNA N6-adenosine methylation to promote lytic replication. J Virol 2017;91(16):e00466-17.
74. Dina C, Meyre D, Gallina S, et al. Variation in FTO contributes to childhood obesity and severe adult obesity. Nat Genet 2007;39(6):724–6.
75. Yang L, Ma Y, Han W, et al. Proteinase-activated receptor 2 promotes cancer cell migration through RNA methylation-mediated repression of miR-125b. J Biol Chem 2015;290(44):26627–37.
76. Elkashef SM, Lin AP, Myers J, et al. IDH mutation, competitive inhibition of FTO, and RNA methylation. Cancer Cell 2017;31(5):619–20.
77. van Es MA, Schelhaas HJ, van Vught PW, et al. Angiogenin variants in Parkinson disease and amyotrophic lateral sclerosis. Ann Neurol 2011;70:964–73.
78. Guy MP, Shaw M, Weiner CL, et al. Defects in tRNA anticodon loop 2’-O-methylation are implicated in nonsyndromic X-linked intellectual disability due to mutations in FTSJ1. Hum Mutat 2015;36(12):1176–87.
79. Jiang Q, Crews LA, Barrett CL, et al. ADAR1 promotes malignant progenitor reprogramming in chronic myeloid leukemia. Proc Natl Acad Sci USA 2013;110(3):1041–6.
80. Chen L, Li Y, Lin CH, et al. Recoding RNA editing of AZIN1 predisposes to hepatocellular carcinoma. Nat Med 2013;19(2):209–16.
81. Gaisler-Salomon I, Kravitz E, Feiler Y, et al. Hippocampus-specific deficiency in RNA editing of GluA2 in Alzheimer’s disease. Neurobiol Aging 2014;35(8):1785–91.
82. Klungland A, Dahl JA. Dynamic RNA modifications in disease. Curr Opin Genet Dev 2014;26:47–52.
83. Frayling TM, Timpson NJ, Weedon MN, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 2007;316(5826):889–94.
84. Scuteri A, Sanna S, Chen WM, et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet 2007;3(7):e115.
85. Ropers HH. Genetics of intellectual disability. Curr Opin Genet Dev 2008;18(3):241–50.
86. Skorupa A, King MA, Aparicio IM, et al. Motoneurons secrete angiogenin to induce RNA cleavage in astroglia. J Neurosci 2012;32(15):5024–38.
87. Ramaswami G, Li JB. RADAR: a rigorously annotated database of A-to-I RNA editing. Nucleic Acids Res 2014;42(D1):D109–13.
88. Liu H, Flores MA, Meng J, et al. MeT-DB: a database of transcriptome methylation in mammalian cells. Nucleic Acids Res 2015;43:D197–203.
89. Sun WJ, Li JH, Liu S, et al. RMBase: a resource for decoding the landscape of RNA modifications from high-throughput sequencing data. Nucleic Acids Res 2016;44(D1):D259–65.
90. Picardi E, D’Erchia AM, Lo Giudice C, et al. REDIportal: a comprehensive database of A-to-I RNA editing events in humans. Nucleic Acids Res 2017;45(D1):D750–7.
91. Ryvkin P, Leung YY, Silverman IM, et al. HAMR: high-throughput annotation of modified ribonucleotides. RNA 2013;19(12):1684–92.
92. Chen W, Tran H, Liang Z, et al. Identification and analysis of the N(6)-methyladenosine in the Saccharomyces cerevisiae transcriptome. Sci Rep 2015;5(1):13859.
93. Chen W, Feng P, Ding H, et al. iRNA-Methyl: identifying N(6)-methyladenosine sites using pseudo nucleotide composition. Anal Biochem 2015;490:26–33.
94. Li YH, Zhang G, Cui Q. PPUS: a web server to predict PUS-specific pseudouridine sites. Bioinformatics 2015;31(20):3362–4.
95. Xiang S, Yan Z, Liu K, et al. AthMethPre: a web server for the prediction and query of mRNA m6A sites in Arabidopsis thaliana. Mol Biosyst 2016;12(11):3333–7.


96. Xiang S, Liu K, Yan Z, et al. RNAMethPre: a web server for the prediction and query of mRNA m6A sites. PLoS One 2016;11(10):e0162707.
97. Chen W, Feng P, Ding H, et al. Identifying N6-methyladenosine sites in the Arabidopsis thaliana transcriptome. Mol Genet Genomics 2016;291(6):2225–9.
98. Liu Z, Xiao X, Yu DJ, et al. pRNAm-PC: predicting N(6)-methyladenosine sites in RNA sequences via physical-chemical properties. Anal Biochem 2016;497:60–7.
99. Zhou Y, Zeng P, Li YH, et al. SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Res 2016;44(10):e91.
100. Chen W, Feng P, Tang H, et al. RAMPred: identifying the N(1)-methyladenosine sites in eukaryotic transcriptomes. Sci Rep 2016;6(1):31080.
101. Chen W, Tang H, Ye J, et al. iRNA-PseU: identifying RNA pseudouridine sites. Mol Ther Nucleic Acids 2016;5:e332.
102. Chen W, Tang H, Lin H. MethyRNA: a web server for identification of N6-methyladenosine sites. J Biomol Struct Dyn 2017;35(3):683–7.
103. Chen W, Xing P, Zou Q. Detecting N6-methyladenosine sites from RNA transcriptomes using ensemble support vector machines. Sci Rep 2017;7:40242.
104. Xing P, Su R, Guo F, et al. Identifying N6-methyladenosine sites using multi-interval nucleotide pair position specificity and support vector machine. Sci Rep 2017;7:46757.
105. Chen W, Feng P, Yang H, et al. iRNA-AI: identifying the adenosine to inosine editing sites in RNA sequences. Oncotarget 2017;8:4208–17.
106. Feng P, Ding H, Yang H, et al. iRNA-PseColl: identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into PseKNC. Mol Ther Nucleic Acids 2017;7:155–63.
107. Li Y, Song S, Li C, et al. MeRIP-PF: an easy-to-use pipeline for high-resolution peak-finding in MeRIP-Seq data. Genomics Proteomics Bioinformatics 2013;11(1):72–5.
108. Meng J, Lu Z, Liu H, et al. A protocol for RNA methylation differential analysis with MeRIP-Seq data and exomePeak R/Bioconductor package. Methods 2014;69(3):274–81.
109. Rieder D, Amort T, Kugler E, et al. meRanTK: methylated RNA analysis ToolKit. Bioinformatics 2016;32(5):782–5.
110. Liu L, Zhang SW, Gao F, et al. DRME: count-based differential RNA methylation analysis at small sample size scenario. Anal Biochem 2016;499:15–23.
111. Cui X, Meng J, Zhang S, et al. A novel algorithm for calling mRNA m6A peaks by modeling biological variances in MeRIP-seq data. Bioinformatics 2016;32(12):i378–85.
112. Yan Z, Liu K, Xiang S, et al. txCoords: a novel web application for transcriptomic peak re-mapping. IEEE/ACM Trans Comput Biol Bioinform 2017;14(3):746–8.
113. Hauenschild R, Werner S, Tserovski L, et al. CoverageAnalyzer (CAn): a tool for inspection of modification signatures in RNA sequencing profiles. Biomolecules 2016;6(4):42.
114. Olarerin-George AO, Jaffrey SR. MetaPlotR: a Perl/R pipeline for plotting metagenes of nucleotide modifications and other transcriptomic sites. Bioinformatics 2017;33(10):1563–4.
115. Uyar B, Yusuf D, Wurmus R, et al. RCAS: an RNA centric annotation system for transcriptome-wide regions of interest. Nucleic Acids Res 2017;45(10):e91.
116. Liu N, Zhou KI, Parisien M, et al. N6-methyladenosine alters RNA structure to regulate binding of a low-complexity protein. Nucleic Acids Res 2017;45(10):6051–63.
117. Liu N, Pan T. N6-methyladenosine-encoded epitranscriptomics. Nat Struct Mol Biol 2016;23(2):98–102.
118. Fu Y, Dominissini D, Rechavi G, et al. Gene expression regulation mediated through reversible m6A RNA methylation. Nat Rev Genet 2014;15(5):293–306.
119. Jia CZ, Zhang JJ, Gu WZ. RNA-MethylPred: a high-accuracy predictor to identify N6-methyladenosine in RNA. Anal Biochem 2016;510:72–5.
120. Meng J, Cui X, Rao MK, et al. Exome-based analysis for RNA epigenome sequencing data. Bioinformatics 2013;29(12):1565–7.
121. Antanaviciute A, Baquero-Perez B, Watson CM, et al. m6aViewer: software for the detection, analysis and visualization of N6-methyl-adenosine peaks from m6A-seq/ME-RIP sequencing data. RNA 2017;23:1493–501.
122. He C. Grand challenge commentary: RNA epigenetics? Nat Chem Biol 2010;6(12):863–5.
123. Elif Erson-Bensan A, Begik O. m6A modification and implications for microRNAs. Microrna 2017;6:97–101.
124. Amort T, Rieder D, Wille A, et al. Distinct 5-methylcytosine profiles in poly(A) RNA from mouse embryonic stem cells and brain. Genome Biol 2017;18(1):1.
125. Bhat SS, Jarmolowski A, Szweykowska-Kulinska Z. MicroRNA biogenesis: Epigenetic modifications as another layer of complexity in the microRNA expression regulation. Acta Biochim Pol 2016;63(4):717–23.
126. Carmichael SL, Ma C, Choudhry S, et al. Hypospadias and genes related to genital tubercle and early urethral development. J Urol 2013;190(5):1884–92.
127. Bansal H, Yihua Q, Iyer SP, et al. WTAP is a novel oncogenic protein in acute myeloid leukemia. Leukemia 2014;28(12):1171–4.
128. Jo HJ, Shim HE, Han ME, et al. WTAP regulates migration and invasion of cholangiocarcinoma cells. J Gastroenterol 2013;48(11):1271–82.
129. Shahid SU, Shabana, Rehman A, et al. Role of a common variant of Fat Mass and Obesity associated (FTO) gene in obesity and coronary artery disease in subjects from Punjab, Pakistan: a case control study. Lipids Health Dis 2016;15:29.
130. Xiao S, Zeng X, Fan Y, et al. Gene polymorphism association with type 2 diabetes and related gene-gene and gene-environment interactions in a Uyghur population. Med Sci Monit 2016;22:474–87.
131. Tan A, Dang Y, Chen G, et al. Overexpression of the fat mass and obesity associated gene (FTO) in breast cancer and its clinical implications. Int J Clin Exp Pathol 2015;8:13405–10.
132. Landfors M, Nakken S, Fusser M, et al. Sequencing of FTO and ALKBH5 in men undergoing infertility work-up identifies an infertility-associated variant and two missense mutations. Fertil Steril 2016;105(5):1170–9.e1175.
133. Du T, Rao S, Wu L, et al. An association study of the m6A genes with major depressive disorder in Chinese Han population. J Affect Disord 2015;183:279–86.
134. Frye M, Dragoni I, Chin SF, et al. Genomic gain of 5p15 leads to over-expression of Misu (NSUN2) in breast cancer. Cancer Lett 2010;289(1):71–80.
135. Abbasi-Moheb L, Mertel S, Gonsior M, et al. Mutations in NSUN2 cause autosomal-recessive intellectual disability. Am J Hum Genet 2012;90(5):847–55.
136. Heiss NS, Knight SW, Vulliamy TJ, et al. X-linked dyskeratosis congenita is caused by mutations in a highly conserved gene with putative nucleolar functions. Nat Genet 1998;19(1):32–8.
137. Bellodi C, Krasnykh O, Haynes N, et al. Loss of function of the tumor suppressor DKC1 perturbs p27 translation control


and contributes to pituitary tumorigenesis. Cancer Res 2010;70(14):6026–35.
138. Sieron P, Hader C, Hatina J, et al. DKC1 overexpression associated with prostate cancer progression. Br J Cancer 2009;101(8):1410–6.
139. Patton JR, Bykhovskaya Y, Mengesha E, et al. Mitochondrial myopathy and sideroblastic anemia (MLASA): missense mutation in the pseudouridine synthase 1 (PUS1) gene is associated with the loss of tRNA pseudouridylation. J Biol Chem 2005;280(20):19823–8.
140. Yang Z, Li J, Feng G, et al. MicroRNA-145 modulates N6-methyladenosine levels by targeting the 3’-untranslated mRNA region of the N6-methyladenosine binding YTH domain family 2 protein. J Biol Chem 2017;292(9):3614–23.
141. Nemlich Y, Greenberg E, Ortenberg R, et al. MicroRNA-mediated loss of ADAR1 in metastatic melanoma promotes tumor growth. J Clin Invest 2013;123(6):2703–18.
142. Chen T, Hao YJ, Zhang Y, et al. m(6)A RNA methylation is regulated by microRNAs and promotes reprogramming to pluripotency. Cell Stem Cell 2015;16(3):289–301.
143. Qin YR, Qiao JJ, Chan TH, et al. Adenosine-to-inosine RNA editing mediated by ADARs in esophageal squamous cell carcinoma. Cancer Res 2014;74(3):840–51.
144. Tomaselli S, Galeano F, Alon S, et al. Modulation of microRNA editing, expression and processing by ADAR2 deaminase in glioblastoma. Genome Biol 2015;16(1):5.
145. Berulava T, Rahmann S, Rademacher K, et al. N6-adenosine methylation in MiRNAs. PLoS One 2015;10(2):e0118438.
