fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCBB.2019.2910851, IEEE/ACM
Transactions on Computational Biology and Bioinformatics
IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2018 1
Abstract—Cervical cancer is a leading malignancy throughout the world. The molecular processes and biomarkers leading to tumor
progression in cervical cancer are either unknown or only partially understood. An increasing number of studies have shown that
microRNAs (miRNAs) play an important role in tumorigenesis, so understanding the regulatory mechanisms of miRNAs in gene-regulatory networks
will help elucidate the complex biological processes that occur during malignancy. Functional genomics data provide opportunities to
study aberrant microRNA-messenger RNA (miRNA-mRNA) interactions. Identification of miRNA-mRNA regulatory modules will aid in
deciphering the aberrant transcriptional regulatory network in cervical cancer, but it is computationally challenging. In this regard, an
algorithm termed relevant and functionally consistent miRNA-mRNA modules (RFCM3) is proposed. It integrates miRNA and
mRNA expression data of cervical cancer to identify potential miRNA-mRNA modules. It selects a set of miRNA-mRNA modules
by maximizing the relation of mRNAs with a miRNA and the functional similarity between the selected mRNAs. Later, using the knowledge of
the miRNA-miRNA synergistic network, different modules are fused, and finally a set of modules is generated, each containing several miRNAs
as well as mRNAs. This type of module explains the underlying biological pathways involving multiple miRNAs and mRNAs. The
effectiveness of the proposed approach over existing methods has been demonstrated on miRNA and mRNA expression data
of cervical cancer with respect to enrichment analyses and other standard metrics. The prognostic value of the genes in a module
with respect to cervical cancer is also demonstrated. The approach was found to generate more robust, integrated, and functionally
enriched miRNA-mRNA modules in cervical cancer.
Index Terms—Cervical cancer, Biomarkers, Regulatory network, Modules, Algorithm, Functionally enriched.
1 INTRODUCTION
1545-5963 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
obtained from computational prediction may increase and that requires additional steps to remove them. To regulate their target genes, miRNAs bind to the 3′ UTR regions of those genes; keeping this in view, several methods have been developed that focus on identifying conserved sequence regions between miRNAs and mRNAs [13]. However, sequence-based approaches cannot identify functional changes in genes and also generate many false positive binding sites. Integrated miRNA and related mRNA analyses in different types of cancer have been the focus of many studies [14]–[20]. To identify potential interactions between miRNAs, mRNAs, and pathways involved in cancer development, many studies used large-scale miRNA and mRNA expression profile datasets [14]–[16].

Signaling pathways or biological functions associated with a specific cancer can be significantly altered due to alterations in miRNA-mRNA relationships. With the advances in high-throughput technologies, large-scale miRNA and mRNA expression datasets from the same set of patients have become available due to collaborative efforts such as The Cancer Genome Atlas (TCGA) project [21], [22]. Proper inspection and biological analysis of the expression data of both types of biomarkers, namely miRNA and mRNA, can be helpful in the identification of disease-associated miRNA-mRNA regulatory modules (MMRMs).

Although a few algorithms for finding miRNA-mRNA modules have been proposed, improvements are still needed. Most of these approaches need data integration at several steps in the form of gene-gene interactions, miRNA-gene interactions and transcription factor-gene interactions derived from knowledge databases. Mirsynergy [23] uses TargetScanHuman 6.2 [24] for the miRNA-target site matrix, and TRANSFAC [25] and BioGrid [26] for gene-gene interactions including transcription factors and protein-protein interactions, respectively. Similarly, SNMNMF [14] uses the MicroCosm website (http://www.ebi.ac.uk/enright-srv/microcosm/htdocs/targets/v5/) for miRNA-gene interactions and TRANSFAC [25] for DNA-protein interactions, and combines protein-protein interaction data obtained from Bossi and Lehner (2009) [27]. MAGIA [28] is a web-based tool, which also uses the TargetScan/PicTar/PITA [24], [29], [30] databases for incorporating miRNA-target site interactions in its different algorithms. The above-mentioned approaches have high time complexity, and they are executed in multiple steps rather than in an automatic fashion.

In this regard, this paper presents a mutual information based method, termed relevant and functionally consistent miRNA-mRNA module (RFCM3), for selection of important MMRMs in cervical cancer. The proposed method selects regulatory modules in an automatic manner, which requires less time. Module selection is done by maximizing the relatedness between a miRNA and genes as well as the functional similarity between the genes of that module. Later, the miRNA functional similarity matrix (MISIM data for cervical cancer [31]) is used. This helps the proposed algorithm in producing significant modules having multiple miRNAs and mRNAs. Implementation of the proposed algorithm could help in identifying functional relationships between these biomarkers, which will unravel key mechanisms involved in cervical cancer pathogenesis.

2 DATA SETS USED
The miRNA and mRNA expression data sets of cervical cancer are downloaded from TCGA (https://tcga-data.nci.nih.gov/tcga/). The mRNA expression dataset, TCGA CESC exp HiSeqV2 (2015-02-24), contains the expression of 20,130 genes in 308 samples, and the miRNA expression dataset, TCGA CESC miRNA-HiSeq (2015-02-24), contains the expression of 333 miRNAs in the same set of 308 samples. For incorporating miRNA-miRNA synergistic interaction information, the pairwise MISIM functional similarity of 270 miRNAs involved in cervical cancer is used [31].

3 PROPOSED MIRNA-MRNA MODULE IDENTIFICATION ALGORITHM
Various studies have suggested that gene expression in both plants and animals is essentially regulated by one or sometimes more than one miRNA. miRNAs target mRNAs to regulate biological processes. In gene regulatory networks both miRNAs and mRNAs interact. Studying the combined relationships between groups of miRNAs and groups of mRNAs reveals more in-depth information about gene regulation and cell functions than studying individual miRNA and mRNA relationships. Various studies have been conducted to identify MMRMs with a common aim to elucidate miRNA-mRNA regulatory relationships. However, the combined relationships between a group of miRNAs and the group of targeted mRNAs are not considered by existing methods in the process of discovering MMRMs. Accordingly, a method is required that can discover a regulatory module containing multiple miRNAs and mRNAs. In this paper, a mutual information based approach is proposed to identify regulatory modules containing multiple miRNAs and mRNAs that share a certain level of relationship. A diagrammatic representation of the workflow of the proposed algorithm RFCM3 is shown in Figure 1.

3.1 Relevant and Functionally Consistent miRNA-mRNA Module
Let $X = \{X_1, \cdots, X_i, \cdots, X_j, \cdots, X_p\}$ denote a set of miRNAs and $Y = \{Y_1, \cdots, Y_i, \cdots, Y_j, \cdots, Y_q\}$ denote a set of mRNAs such that $X \cap Y = \emptyset$. Given datasets $D_X$ and $D_Y$ having $n$ matching miRNA and mRNA expression samples, the objective of the current study is to identify $X^* \subset X$ and $Y^* \subset Y$ such that $X^*$ and $Y^*$ are related, that is, the miRNAs in $X^*$ collaboratively interact with the mRNAs in $Y^*$ and vice versa.

For the discovery of miRNA-mRNA regulatory modules, a two-stage approach, namely Relevant and Functionally Consistent miRNA-mRNA Module (RFCM3), is proposed in this paper. Two measures, a mutual information based algorithm and miRNA-miRNA synergistic interaction, are incorporated in the two stages, respectively. Next, the mutual information based approach is described first, and then the miRNA-miRNA synergistic interaction information and its integration in the current framework are presented.

3.1.1 Mutual Information for Identifying Relationships Between Multiple mRNAs and a miRNA
This section describes the method that is used to generate a star-shaped module containing a single miRNA
of the mRNA with respect to class label or miRNA. Also, it indicates the dependency of the class label $X$ on an attribute. Here, the relevance of the mRNA $Y_i$ with respect to the class labels/miRNAs $X$ is defined as $\hat{f}(Y_i, X)$, whereas $\tilde{f}(Y_i, Y_j)$ is defined as the functional similarity of the mRNA $Y_j$ with respect to the mRNA $Y_i$. In this study, for calculation of both relevance and functional similarity

As a result of that, $Y_j \in \Theta$ and $C = C \setminus Y_j$.

Mutual information is used to compute both the relevance and the functional similarity of an mRNA. The relevance and functional similarity of an mRNA are calculated using (1) and (3), respectively.
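The use of mutual information for both quantities can be illustrated with a minimal sketch. It is not the authors' C implementation: the function names `discretize` and `mutual_information` are hypothetical, the population standard deviation and base-2 logarithm are assumptions, and the three-level discretization (mean plus or minus one standard deviation) follows the rule stated in this section. Relevance would correspond to the mutual information between an mRNA profile and a miRNA profile, and functional similarity to the mutual information between two mRNA profiles.

```python
import math
from collections import Counter


def discretize(values):
    """Map continuous expression values to -1, 0 or 1 using mean +/- one std.

    Assumption: the population standard deviation is used.
    """
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    levels = []
    for v in values:
        if v > mu + sigma:
            levels.append(1)    # over-expression
        elif v < mu - sigma:
            levels.append(-1)   # under-expression
        else:
            levels.append(0)    # baseline
    return levels


def mutual_information(a, b):
    """Mutual information (in bits) between two equal-length discrete profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same samples")
    n = len(a)
    count_a = Counter(a)            # marginal counts of the first profile
    count_b = Counter(b)            # marginal counts of the second profile
    count_ab = Counter(zip(a, b))   # joint counts
    mi = 0.0
    for (x, y), c in count_ab.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (count_a[x] * count_b[y]))
    return mi
```

For example, `mutual_information(discretize(mrna_profile), discretize(mirna_profile))` would give a relevance-style score for two hypothetical expression vectors measured over the same samples.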
The expression values of both miRNAs and mRNAs in the data are continuous in nature. They need to be discretized before the relevance of an mRNA with respect to a miRNA or clinical outcome can be calculated using mutual information. The marginal probabilities and the joint probability are computed from the discretized expression values of an mRNA and a miRNA. These probabilities are later used to compute the mRNA-class/miRNA relevance. Therefore, discretization of continuous-valued miRNAs and mRNAs is a vital step in the current study. The discretization method mentioned in [33] is used. This method discretizes the expression values of a miRNA or mRNA using the mean $\mu$ and standard deviation $\sigma$ computed over the $n$ expression values of that particular miRNA or mRNA. Values larger than $(\mu + \sigma)$ are represented as 1, values between $(\mu - \sigma)$ and $(\mu + \sigma)$ as 0, and values smaller than $(\mu - \sigma)$ as $-1$. These three values correspond to over-expression, baseline, and under-expression of the miRNA or mRNA, respectively.

3.1.2 Incorporation of miRNA-miRNA Synergistic Interaction Information
From the expression data of 20,130 genes and 333 miRNAs in 308 samples of cervical cancer, 333 miRNA-mRNA regulatory modules were generated by maximizing the relatedness between miRNAs and genes as well as the functional similarity between the genes. Each module contained one miRNA and a maximum of fifty interacting genes. Further, to find interactions between the miRNAs of different star-shaped modules, miRNA similarity matrix (MISIM) [31] information was incorporated in the algorithm. This helped in generating modules that are highly correlated to cervical cancer and contain more than one miRNA. The MISIM matrix is generated based on the concept that genes with similar functions are often associated with similar diseases, and the relationship of different diseases can be represented by a directed acyclic graph (DAG) structure; this is also true for miRNAs. Therefore, it is feasible to infer miRNA functional similarity by measuring the similarity of their associated disease DAGs. Moreover, using MISIM information, insignificant modules were also dropped from further analyses. The MISIM is a matrix containing miRNA-miRNA interaction similarity scores. The miRNAs in the matrix are related to cervical cancer and the score is normalized between 0 and 1, where 0 means no interaction and 1 means highly interactive.

4 EXPERIMENTAL SETUP
In the present research work, the performance of the proposed relevant and functionally consistent miRNA-mRNA module (RFCM3) algorithm has been compared with that of Mirsynergy [23], SNMNMF [14], weighted correlation network analysis (WGCNA) [34], and MAGIA (PITA algorithm) [28]. The STRING database [21] was used to generate the gene-gene interaction (GGI) data matrix H and the gene-gene interaction network for Mirsynergy and SNMNMF, respectively. The TransmiR database [35] was used to generate the miRNA-target site matrix W and the miRNA-gene interactions for Mirsynergy and SNMNMF, respectively. The major metrics for evaluating the performance of the different algorithms are functional consistency, functional enrichment, and survival analysis. Each metric is described where it is introduced in the manuscript. A literature study of some of the interactions is also presented. The proposed RFCM3 algorithm is implemented in C and run in a Linux environment on a machine with an Intel Core i7-6700 CPU, 3.6 GHz, 6 MB cache, and 4 GB RAM. The source code of the proposed algorithm and the supplementary information are available at http://home.iitj.ac.in/~sushmitapaul/CBL/softwares.html.

4.1 Optimum Value of Threshold $\delta$
The proposed algorithm iteratively computes the functional similarity of each mRNA with respect to the already selected mRNAs and selects the next mRNA for the module. If an mRNA has a functional similarity value with an already selected mRNA lower than $\delta$, it is not considered further. The value of $\delta$ was varied from 0.1 to 0.5. It was observed that at 0.15 the algorithm generated modules with the maximum KPES (KEGG Pathway Enrichment Score) value, as shown in Figure 2. At this value of $\delta$ the algorithm was able to generate the maximum number of modules enriched by significant KEGG pathway terms. Therefore, the optimal value of $\delta$ was set to 0.15. KPES is used here to evaluate the functional consistency of the modules. The higher the value of KPES, the more significant the modules are. It is calculated by multiplying the total number of enriched modules ($M$) by the sum of $-\log_{10}(\mathrm{FDR})$ over all significant KEGG pathway terms, divided by the total number of significant KEGG pathway terms ($K$). FDR (false discovery rate) is the ratio of the number of false positive results to the number of total positive test results (expected proportion of type I error). In the current study the DAVID annotation tool [22], [36] was used for enrichment analysis. The expression of KPES is as follows:

$$\mathrm{KPES} = \frac{M}{K} \sum_{k=1}^{K} -\log_{10}(\mathrm{FDR}_k) \qquad (6)$$

4.2 Optimum Value of MISIM Threshold
The miRNA-miRNA similarity value in the MISIM matrix varies from 0 to 1. The choice of cut-off value plays an important role in fusing modules and hence has a direct influence on the performance of the proposed algorithm. In this study a range of score values (0.7 to 1) was selected as the cut-off for merging multiple miRNA-mRNA modules generated by RFCM3. Modules whose miRNAs shared an interaction similarity score higher than the cut-off value were fused. As a result, insignificant modules having low interaction similarity scores were filtered out. Finally, multiple modules were generated, each containing more than one miRNA and multiple genes. Table 1 presents the number of modules generated at different MISIM cut-off score values. The miRNA-mRNA networks were generated using Cytoscape version 3.6.1 [37].

Later, with the functional enrichment analysis, the optimum value of the cut-off score was decided to be ≥ 0.7, as
TABLE 1
Modules Description at Different Cut-off Values

Fig. 3. miRNA interactions across all modules at cut-off ≥ 0.7 generated by RFCM3
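The fusion of single-miRNA modules at a MISIM cut-off can be pictured with the following sketch. It rests on assumptions that the text does not confirm: fusion is treated as transitive (connected components of the miRNA graph whose edges have similarity at or above the cut-off), the fused gene set is the union of the member modules' genes, and a module whose miRNA has no partner above the cut-off is dropped. The function name `fuse_modules` and the input layout are hypothetical.

```python
def fuse_modules(star_modules, misim, cutoff=0.7):
    """Fuse single-miRNA modules whose miRNAs are similar enough.

    star_modules: dict mapping a miRNA name to its set of mRNA names.
    misim: dict mapping (miRNA_a, miRNA_b) pairs to a similarity in [0, 1].
    Returns a sorted list of (miRNA list, mRNA list) tuples.
    """
    parent = {m: m for m in star_modules}   # union-find forest over miRNAs

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]    # path halving
            m = parent[m]
        return m

    # Join two miRNAs when their similarity reaches the cut-off.
    for (a, b), score in misim.items():
        if a in parent and b in parent and a != b and score >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for m in star_modules:
        groups.setdefault(find(m), []).append(m)

    fused = []
    for members in groups.values():
        if len(members) < 2:
            continue                         # assumption: unpaired modules are dropped
        genes = set().union(*(star_modules[m] for m in members))
        fused.append((sorted(members), sorted(genes)))
    return sorted(fused)
```

With hypothetical inputs, two modules whose miRNAs score 0.8 would be merged into one module holding both miRNAs and the union of their genes, while a third module whose best score is 0.3 would be filtered out at the 0.7 cut-off.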
Note: M: module number; Avg.miR and Avg.mR: average miRNA and mRNA per module; PPI: percentage of modules having PPI enrichment p-value < 0.05; KPES: KEGG pathway enrichment score; Time: number of milliseconds.

ergy are very close, all the modules generated by both the methods have two or more than two miRNAs. Moreover, the maximum number of miRNAs in a single module generated
by RFCM3 is 14. In contrast, the largest module obtained by Mirsynergy contained 13 miRNAs. There is a large difference between the average number of genes per module in RFCM3 and Mirsynergy. Notably, there are considerably fewer genes in the Mirsynergy modules than in the modules identified by RFCM3, which indicates that RFCM3 generates denser networks responsible for a specific biological process or pathway; this is supported by the functional enrichment analysis described later in the manuscript.
Fig. 5. PPINs generated by STRING for the most significant module of RFCM3 (p-value < 1.0E-16)

To further illustrate the interactions among the selected set of genes in a module, protein-protein interaction networks (PPINs) were built from the results generated by the STRING database [21]. STRING builds a connection between two genes/proteins based on strong sequence-based similarity, very frequent co-occurrence in documents, and similar experimental results between two proteins. In this study, experiment-based information has been used to create connections between two proteins/genes. Finally, it generates a network whose significance is assessed by P-value < 0.05. The percentage of modules that generated significant PPI networks is also compared. From the table it is seen that the proposed RFCM3 algorithm generates more significant PPI networks than the other methods. This indicates that the genes present in most of the RFCM3 modules are strongly related to each other and carry out a specific biological function in cervical cancer. The PPI networks of the most significant modules generated by RFCM3 and Mirsynergy are presented in Figures 5 and 6, respectively. As shown in Table 2, KPES is also maximum for the modules generated by RFCM3 (22.7), whereas Mirsynergy, SNMNMF and WGCNA gave negative values, which suggests that the other methods could not identify functionally enriched modules.
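As a small worked illustration of the KPES score compared here, the sketch below follows the verbal definition in Section 4.1: the number of enriched modules M, multiplied by the sum of -log10(FDR) over the K significant KEGG pathway terms, divided by K. The function name `kpes` is hypothetical, and returning 0.0 when no term is significant is an assumption made only to keep the sketch total.

```python
import math


def kpes(num_enriched_modules, fdr_values):
    """KEGG Pathway Enrichment Score: (M / K) * sum of -log10(FDR_k).

    num_enriched_modules: M, the number of enriched modules.
    fdr_values: FDRs of the K significant KEGG pathway terms, each in (0, 1].
    """
    k = len(fdr_values)
    if k == 0:
        return 0.0  # assumption: no significant terms means a zero score
    return (num_enriched_modules / k) * sum(-math.log10(f) for f in fdr_values)
```

For instance, with M = 2 enriched modules and two hypothetical terms with FDR 0.01 and 0.001, the score is (2 / 2) × (2 + 3) = 5.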
As far as the running time of the different algorithms is concerned, as shown in Table 2, RFCM3 consumes less time (2.32E+06 ms) to generate significant modules than Mirsynergy (3.74E+06 ms) and SNMNMF (2.46E+06 ms), whereas WGCNA (2.31E+06 ms) and RFCM3 take nearly the same time. The MAGIA web tool takes much less time (1.40E+05 ms) but is not able to generate any significant module.

Fig. 6. PPINs generated by STRING for the most significant module of Mirsynergy (p-value < 1.81E-05)

4.4 Evaluating Modules by Functional Enrichments
For the biological interpretation of the modules generated by the different algorithms, pathway enrichment analysis of the genes was done using the DAVID Functional Annotation Tool. 14 out of 22 modules generated by RFCM3 and only 8 out of 37 modules generated by Mirsynergy were found to be significantly enriched by KEGG pathway terms related to one of the main cervical cancer pathways, namely the human papillomavirus (HPV) infection pathway (hsa05165), with P-value < 0.05. Figure 7 clearly shows that the gap between the total number of modules generated and the number of enriched modules is much larger for Mirsynergy (8/37) than for RFCM3 (14/22). Although all the modules generated by WGCNA were enriched by pathway terms, none of them was related to hsa05165 or any other cervical cancer related pathway. Similarly, for SNMNMF, only two of the 4 modules were enriched, but the terms were not related to hsa05165 or any other cervical cancer related pathway. In the case of MAGIA, not a single module out of 18 modules was
enriched. Modules generated by MAGIA were not taken into further consideration because none of the modules was enriched. Examining enriched pathway terms exclusive to modules generated by RFCM3 revealed several interesting HPV infection pathway or cervical cancer related terms, p53 signaling pathway, and so forth. Table 3 summarizes the list of significant KEGG pathway terms from modules gen-

[Figure: Total vs. Enriched Modules; y-axis: Number of Modules; legend: Total, Enriched]
Fig. 9. Significant pathways generated by Mirsynergy, SNMNMF, WGCNA, and RFCM3 (x-axis: log10(FDR))

into high risk group (in red curve) and low risk group (in green curve) taking median of average of gene expression values. From the Kaplan-Meier plot (Figure 10), it was found that the gene sets selected by both the proposed method and Mirsynergy were able to separate the two risk groups significantly. From both figures, it can be deduced that the proposed method outperformed Mirsynergy in terms of hazard ratio, p-value of hazard ratio, log-rank test and concordance index. Therefore, the results indicate that the selected genes from the module of RFCM3 are highly correlated with patient survival. Hence, they may be considered as potential prognostic markers.

Fig. 10. Comparison of Kaplan-Meier curves for patients with cervical cancer plotted for combined expression of 20 genes obtained by the Mirsynergy and 242 genes obtained by the RFCM3

ature search along with the help of the miRTarBase database (http://miRTarBase.mbc.nctu.edu.tw/), which contains experimentally validated miRNA-target interactions (MTIs), is used here to biologically validate the miRNA-mRNA interactions obtained by the RFCM3 algorithm. miRTarBase serves as an important repository for experimentally validated MTIs, which are frequently updated by manually surveying research articles. Among all the miRNA-mRNA interactions obtained by the RFCM3 algorithm at the optimum cut-off value of 0.7, many of the MTIs are found to be biologically relevant in different studies. Some of the interactions from different modules are discussed further.

• As suggested in [41], inflammatory molecules may be up-regulated in human intracranial aneurysms in response to a decrease in regulatory miRNAs such as hsa-mir-204, which is down-regulated in such tissues and is validated to target CCR5.
• Using a miRNA-target screening system composed of a self-assembled cell microarray (SAMcell), hsa-miR-204 was identified to regulate Nox2 (CYBB) expression and its downstream products in both human and mouse macrophages [42].
• Elucidation of transcriptome-wide microRNA binding sites in human cardiac tissues by Ago2 HITS-CLIP revealed that hsa-mir-296 targets RPRD2 [43].
• The interaction of hsa-mir-106a with three genes, namely REST, ATXN7L3B, and NR2C2, is supported by the photoactivatable-ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP) method [44], [45], and its interaction with SMG1 is supported by mapping of the human miRNA interactome by CLASH [46].
• Elucidation of transcriptome-wide microRNA binding sites in human cardiac tissues by Ago2 HITS-CLIP revealed that hsa-mir-106b targets PRKCB [43].
• TP53-mediated regulation of AGO2-miRNA interaction represents a mechanism of miRNA regulation in carcinogenesis, which indicates that hsa-mir-93 targets TSKU [47].
• The results of the study [48] suggest that hsa-let-7g may suppress hepatocellular carcinoma (HCC) metastasis partially through targeting COL1A2; moreover, addition of COL1A2 counteracts the inhibitory effect of hsa-let-7g.
• Mapping the human miRNA interactome by CLASH reveals that hsa-let-7b targets KLHL11 and SMG1 by noncanonical binding [46].
• In the study [49], the impact of hsa-mir-34a on pro-apoptotic/anti-apoptotic genes was examined using a PCR array, which revealed BIRC6 as one of its targets.
• Random or site-specific incorporation of photoactivatable nucleoside analogs into RNA in vitro has been used to probe RBP- and RNP-RNA interactions, along with in vivo crosslinking of RNA-protein complexes that were later isolated by immunoprecipitation; in the study [47], this helped to validate the interactions between hsa-mir-20a and RACGAP1, hsa-mir-107 and CCNT1, hsa-mir-144 and ATXN1L, and hsa-mir-130b and ATEN1.
• With the help of PAR-CLIP technology, which allows the direct and transcriptome-wide identification of miRNA targets, the study [50] identified the target sites of all the viral and cellular miRNAs expressed in PEL cell lines and revealed the interaction between hsa-mir-107 and ERN1.
• The technique applied in the study [46] for ligation and sequencing of miRNA-target duplexes associated with human AGO1 reported data sets of more than 18,000 high-confidence miRNA-mRNA interactions, which include some of the interactions obtained by the proposed method, such as the interactions of hsa-mir-222 with three genes, namely RPL8, RPL12 and RPS2, and of hsa-mir-221 with two genes, namely RPS24 and RPLP0.

In addition to the interactions discussed above, other validated miRNA-gene interactions are listed in Table 4 along with the experimental methods used for their validation, the miRTarBase ID and suitable references.

Also, an extensive literature search showed that all 12 miRNAs present in the most significant module generated by RFCM3 regulate their target genes in some type of cancer, and 7 of them regulate their target genes specifically in cervical cancer. Table 6 presents detailed information about these 12 miRNAs, their target genes in the specific cancer and the type of regulation, along with the PubMed ID (PMID) of the study.

5 CONCLUSION AND FUTURE DIRECTION
By representing the molecular interactions underlying biological processes, network biology paves the path to drug discovery, better understanding of disease mechanisms and cancer therapeutics [51]. miRNAs are mostly 21-23 nucleotide-long non-coding RNAs; these biological units, with low complexity, easy detection and high stability, can regulate gene expression at the translational level and also by mRNA degradation [6], [7]. miRNAs play fundamental roles in differentiation and development. They are also involved in biological mechanisms underlying tumorigenesis, which makes them promising biomarkers for several types of cancers. Discovery and accurate characterization of miRNA-mRNA regulatory modules in cervical cancer has become feasible due to the availability of miRNA-miRNA functional similarity and of miRNA and mRNA expression profiles from the same patients.

In this article, an algorithm has been proposed; namely, relevant and functionally consistent miRNA-mRNA module
TABLE 5
Experimental validation of some of the miRNA-mRNA interactions obtained by RFCM3
Note: HITS-CLIP: high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation; PAR-CLIP: photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation; ELISA: enzyme-linked immunosorbent assay; qRT-PCR: quantitative real-time polymerase chain reaction; CLASH: crosslinking, ligation, and sequencing of hybrids.
Barrientos, J. G. Tamez-Pena, and V. Trevino, “SurvExpress: An Online Biomarker Validation Tool and Database for Cancer Gene Expression Data Using Survival Analysis,” PLoS One, vol. 8, no. 9, p. e74250, 2013.
[41] M. Holcomb, Y. H. Ding, D. Dai, R. J. McDonald, J. S. McDonald, D. F. Kallmes, and R. Kadirvel, “RNA-Sequencing Analysis of Messenger RNA/MicroRNA in Rabbit Aneurysm Model Identifies Pathways and Genes of Interest,” American Journal of Neuroradiology, vol. 36, no. 9, pp. 1710–1715, 2015.
[42] J. Yang, M. E. Brown, H. Zhang, M. Martinez, Z. Zhao, S. Bhutani, S. Yin, D. Trac, J. Jeff, and M. E. Davis, “High-throughput Screening Identifies MicroRNAs that Target Nox2 and Improve Function after Acute Myocardial Infarction,” American Journal of Physiology-Heart and Circulatory Physiology, vol. 312, no. 5, pp. H1002–H1012, 2017.
[43] R. M. Spengler, X. Zhang, C. Cheng, J. M. McLendon, J. M. Skeie, F. L. Johnson, B. L. Davidson, and R. L. Boudreau, “Elucidation of Transcriptome-wide MicroRNA Binding Sites in Human Cardiac Tissues by Ago2 HITS-CLIP,” Nucleic Acids Research, vol. 44, no. 15, pp. 7120–7131, 2016.
[44] S. Memczak, M. Jens, A. Elefsinioti, F. Torti, J. Krueger, A. Rybak, L. Maier, S. D. Mackowiak, L. H. Gregersen, M. Munschauer, A. Loewer, U. Ziebold, M. Landthaler, C. Kocks, F. le Noble, and N. Rajewsky, “Circular RNAs are a Large Class of Animal RNAs with Regulatory Potency,” Nature, vol. 495, no. 7441, pp. 333–338, 2013.
[45] A. W. Whisnant, H. P. Bogerd, O. Flores, P. Ho, J. G. Powers, N. Sharova, M. Stevenson, C. H. Chen, and B. R. Cullen, “In-depth Analysis of the Interaction of HIV-1 with Cellular MicroRNA Biogenesis and Effector Mechanisms,” mBio, vol. 4, no. 2, p. e000193, 2013.
[46] A. Helwak, G. Kudla, T. Dudnakova, and D. Tollervey, “Mapping the Human miRNA Interactome by CLASH Reveals Frequent Noncanonical Binding,” Cell, vol. 153, no. 3, pp. 654–665, 2013.
[47] M. Hafner, M. Landthaler, L. Burger, M. Khorshid, J. Hausser, P. Berninger, A. Rothballer, M. J. Ascano, A. C. Jungkamp, M. Munschauer, A. Ulrich, G. S. Wardle, S. Dewell, M. Zavolan, and T. Tuschl, “Transcriptome-wide Identification of RNA-binding Protein and MicroRNA Target Sites by PAR-CLIP,” Cell, vol. 141, no. 1, pp. 129–141, 2010.
[48] J. Ji, L. Zhao, A. Budhu, M. Forgues, H. L. Jia, L. X. Qin, Q. H. Ye, J. Yu, X. Shi, Z. Y. Tang, and X. W. Wang, “Let-7g Targets Collagen Type I Alpha2 and Inhibits Cell Migration in Hepatocellular Carcinoma,” Journal of Hepatology, vol. 52, no. 5, pp. 690–697, 2010.
[49] R. A. Yacoub, I. O. Fawzy, R. A. Assal, K. A. Hosny, A. N. Zekri, G. Esmat, H. M. E. Tayebi, and A. I. Abdelaziz, “miR-34a: Multiple Opposing Targets and One Destiny in Hepatocellular Carcinoma,” Journal of Clinical and Translational Hepatology, vol. 4, no. 4, pp. 300–305, 2016.
[50] E. Gottwein, D. L. Corcoran, N. Mukherjee, R. L. Skalsky, M. Hafner, J. D. Nusbaum, P. Shamulailatpam, C. L. Love, S. S. Dave, T. Tuschl, U. Ohler, and B. R. Cullen, “Viral MicroRNA Targetome of KSHV-infected Primary Effusion Lymphoma Cell Lines,” Cell Host & Microbe, vol. 10, no. 5, pp. 515–526, 2011.
[51] I. Koturbash, F. J. Zemp, I. Pogribny, and O. Kovalchuk, “Small Molecules with Big Effects: The Role of the MicroRNAome in Cancer and Carcinogenesis,” Mutation Research, vol. 722, no. 2, pp. 94–105, 2011.

Sushmita Paul received the BSc, MSc, and PhD degrees from University of Rajasthan, Banasthali Vidyapith, and University of Calcutta, respectively. After obtaining her PhD degree in January 2014, she was associated with Indian Statistical Institute, Kolkata as Visiting Scientist (January 2014 to July 2014). Later, she joined the University Hospital Erlangen, Germany as a Postdoctoral Research Fellow (August 2014 to December 2015). She also worked as Scientist at TCS Innovation Labs Kolkata, India (December 2015 to May 2016). Currently, she is an assistant professor in the Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, India. Her research interests include computational biology and bioinformatics, pattern recognition, soft computing, and so forth. She has published more than 30 papers in international journals and conferences, and 2 book chapters. She is a co-author of a book published by Springer-Verlag, London, and also a reviewer of many international journals. Dr. Paul was a recipient of the 2017 Early Career Research Award from Science and Engineering Research Board, Department of Science and Technology, Government of India, the 2017 Bioclues Innovation, Research and Development (BIRD) award, and the 2009 Best Paper Award of the International Conference on Information Technology from the Orissa Information Technology Society, India.

Madhumita received the BSc in Botany, Zoology and Chemistry and the MSc in Life Sciences with specialization in Bioinformatics from Regional Institute of Education, Bhubaneswar, India and Central University of Punjab, Bathinda, India, respectively. Currently, she is a PhD scholar at the Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, India. Her research interests include computational biology and bioinformatics, pattern recognition, soft computing, and so forth. Madhumita is a recipient of the Gold Medal from Central University of Punjab, Bathinda, India for outstanding performance during her MSc in the year 2017.