This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCBB.2019.2910851, IEEE/ACM Transactions on Computational Biology and Bioinformatics
IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2018 1

RFCM3: Computational Method for Identification of miRNA-mRNA Regulatory Modules in Cervical Cancer
Sushmita Paul and Madhumita

Abstract—Cervical cancer is a leading severe malignancy throughout the world. Molecular processes and biomarkers leading to tumor
progression in cervical cancer are either unknown or only partially understood. An increasing number of studies have shown that
microRNAs play an important role in tumorigenesis, so understanding the regulatory mechanism of miRNAs in the gene-regulatory network
will help elucidate the complex biological processes that occur during malignancy. Functional genomics data provide opportunities to
study aberrant microRNA-messenger RNA (miRNA-mRNA) interactions. Identification of miRNA-mRNA regulatory modules will aid in
deciphering the aberrant transcriptional regulatory network in cervical cancer but is computationally challenging. In this regard, an
algorithm, termed relevant and functionally consistent miRNA-mRNA modules (RFCM3), is proposed. It integrates miRNA and
mRNA expression data of cervical cancer to identify potential miRNA-mRNA modules. It selects a set of miRNA-mRNA modules
by maximizing the relation of mRNAs with a miRNA and the functional similarity between the selected mRNAs. Later, using knowledge of the
miRNA-miRNA synergistic network, different modules are fused, and finally a set of modules is generated containing several miRNAs
as well as mRNAs. This type of module explains the underlying biological pathways containing multiple miRNAs and mRNAs. The
effectiveness of the proposed approach over other existing methods has been demonstrated on miRNA and mRNA expression data
of cervical cancer with respect to enrichment analyses and other standard metrics. The prognostic value of the genes in a module
with respect to cervical cancer is also demonstrated. The approach was found to generate more robust, integrated, and functionally
enriched miRNA-mRNA modules in cervical cancer.

Index Terms—Cervical cancer, Biomarkers, Regulatory network, Modules, Algorithm, Functionally enriched.

• The authors are with the Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, India.
E-mail: {sushmitapaul, madhumita.1}@iitj.ac.in

1545-5963 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

1 INTRODUCTION

Cervical cancer is the third most commonly diagnosed cancer and the second leading cause of cancer death in females worldwide. It accounts for 9% (529,800) of the total new cancer cases and 8% (275,100) of the total cancer deaths among females [1]. According to the National Cancer Institute, females suffering from regional and distant cervical cancer have 5-year survival rates of 57% and 16%, respectively, whereas the 5-year survival rate is 91% if the disease is diagnosed at a localized stage. Generally, it is diagnosed late after its onset, that is, between the ages of 35 and 44. Studies suggest that about 15% of cases are detected in women over 65 years of age. Nevertheless, cases have been reported in which women under the age of 20 are also diagnosed with cervical cancer. Unfortunately, despite all the recent advances in diagnosis and treatment methods, the prognosis of cervical cancer patients remains poor [1]–[3]. Therefore, exploring the molecular mechanisms related to tumor progression of cervical cancer is very important and might provide novel biomarkers to predict and improve the prognosis of patients, which ultimately would lead to better therapeutics. Identification of microRNA (miRNA)-messenger RNA (mRNA) regulatory modules (MMRMs) will aid in deciphering the aberrant transcriptional regulatory networks in cervical cancer and will help to understand the role of already known biomarkers like miRNAs and mRNAs in various pathways involved in cervical cancer.

miRNAs are regulatory non-coding RNAs of approximately 21-23 nucleotides. They are found in eukaryotes. miRNA expression takes place at specific stages of tissue development or cell differentiation and also plays a deterministic role at the post-transcriptional level by affecting the expression of a variety of genes. miRNAs base-pair with their target mRNAs and then degrade the targeted transcripts or suppress their translation [4], [5]. A single miRNA can simultaneously regulate the expression levels of more than a dozen mRNAs. Also, any given mRNA sequence may be targeted by several different miRNAs [6]–[8]. Moreover, miRNAs are found to be involved in the pathogenesis of a broad spectrum of human diseases, including cancer, inflammation, and other pathophysiological networks [4], [9], [10]. Several genes and pathways are involved in various regulation mechanisms of cancer. However, the regulation of genes by miRNAs has drawn particular attention [11]. Experimental studies for finding relationships between miRNAs and their mRNA targets are very difficult. The most frequent procedures are based on seed sequence complementarity, evolutionary conservation, and thermodynamic stability [11], [12]. Hence, to overcome this known level of complexity, computational prediction of putative miRNA-mRNA targets has emerged as a complementary approach to facilitate the experimental characterization of relevant miRNAs. However, the rate of false positives
obtained from computational prediction may increase, which requires additional steps to remove them. To regulate the targeted genes, miRNAs bind to the 3 prime UTR regions of those genes. Keeping this in view, several methods have been developed that focus on identifying conserved sequence regions between miRNAs and mRNAs [13]. However, sequence-based approaches cannot identify functional changes in genes and also generate many false positive binding sites. Integrated miRNA and related mRNA analyses in different types of cancer have been the focus of many studies [14]–[20]. To identify potential interactions between miRNAs, mRNAs, and pathways involved in cancer development, many studies used large-scale miRNA and mRNA expression profile datasets [14]–[16].

Signaling pathways or biological functions associated with a specific cancer can be significantly altered by changes in miRNA-mRNA relationships. With the advances in high-throughput technologies, large-scale miRNA and mRNA expression datasets from the same set of patients have become available due to collaborative efforts such as The Cancer Genome Atlas (TCGA) project [21], [22]. Proper inspection and biological analysis of the expression data of both types of biomarkers, namely, miRNA and mRNA, can be helpful in the identification of disease-associated miRNA-mRNA regulatory modules.

Although a few algorithms for finding miRNA-mRNA modules have been proposed, improvements are still needed. Most of these approaches need data integration at several steps in the form of gene-gene interactions, miRNA-gene interactions, and transcription factor-gene interactions derived from information-based databases. Mirsynergy [23] uses TargetScanHuman 6.2 [24] for the miRNA-target site matrix, and TRANSFAC [25] and BioGrid [26] for gene-gene interactions including transcription factors and protein-protein interactions, respectively. Similarly, SNMNMF [14] uses the MicroCosm website (http://www.ebi.ac.uk/enright-srv/microcosm/htdocs/targets/v5/) for miRNA-gene interactions and TRANSFAC [25] for DNA-protein interactions, and combines protein-protein interaction data obtained from Bossi and Lehner (2009) [27]. MAGIA [28] is a web-based tool, which also uses the TargetScan/PicTar/PITA [24], [29], [30] databases for incorporating miRNA-target site interactions in its different algorithms. The above-mentioned approaches have high time complexity and are executed in multiple steps rather than in an automatic fashion. In this regard, this paper presents a mutual information based method, termed relevant and functionally consistent miRNA-mRNA module (RFCM3), for selection of important MMRMs in cervical cancer. The proposed method selects regulatory modules in an automatic manner, which requires less time. Module selection is done by maximizing relatedness between the miRNA and genes as well as by maximizing functional similarity between the genes of that module. Later, a miRNA functional similarity matrix (MISIM data for cervical cancer [31]) is used. This helps the proposed algorithm produce significant modules having multiple miRNAs and mRNAs. Implementation of the proposed algorithm could help in identifying functional relationships between these biomarkers, which will unravel key mechanisms involved in cervical cancer pathogenesis.

2 DATA SETS USED

The miRNA and mRNA expression data sets of cervical cancer are downloaded from TCGA (https://tcga-data.nci.nih.gov/tcga/). The mRNA expression dataset, TCGA CESC exp HiSeqV2 (2015-02-24), contains expression of 20,130 genes in 308 samples, and the miRNA expression dataset, TCGA CESC miRNA-HiSeq (2015-02-24), contains expression of 333 miRNAs in the same set of 308 samples. For incorporating miRNA-miRNA synergistic interaction information, the pairwise MISIM functional similarity of 270 miRNAs involved in cervical cancer is used [31].

3 PROPOSED MIRNA-MRNA MODULE IDENTIFICATION ALGORITHM

Various studies have suggested that gene expression in both plants and animals is essentially regulated by one or sometimes more than one miRNA. miRNAs target mRNAs to regulate biological processes. In gene regulatory networks both miRNAs and mRNAs interact. Studying combined relationships between groups of miRNAs and groups of mRNAs reveals more in-depth information about gene regulation and cell functions than studying individual miRNA and mRNA relationships. Various studies have been conducted to identify MMRMs with a common aim to elucidate miRNA-mRNA regulatory relationships. However, the combined relationships between a group of miRNAs and the group of targeted mRNAs are not considered by existing methods in the process of discovering MMRMs. Accordingly, a method is required that can discover a regulatory module containing multiple miRNAs and mRNAs. In this paper, a mutual information based approach is proposed to identify regulatory modules containing multiple miRNAs and mRNAs that share a certain level of relationship. A diagrammatic representation of the work-flow of the proposed algorithm RFCM3 is shown in Figure 1.

3.1 Relevant and Functionally Consistent miRNA-mRNA Module

Let $X = \{X_1, \cdots, X_i, \cdots, X_j, \cdots, X_p\}$ denote a set of miRNAs and $Y = \{Y_1, \cdots, Y_i, \cdots, Y_j, \cdots, Y_q\}$ denote a set of mRNAs such that $X \cap Y = \emptyset$. Given datasets $D_X$ and $D_Y$ having n matching miRNA and mRNA expression samples, the objective of the current study is to identify $X^* \subset X$ and $Y^* \subset Y$ such that $X^*$ and $Y^*$ are related; as a result, the miRNAs in $X^*$ collaboratively interact with the mRNAs in $Y^*$ and vice versa.

For discovery of miRNA-mRNA regulatory modules, a two-stage approach, namely, Relevant and Functionally Consistent miRNA-mRNA Module (RFCM3), is proposed in this paper. Two measures, a mutual information based algorithm and miRNA-miRNA synergistic interaction, are incorporated in the two stages, respectively. The mutual information based approach is described first, and then the miRNA-miRNA synergistic interaction information and its integration in the current framework are presented.

3.1.1 Mutual Information for Identifying Relationships Between Multiple mRNAs and a miRNA

This section describes the method that is used to generate a star shaped module containing a single miRNA
and multiple mRNAs, where each mRNA in a module is related to the corresponding miRNA and the mRNAs of that module are simultaneously functionally similar. Given the matrices of miRNA expression and gene expression, a decision matrix is created first. A decision matrix contains a class label attribute and conditional attributes. Here, the expression values of each miRNA are discretized and later used as the class label. All the expression values of genes are considered as features or conditional attributes. The rows represent samples. Therefore, a total of 333 decision matrices are created, each having 308 rows, 20,130 columns, and one class label. For each miRNA, a set of mRNAs is selected by applying the RFCM3 algorithm. Next, the RFCM3 algorithm is described.

[Figure 1 diagram; visible labels: RFCM3, Relatedness, Functional Similarity]
Fig. 1. Diagrammatic representation of RFCM3 work-flow. Given the inputs of same sampled miRNA and mRNA expression profiles, RFCM3 first generates star shaped modules containing a single miRNA and multiple mRNAs. It selects miRNAs by maximizing relatedness between miRNA and mRNA and also functional similarity between the mRNAs of the module. RFCM3 uses mutual information for identifying relationships between multiple mRNAs and a miRNA. At a later stage, in order to remove insignificant modules and to merge the significant ones, miRNA-miRNA synergistic interaction information is incorporated. Modules are merged rigorously considering their miRNA-miRNA similarity, where most of the modules are functionally enriched. Finally, multiple modules are generated containing more than one miRNA and multiple genes.

The RFCM3 algorithm selects a set of mRNAs $\Theta$ from a given data set $C = \{Y_1, \cdots, Y_i, \cdots, Y_j, \cdots, Y_p\}$ of p mRNAs. The relevance of a mRNA quantifies the correlation of the mRNA with respect to the class label or miRNA. It also indicates the dependency of the class label $X$ on an attribute. Here, the relevance of the mRNA $Y_i$ with respect to the class label/miRNA $X$ is defined as $\hat{f}(Y_i, X)$, whereas $\tilde{f}(Y_i, Y_j)$ is defined as the functional similarity of the mRNA $Y_j$ with respect to the mRNA $Y_i$. In this study, for calculation of both relevance and functional similarity, mutual information is used [32].

The relevance $\hat{f}(Y_i, X)$ of a mRNA $Y_i$ with respect to the class label or miRNA $X$ using mutual information can be computed as follows:

$$\hat{f}(Y_i, X) = I(Y_i, X), \tag{1}$$

where $I(Y_i, X)$ represents the mutual information between attribute/mRNA $Y_i$ and miRNA or class label $X$, which is given by

$$I(Y_i, X) = H(Y_i) - H(Y_i \mid X). \tag{2}$$

Here, $H(Y_i)$ and $H(Y_i \mid X)$ represent the entropy of mRNA $Y_i$ and the conditional entropy of $Y_i$ given class label $X$, respectively. The entropy is a measure of uncertainty. Similarly, the functional similarity between two mRNAs $Y_i$ and $Y_j$ can be computed by calculating the mutual information between them as given below:

$$I(Y_i, Y_j) = H(Y_i) - H(Y_i \mid Y_j). \tag{3}$$

The total relevance of all selected mRNAs and the total functional similarity among the selected mRNAs are, therefore, given by

$$J_{relev} = \sum_{Y_i \in \Theta} \hat{f}(Y_i, X), \qquad J_{simi} = \sum_{Y_i \neq Y_j \in \Theta} \tilde{f}(Y_i, Y_j). \tag{4}$$

For identification of a miRNA-mRNA module, first of all, a decision matrix is created for each miRNA. The decision matrix contains genes or mRNAs as conditional attributes and the miRNA as class label. The rows are samples. The RFCM3 algorithm is applied to each decision matrix to identify the genes or mRNAs that are associated with that particular miRNA.

Algorithm: Mutual Information for Identifying Relationships Between Multiple mRNAs and a miRNA

1) Let $C \leftarrow \{Y_1, \cdots, Y_i, \cdots, Y_j, \cdots, Y_m\}$, $\Theta \leftarrow \emptyset$.
2) Calculate the relevance $\hat{f}(Y_i, X)$ of each mRNA $Y_i \in C$ with respect to the class label or miRNA $X$.
3) Select the mRNA $Y_i$ that has the highest value of $\hat{f}(Y_i, X)$ as the most relevant mRNA. In effect, $Y_i \in \Theta$ and $C = C \setminus Y_i$.
4) Repeat the following two steps until the desired number of mRNAs is selected for a module.
5) If $\tilde{f}(Y_i, Y_j) < \delta$, remove $Y_j$ from $C$.
6) From the remaining mRNAs of $C$, select the gene $Y_j$ that maximizes the following condition:

$$0.5 * \hat{f}(Y_j, X) + 0.5 * \frac{1}{|\Theta|} \sum_{Y_i \in \Theta} \tilde{f}(Y_i, Y_j). \tag{5}$$

As a result, $Y_j \in \Theta$ and $C = C \setminus Y_j$.

Mutual information is used to compute both the relevance and the functional similarity of a mRNA. The relevance and functional similarity of a mRNA are calculated using (1) and (3), respectively.
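
The selection procedure of this section can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' C implementation: the function names (`discretize`, `entropy`, `mutual_information`, `select_module`) are hypothetical, the population standard deviation is assumed for the three-level discretization (values above µ + σ become 1, values below µ − σ become −1, the rest 0), entropies are taken in bits, and the threshold test of step 5 is assumed to be applied against the most recently selected mRNA.

```python
import math
import statistics
from collections import Counter


def discretize(values):
    """Map continuous expression values to -1 / 0 / 1 around mean +/- one std.

    Assumption: population standard deviation (the text does not say which).
    """
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [1 if v > mu + sigma else -1 if v < mu - sigma else 0 for v in values]


def entropy(values):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(a, b):
    """I(A, B) = H(A) - H(A | B), with H(A | B) = H(A, B) - H(B)."""
    joint = entropy(list(zip(a, b)))
    return entropy(a) - (joint - entropy(b))


def select_module(mrnas, mirna, size, delta):
    """Greedy selection of up to `size` mRNAs for one miRNA (steps 1 to 6).

    mrnas: dict mapping an mRNA name to its discretized expression profile.
    mirna: discretized expression profile of the miRNA (the class label).
    """
    relevance = {g: mutual_information(v, mirna) for g, v in mrnas.items()}
    candidates = list(mrnas)                       # step 1: candidate set C
    first = max(candidates, key=relevance.get)     # step 3: most relevant mRNA
    selected = [first]                             # the selected set Theta
    candidates.remove(first)
    sim_sum = dict.fromkeys(candidates, 0.0)       # running sums of similarity

    while candidates and len(selected) < size:     # step 4
        last = selected[-1]
        kept = []
        for g in candidates:                       # step 5: threshold on delta
            s = mutual_information(mrnas[last], mrnas[g])
            if s >= delta:
                sim_sum[g] += s
                kept.append(g)
        candidates = kept
        if not candidates:
            break
        # step 6: equal-weight mix of relevance and mean similarity, as in (5)
        best = max(
            candidates,
            key=lambda g: 0.5 * relevance[g] + 0.5 * sim_sum[g] / len(selected),
        )
        selected.append(best)
        candidates.remove(best)
    return selected
```

Keeping a running sum of similarities per candidate avoids recomputing the mutual information against every already selected mRNA in each iteration, which matters when a decision matrix has 20,130 candidate columns.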


The expression values of both miRNA and mRNA in the data are continuous in nature. Continuous expression values of a miRNA and mRNA need to be discretized in order to calculate the relevance of a mRNA with respect to a miRNA or clinical outcome using mutual information. The marginal probabilities and the joint probability are computed using the discretized expression values of a mRNA and miRNA. These probabilities are later used to compute the mRNA-class/miRNA relevance. Therefore, discretization of continuous valued miRNAs and mRNAs is a vital step in the current study. The discretization method mentioned in [33] is used. This method discretizes the expression values of a miRNA or mRNA using the mean µ and standard deviation σ computed over the n expression values of that particular miRNA or mRNA. Values greater than (µ + σ) are represented as 1, values between (µ − σ) and (µ + σ) as 0, and values smaller than (µ − σ) as −1. These three values correspond to over-expression, baseline, and under-expression of the miRNAs or mRNAs, respectively.

3.1.2 Incorporation of miRNA-miRNA Synergistic Interaction Information

From the expression data of 20,130 genes and 333 miRNAs in 308 samples of cervical cancer, 333 miRNA-mRNA regulatory modules were generated by maximizing relatedness between miRNAs and genes as well as by maximizing functional similarity between the genes. Each module contained one miRNA and a maximum of fifty interacting genes. Further, to find interactions between the miRNAs of different star shaped modules, miRNA similarity matrix (MISIM) [31] information was incorporated in the algorithm. This helped in generating modules that are highly correlated to cervical cancer and contain more than one miRNA. The MISIM matrix is generated based on the concept that genes with similar functions are often associated with similar diseases, and the relationship of different diseases can be represented by a directed acyclic graph (DAG) structure, which is also true for miRNAs. Therefore, it is feasible to infer miRNA functional similarity by measuring the similarity of their associated disease DAGs. Moreover, using MISIM information, insignificant modules were also dropped from further analyses. The MISIM is a matrix containing miRNA-miRNA interaction similarity scores. The miRNAs in the matrix are related to cervical cancer and the score value is normalized between 0 and 1, where 0 means no interaction and 1 means highly interactive.

4 EXPERIMENTAL SETUP

In the present research work, the performance of the proposed relevant and functionally consistent miRNA-mRNA module (RFCM3) algorithm has been compared with that of Mirsynergy [23], SNMNMF [14], weighted correlation network analysis (WGCNA) [34], and MAGIA (PITA algorithm) [28]. The STRING database [21] was used to generate the gene-gene interaction (GGI) data matrix H and the gene-gene interaction network for Mirsynergy and SNMNMF, respectively. The TransmiR database [35] was used to generate the miRNA-target site matrix W and the miRNA-gene interactions for Mirsynergy and SNMNMF, respectively. The major metrics for evaluating the performance of different algorithms are functional consistency, functional enrichment, and survival analysis. The description of each metric is provided where it is introduced in the manuscript. A literature study of some of the interactions is also presented. The proposed RFCM3 algorithm is implemented in C language and run in a LINUX environment having machine configuration iCore i7-6700CPU, 3.6 GHz, 6 MB cache, and 4 GB RAM. The source code of the proposed algorithm and the supplementary information are available at http://home.iitj.ac.in/~sushmitapaul/CBL/softwares.html.

4.1 Optimum Value of Threshold δ

The proposed algorithm iteratively computes the functional similarity of each mRNA with respect to the already selected mRNAs and selects the next mRNA for the module. If a mRNA has a functional similarity value with an already selected mRNA less than δ, it is not considered further. The value of δ has been varied from 0.1 to 0.5. It was observed that at 0.15 the algorithm could generate modules having the maximum KPES (KEGG Pathway Enrichment Score) value, as shown in Figure 2. At this value of δ the algorithm was able to generate the maximum number of modules enriched by significant KEGG pathway terms. Therefore, the optimal value of δ was set to 0.15. KPES is used here to evaluate the functional consistency of the modules. The higher the value of KPES, the more significant the modules. It is calculated by multiplying the total number of enriched modules (M) by the sum of -log10(FDR) over all significant KEGG pathway terms (K), divided by the total number of significant KEGG pathway terms. FDR (false discovery rate) is the ratio of the number of false positive results to the number of total positive test results (the expected proportion of type I errors). In the current study the DAVID annotation tool [22], [36] was used to do the enrichment analysis. The expression of KPES is as follows:

$$KPES = \frac{M}{K} \sum_{k=1}^{K} -\log_{10}(FDR_k) \tag{6}$$

4.2 Optimum Value of MISIM Threshold

The miRNA-miRNA similarity value in the MISIM matrix varies from 0 to 1. The choice of cut-off value plays an important role in fusing modules. Hence, it has a direct influence on the performance of the proposed algorithm. In this study, a range of score values (0.7 to 1) was selected as cut-offs for merging multiple miRNA-mRNA modules generated by RFCM3. Modules whose miRNAs shared an interaction similarity score higher than the cut-off value were fused. As a result, insignificant modules having low interaction similarity scores were filtered out. Finally, multiple modules were generated containing more than one miRNA and multiple genes. Table 1 presents the number of modules generated at different MISIM cut-off score values. The miRNA-mRNA networks were generated using Cytoscape version 3.6.1 [37].

Based on the functional enrichment analysis, the optimum value of the cut-off score was later decided to be >= 0.7, as
from Table 1 it is seen that at this cut-off value KPES is maximum, with the maximum number of miRNAs and mRNAs and 63.63% of the modules enriched. The 22 modules generated at this cut-off value were found to be highly related to cervical cancer processes. As shown in Table 1, KPES is maximum at the cut-off score of >= 0.7.

Fig. 2. Variation in KPES for different values of threshold δ

TABLE 1
Modules Description at Different Cut-off Values

Cut-off value   M    EM   miR   mR    KPES
0.7             22   14   92    942   22.7
0.8             22   12   60    718   22.3
0.9             13   9    38    592   16.6
1.0             10   6    29    415   22.5

Note: M: Total number of modules generated; EM: Number of functionally enriched modules; miR: Total number of miRNAs present in all the modules; mR: Total number of mRNAs present in all the modules; KPES: KEGG Pathway enrichment score.

Figure 3 represents the interactions between 92 miRNAs across 22 modules, which were selected as interacting in the MISIM data at the cut-off >= 0.7. On this basis, the 333 modules developed by the proposed RFCM3 algorithm were merged together to generate the final 22 modules containing the interacting miRNAs and their respective interacting genes that are associated with cervical cancer. A few of the 333 modules were dropped from further analysis. After the functional enrichment of all the 22 modules generated by the proposed method, the most significant module is decided on the basis of the number of significant pathway terms associated with the miRNAs and mRNAs of the respective module. The significance of a pathway term is decided on the basis of its relatedness with the HPV infection pathway (hsa05165) or any other cervical cancer related pathway having p-value less than 0.05. The higher the number of significant pathway terms associated with a module, the more significant the module. Figure 4 represents one of the most significant modules, having 12 miRNAs and 242 genes; miRNA-miRNA as well as miRNA-mRNA interactions are clearly visible and some of the genes are also regulated by more than one miRNA.

Fig. 3. miRNA interactions across all modules at cut-off >= 0.7 generated by RFCM3

Fig. 4. miRNA-mRNA regulatory interactions of the most significant module generated by RFCM3

4.3 Comparison of Module Sizes and Connectivities

At the optimum value of the MISIM cut-off score, the RFCM3 algorithm generated 22 modules from the cervical cancer dataset, whereas Mirsynergy identified more modules than RFCM3. However, in comparison with the modules generated by Mirsynergy, the modules identified by RFCM3 were clearly more densely connected and are perhaps more consistent with the intricacy of the underlying biological network. The k value for running SNMNMF was set to 50, so the algorithm identified 50 modules. Out of 50 modules, 6 modules were empty, 46 modules were devoid of any miRNA (miRNA empty), and 6 modules contained only miRNAs without any mRNA (mRNA empty). Only 4 of the modules contained both miRNA and mRNA, forming star shaped modules, as each of these modules contained 1 miRNA and 9, 13, 2, and 4 mRNAs, respectively. On the
other hand, WGCNA and MAGIA generated 4 and 18 modules, respectively. Table 2 presents the details of the modules fetched by different methods in terms of number of modules, average number of miRNAs and genes per module, percentage of modules having protein-protein interaction (PPI) enrichment P-value < 0.05, and KPES score. The average numbers of miRNAs per module in RFCM3 and Mirsynergy are very close, and all the modules generated by both methods have two or more miRNAs. Moreover, the maximum number of miRNAs in a single module generated by RFCM3 is 14. In contrast, one of the modules obtained by Mirsynergy contained a maximum of 13 miRNAs. There is a huge difference between the average number of genes per module in RFCM3 and Mirsynergy. Notably, there are considerably fewer genes in the Mirsynergy modules than in the modules identified by RFCM3, which shows a denser network generated by RFCM3 responsible for a specific biological process or pathway. This is supported by the functional enrichment analysis described later in the manuscript.

TABLE 2
Performance Summary of RFCM3, Mirsynergy, SNMNMF, WGCNA and MAGIA Web Tool

Method       M      Avg.miR   Avg.mR   PPI    KPES     Time (ms)
Mirsynergy   37.0   4.2       9.1      72.9   -10.30   3.74E+06
SNMNMF       4.0    1.0       7.0      25.0   -2.85    2.46E+06
WGCNA        4.0    83.7      216.2    0.0    -6.50    2.31E+06
MAGIA        18.0   1.0       1.7      0.0    NIL      1.40E+05
RFCM3        22.0   4.1       81.2     77.2   22.70    2.32E+06

Note: M: number of modules; Avg.miR and Avg.mR: average number of miRNAs and mRNAs per module; PPI: Percentage of modules having PPI enrichment p-value < 0.05; KPES: KEGG Pathway enrichment score; Time: running time in milliseconds.
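
The KPES values compared across methods follow the definition in (6). A minimal Python sketch of that formula is given below for orientation; the function name `kpes` and its arguments are hypothetical, the FDR values in the usage note are made up for illustration and are not taken from the study, and M is read as the number of enriched modules, following the wording of Section 4.1.

```python
import math


def kpes(num_enriched_modules, fdr_values):
    """KEGG Pathway Enrichment Score as written in (6).

    num_enriched_modules: M, read here as the number of enriched modules.
    fdr_values: FDR of each of the K significant KEGG pathway terms.
    """
    k = len(fdr_values)
    if k == 0:
        # With no significant term the average in (6) is undefined.
        raise ValueError("KPES needs at least one significant KEGG term")
    total = sum(-math.log10(fdr) for fdr in fdr_values)
    return (num_enriched_modules / k) * total
```

For example, `kpes(2, [0.01, 0.001])` averages the two terms 2 and 3 and scales by M = 2, giving a score of about 5.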
To further illustrate the interactions among the selected set of genes in a module, protein-protein interaction networks (PPINs) were built from the results generated by the STRING database [21]. It builds a connection between two genes/proteins based on strong sequence-based similarity, very frequent co-occurrence in documents, and similar results in experiments between the two proteins. In this study, experiment-based information has been used to create connections between two proteins/genes. Finally, it generates a network whose significance is assessed by P-value < 0.05. The percentages of modules that generated significant PPI networks are also compared. From Table 2 it is seen that the proposed RFCM3 algorithm generates more significant PPI networks compared to other methods. This indicates that the genes present in most of the RFCM3 modules were strongly related to each other and carry out a specific biological function in cervical cancer. The PPI networks of the most significant modules generated by RFCM3 and Mirsynergy are presented in Figures 5 and 6, respectively. As shown in Table 2, KPES is also maximum in the case of modules generated by RFCM3 (22.7), whereas Mirsynergy, SNMNMF, and WGCNA gave negative values, which suggests that the other methods could not identify functionally enriched modules.

Fig. 5. PPINs generated by STRING for the most significant module of RFCM3 (p-value < 1.0E-16)

Fig. 6. PPINs generated by STRING for the most significant module of Mirsynergy (p-value < 1.81E-05)

As far as the running time of the different algorithms is concerned, as shown in Table 2, RFCM3 consumes less time (2.32E+06 ms) to generate significant modules than Mirsynergy (3.74E+06 ms) and SNMNMF (2.46E+06 ms), whereas WGCNA (2.31E+06 ms) and RFCM3 take nearly the same time. The MAGIA web tool takes much less time (1.40E+05 ms) but is not able to generate any significant module.

4.4 Evaluating Modules by Functional Enrichments

For the biological interpretation of modules generated by different algorithms, pathway enrichment analysis of the genes was done using DAVID: Functional Annotation Tool. 14 out of 22 modules generated by RFCM3 and only 8 out of 37 modules generated by Mirsynergy were found to be significantly enriched by KEGG pathway terms related to one of the main cervical cancer pathways, that is, the human papilloma virus (HPV) infection pathway (hsa05165), having P-value < 0.05. Figure 7 clearly shows that there is a huge difference between the total modules generated and the enriched modules in the case of Mirsynergy (8/37) compared to RFCM3 (14/22). Although all the modules generated by WGCNA were enriched by pathway terms, none of them were related to hsa05165 or any other cervical cancer related pathways. Similarly, for SNMNMF, only two out of the 4 modules were enriched, but the terms were not related to hsa05165 or any other cervical cancer related pathways. In the case of MAGIA, not a single module out of 18 modules was
1545-5963 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCBB.2019.2910851, IEEE/ACM
Transactions on Computational Biology and Bioinformatics
IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2018 7

enriched. Modules generated by MAGIA were not taken Total Vs Enriched Modules
into further consideration because none of the module was 40
Total
Enriched
enriched. Examining enriched pathway terms exclusive to 35
modules generated by RFCM3 revealed several interesting
HPV infection pathway or cervical cancer related terms, 30

e.g. MAPK signaling pathway, Jak-STAT signaling pathway,

Number of Modules
25
p53 signaling pathway, and so forth. Table 3 summarizes
list of significant KEGG pathway terms from modules gen- 20

erated by RF CM 3 and Mirsynergy as a result of gene 15


pathway enrichment analysis. Mirsynergy revealed only 2
such pathway terms, which were not present in the modules 10

generated by RF CM 3 but there were 10 such significant 5


pathway terms that were exclusive to modules generated by
RF CM 3 . Pathview tool [38] is used here for pathway based 0
Mirsynergy SNMNMF WGCNA MAGIA Proposed
data integration and visualization of the genes of different Methods

modules generated by the proposed method. The genes are


mapped on HPV infection pathway (hsa05165). Figure 8 Fig. 7. Comparison of total modules and enriched modules
represents mapped genes of the most significant module
generated by the proposed method. The genes highlighted TABLE 3
in red are those genes from the most significant module Comparative Pathway Enrichment of Genes
which plays important regulatory role in HPV infection.
ID Description RFCM3 Mirsynergy
For further supporting the biological function of the hsa04012 ErbB signaling pathway - 4.00E-02

identified modules, the RF CM 3 algorithm was compared hsa04350


hsa04014
TGF beta signaling pathway
Ras signaling pathway
-
7.24E-04
3.00E-02
-
with Mirsynergy, SNMNMF, and WGCNA on the basis of hsa04664
hsa04151
Fc epsilon RI signaling pathway
PI3K-Akt signaling pathway
1.00E-03
1.88E-05
-
-
number of distinct pathway terms generated by their gene hsa04010 MAPK signaling pathway 3.00E-02 -
hsa04115 p53 signaling pathway 2.00E-03 -
enrichment analysis. Each of the pathway terms generated hsa04630 Jak-STAT signaling pathway 2.00E-02 -
hsa04510 Focal Adhesion 2.28E-08 -
by all of the modules of a specific method were plotted hsa04612 Antigen processing and presentation 4.52E-17 -
hsa03050 Proteasome 5.00E-02 -
against its log10(FDR) value. Figure 9 shows that for most hsa05203 Viral carcinogenesis 3.00E-02 -
of the pathways in RF CM 3 the log10(FDR) values fall hsa05200
hsa04666
Pathways in Cancer
Fc gamma R-mediated phagocytosis
1.00E-02
4.00E-03
3.00E-02
2.00E-02
in the negative region whereas in the case of Mirsynergy, hsa04370 VEGF signaling pathway 4.00E-03 5.00E-03

WGCNA, and SNMNMF they lie in the positive region.
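
The enrichment statistics used in this subsection can be illustrated with a small Python sketch. It computes a hypergeometric tail probability for the overlap between a module and a pathway, adjusts a list of p-values with the Benjamini-Hochberg procedure, and takes log10 of the resulting FDR. This is only an approximation of the idea: DAVID relies on a modified Fisher exact test (EASE score), which is not reproduced here, and all counts, p-values, and function names below are invented for illustration. Note that log10(FDR) can be positive only when the FDR value exceeds 1, for example when it is reported as a percentage; the `as_percent` flag is an assumption added to show this.

```python
import math


def hypergeom_tail(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n belong to the pathway."""
    upper = min(n, N)
    total = math.comb(M, N)
    tail = sum(math.comb(n, i) * math.comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def log10_fdr(fdr, as_percent=False):
    """log10 of an FDR value, optionally expressed as a percentage first."""
    value = fdr * 100.0 if as_percent else fdr
    return math.log10(value)


# Hypothetical example: 20 background genes, 5 in the pathway,
# a module of 4 genes, 3 of which hit the pathway.
p = hypergeom_tail(3, 20, 5, 4)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

With an FDR expressed as a fraction, every value below 1 gives a negative logarithm, so the split into negative and positive regions is informative only on a scale where values above 1 are possible.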


The negative log10(FDR) values indicate that the pathway terms of the RFCM3 modules are very unlikely to have occurred by chance, whereas chance occurrence is possible for the pathway terms of Mirsynergy, SNMNMF, and WGCNA. To test for a significant difference between the numbers of distinct significant pathway terms generated by RFCM3, Mirsynergy, SNMNMF, and WGCNA, a one-sided Wilcoxon signed rank test was performed. It shows that RFCM3 produced a significantly higher number of distinct pathway terms than Mirsynergy (P-value = 8.18E-03) and WGCNA (P-value = 8.76E-04); the P-value obtained against SNMNMF was 2.09E-01.

miRNA pathway enrichment analysis was also done using the mirPath [39] web server for the modules generated by RFCM3 and Mirsynergy. All the modules in both cases were enriched by significant pathway terms (P-value < 0.05). Three of the significant pathway terms related to cervical cancer, namely Ras signaling pathway, FoXo signaling pathway, and Proteasome, were exclusive to the modules generated by RFCM3, whereas other pathway terms such as Endometrial Cancer, p53 signaling pathway, Wnt signaling pathway, and nine other terms were common to both methods. Table 4 summarizes the significant KEGG pathway terms from the modules generated by RFCM3 and Mirsynergy as a result of miRNA pathway enrichment analysis.

TABLE 4
Comparative Pathway Enrichment of miRNAs

ID        Description                       RFCM3     Mirsynergy
hsa04014  Ras signaling pathway             5.00E-03  -
hsa04068  FoXo signaling pathway            1.00E-02  -
hsa03050  Proteasome                        1.00E-02  -
hsa04115  p53 signaling pathway             7.05E-19  3.29E-10
hsa04012  ErbB signaling pathway            7.73E-16  9.67E-24
hsa04151  PI3K-Akt signaling pathway        2.60E-19  2.02E-21
hsa05213  Endometrial Cancer                3.00E-05  1.77E-09
hsa04370  VEGF signaling pathway            2.00E-03  1.26E-05
hsa04666  Fc gamma R-mediated phagocytosis  1.16E-05  1.02E-05
hsa04510  Focal Adhesion                    3.92E-23  3.92E-23
hsa04520  Adherens junction                 2.18E-06  2.07E-10
hsa04150  mTOR signaling pathway            2.72E-12  3.7E-12
hsa04530  Tight junction                    2.5E-16   2.00E-04
hsa04310  Wnt signaling pathway             3.72E-24  3.92E-23
hsa04010  MAPK signaling pathway            4.20E-10  2.58E-16

4.5 Prognostic Value of Modules in Cervical Cancer

The gene signatures of the most significant modules generated by Mirsynergy and RFCM3 were evaluated to assess their robustness, since robust signatures are frequently predictive of a patient's clinical outcome. Gene sets from the modules generated by the other methods (SNMNMF, WGCNA and MAGIA) were not considered for this analysis because none of the modules generated by them were found to be significant in the mRNA pathway enrichment analysis. In view of this, survival analysis was performed to check the predictive capability of the selected genes. The SurvExpress tool [40] was used for conducting the survival analysis. The clinical information of the CESC-TCGA cervical squamous cell carcinoma and endocervical adenocarcinoma data was selected as the relevant set of cervical cancer patient samples available in SurvExpress, which also reflected patients suitable for evaluating the biomarkers in this test. The samples were bifurcated


into a high-risk group (red curve) and a low-risk group (green curve) by taking the median of the average gene expression values. From the Kaplan-Meier plots (Figure 10), it was found that the gene sets selected by both the proposed method and Mirsynergy were able to separate the two risk groups significantly. From both panels, it can be deduced that the proposed method outperformed Mirsynergy in terms of hazard ratio, p-value of the hazard ratio, log-rank test, and concordance index. Therefore, the results indicate that the selected genes from the module of RFCM3 are highly correlated with patient survival. Hence they may be considered as potential prognostic markers.

Fig. 8. The human papillomavirus infection pathway (hsa05165) rendered by Pathview

Fig. 9. Significant pathways generated by Mirsynergy, SNMNMF, WGCNA, and RFCM3 (plot "Functional Enrichment Comparison": number of pathways versus log10(FDR) for Mirsynergy, SNMNMF, WGCNA, and the proposed method)

Fig. 10. Comparison of Kaplan-Meier curves for patients with cervical cancer plotted for combined expression of 20 genes obtained by Mirsynergy and 242 genes obtained by RFCM3: (a) Mirsynergy, (b) RFCM3

4.6 Biological Significance Analysis of miRNA-Target Interactions

This section presents experimental support for the miRNA-mRNA interactions obtained by the proposed RFCM3 algorithm from the cervical cancer dataset. An extensive literature search, together with the miRTarBase database (http://miRTarBase.mbc.nctu.edu.tw/), which contains experimentally validated miRNA-target interactions (MTIs), is used here to biologically validate the miRNA-mRNA interactions obtained by the RFCM3 algorithm. miRTarBase serves as an important repository of experimentally validated MTIs, which are frequently updated by manually surveying research articles. Among all the miRNA-mRNA interactions obtained by the RFCM3 algorithm at the optimum cut-off value of 0.7, many of the MTIs are found to be biologically relevant in different studies. Some of the interactions from different modules are discussed below.

• As suggested in [41], inflammatory molecules may be up-regulated in human intracranial aneurysms in response to a decrease in regulatory miRNAs such as hsa-mir-204, which is down-regulated in such tissues and is validated to target CCR5.
• Using a miRNA-target screening system composed of a self-assembled cell microarray (SAMcell), hsa-miR-204 was identified to regulate Nox2 (CYBB) expression and its downstream products in both human and mouse macrophages [42].
• Elucidation of transcriptome-wide microRNA binding sites in human cardiac tissues by Ago2 HITS-CLIP revealed that hsa-mir-296 targets RPRD2 [43].
• The interaction of hsa-mir-106a with three genes, namely REST, ATXN7L3B, and NR2C2, is supported by the photoactivatable-ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP) method [44], [45], and its interaction with SMG1 is supported by mapping the human miRNA interactome by CLASH [46].
• Elucidation of transcriptome-wide microRNA binding sites in human cardiac tissues by Ago2 HITS-CLIP revealed that hsa-mir-106b targets PRKCB [43].
• TP53-mediated regulation of AGO2-miRNA interaction represents a mechanism of miRNA regulation in carcinogenesis, which indicates that hsa-mir-93 targets TSKU [47].
• The results of the study [48] suggest that hsa-let-7g may suppress hepatocellular carcinoma (HCC) metastasis partially through targeting COL1A2; moreover, addition of COL1A2 counteracts the inhibitory effect of hsa-let-7g.
• Mapping the human miRNA interactome by CLASH reveals that hsa-let-7b targets KLHL11 and SMG1 by noncanonical binding [46].
• In the study [49], the impact of hsa-mir-34a on pro-apoptotic/anti-apoptotic genes was examined using a PCR array, which revealed BIRC6 as one of its targets.
• Random or site-specific incorporation of photoactivatable nucleoside analogs into RNA in vitro has been used to probe RBP and RNP-RNA interactions, along with in vivo crosslinking of RNA-protein complexes that were later isolated by immunoprecipitation in the study [47]. This helped to validate the interactions between hsa-mir-20a and RACGAP1, hsa-mir-107 and CCNT1, hsa-mir-144 and ATXN1L, and hsa-mir-130b and ATE1.
• With the help of PAR-CLIP technology, which allows the direct and transcriptome-wide identification of miRNA targets, the target sites for all the viral and cellular miRNAs expressed in PEL cell lines were identified in the study [50], which revealed the interaction between hsa-mir-107 and ERN1.
• The technique applied in the study [46] for ligation and sequencing of miRNA-target duplexes associated with human AGO1 reported data sets of more than 18,000 high-confidence miRNA-mRNA interactions. These include some of the interactions obtained by the proposed method, such as the interactions of hsa-mir-222 with three genes, namely RPL8, RPL12 and RPS2, and the interactions of hsa-mir-221 with two genes, namely RPS24 and RPLP0.

In addition to the interactions discussed above, other such validated miRNA-gene interactions are listed in Table 5 along with the experimental methods used for their validation, the miRTarBase ID, and suitable references.

TABLE 5
Experimental validation of some of the miRNA-mRNA interactions obtained by RFCM3

miRTarBase ID  miRNA         Target Gene  Experiments                            PMID
MIRT641338     hsa-mir-204   CCR5         HITS-CLIP                              23824327
MIRT735086     hsa-mir-204   CYBB         ELISA/Immunohistochemistry/qRT-PCR     28235791
MIRT784378     hsa-mir-296   RPRD2        HITS-CLIP                              27418678
MIRT048317     hsa-mir-106a  SMG1         CLASH                                  23622248
MIRT213205     hsa-mir-106a  REST         PAR-CLIP                               23446348
MIRT397686     hsa-mir-106a  ATXN7L3B     PAR-CLIP                               23592263
MIRT522196     hsa-mir-106a  NR2C2        PAR-CLIP                               23446348
MIRT608980     hsa-mir-106b  PRKCB        HITS-CLIP                              24906430
MIRT246961     hsa-mir-93    TSKU         PAR-CLIP                               20371350
MIRT000472     hsa-let-7g    COL1A2       Luciferase reporter assay/Western blot 20338660
MIRT004956     hsa-let-7b    BIRC6        qRT-PCR                                17942906
MIRT052056     hsa-let-7b    KLHL11       CLASH                                  23622248
MIRT052316     hsa-let-7b    SMG1         CLASH                                  23622248
MIRT259476     hsa-let-7b    TXLNG        PAR-CLIP                               21572407
MIRT047374     hsa-mir-34a   BIRC6        CLASH                                  23622248
MIRT440369     hsa-mir-20a   KIF23        HITS-CLIP                              22473208
MIRT566145     hsa-mir-20a   RACGAP1      PAR-CLIP                               20371350
MIRT178476     hsa-mir-107   LCOR         PAR-CLIP                               21572407
MIRT441505     hsa-mir-107   ERN1         PAR-CLIP                               22100165
MIRT564315     hsa-mir-107   CCNT1        PAR-CLIP                               20371350
MIRT043432     hsa-mir-331   AURKAIP1     CLASH                                  23622248
MIRT196621     hsa-mir-301a  TAOK1        PAR-CLIP                               23592263
MIRT544109     hsa-mir-301a  IPMK         PAR-CLIP                               21572407
MIRT044901     hsa-mir-188   SBNO1        PAR-CLIP                               23592263
MIRT494770     hsa-mir-144   AP1G1        PAR-CLIP                               21572407
MIRT549248     hsa-mir-144   ATXN1L       PAR-CLIP                               20371350
MIRT046643     hsa-mir-222   RPL8         CLASH                                  23622248
MIRT046650     hsa-mir-222   RPL12        CLASH                                  23622248
MIRT046672     hsa-mir-222   RPS2         CLASH                                  23622248
MIRT046852     hsa-mir-221   RPS24        CLASH                                  23622248
MIRT046937     hsa-mir-221   RPLP0        CLASH                                  23622248
MIRT020247     hsa-mir-130b  ATE1         Sequencing                             20371350
MIRT196622     hsa-mir-130b  TAOK1        PAR-CLIP                               23592263
MIRT544110     hsa-mir-130b  IPMK         PAR-CLIP                               21572407

Note: HITS-CLIP: high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation; PAR-CLIP: photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation; ELISA: enzyme-linked immunosorbent assay; qRT-PCR: quantitative real time polymerase chain reaction; CLASH: crosslinking, ligation, and sequencing of hybrids.

An extensive literature search also showed that all 12 miRNAs present in the most significant module generated by RFCM3 regulate their target genes in some type of cancer, and that 7 of them regulate their target genes specifically in cervical cancer. Table 6 gives detailed information about these 12 miRNAs, their target genes in the specific cancer, and the type of regulation, along with the PubMed ID (PMID) of the study.

TABLE 6
miRNA Expression in Various Studies

miRNA         Regulation (UP/DOWN)  Target genes  Cancer type   PMID
hsa-mir-106b  DOWN                  DAB2          Cervical      28498390
hsa-mir-135b  DOWN                  FOXO1         Cervical      26617737
hsa-mir-214   UP                    TP53          Cervical      29023799
hsa-mir-95    UP                    PTEN          Lung          25831148
hsa-mir-93    UP                    RAB11FIP1     Cervical      27279231
hsa-mir-106a  DOWN                  FBX031        Breast        28500896
hsa-mir-34b   DOWN                  P53           Cervical      26619844
hsa-mir-34c   DOWN                  P53           Cervical      26619844
hsa-mir-25    DOWN                  p27           Osteosarcoma  24859599
hsa-mir-191   UP                    -             Cervical      22330141
hsa-mir-32    UP                    PHLPP2        Breast        26276160
hsa-mir-132   DOWN                  ANO1          Colorectal    26868958

5 CONCLUSION AND FUTURE DIRECTION

By representing the molecular interactions underlying biological processes, network biology paves the path to drug discovery, better understanding of disease mechanisms, and cancer therapeutics [51]. miRNAs are mostly 21-23 nucleotide-long non-coding RNAs. These biological units, with low complexity, easy detection and high stability, can regulate gene expression at the translational level and also by mRNA degradation [6], [7]. miRNAs play fundamental roles in differentiation and development. They are also involved in the biological mechanisms underlying tumorigenesis, which makes them promising biomarkers for several types of cancer. Discovery and accurate characterization of miRNA-mRNA regulatory modules in cervical cancer has become feasible due to the availability of miRNA-miRNA functional similarity and of miRNA and mRNA expression profiles from the same patients.

In this article, an algorithm named relevant and functionally consistent miRNA-mRNA modules (RFCM3) has been proposed. It uses mutual information to identify relationships between multiple mRNAs and a miRNA and generates a star-shaped module containing a single miRNA and multiple mRNAs. The algorithm maximizes the relatedness between the miRNA and the genes and, at the same time, maximizes the functional similarity between the genes of that module. miRNAs with similar functions are most often associated with similar diseases. This relationship can be represented by a directed acyclic graph (DAG). Based on this DAG, a miRNA-miRNA Similarity Matrix (MISIM) has been created. This matrix was used to filter out the insignificant modules. The functional enrichment analysis showed that the 22 modules generated at a functional similarity cut-off >= 0.7 were highly related to cervical cancer processes. The performance of the proposed RFCM3 algorithm was compared with that of Mirsynergy [23], SNMNMF [14], weighted correlation network analysis (WGCNA) [34], and MAGIA (PITA algorithm) [28]. Comparing the module sizes and connectivities, it was found that the proposed method outperforms the other methods. Similarly, functional enrichment analysis showed that the genes and miRNAs of the modules generated by the proposed algorithm were highly related to the cervical cancer pathway, and the proposed method produced a significantly larger number of distinct pathway terms, supporting the functional consistency of the modules. Survival analysis also indicated that the selected genes from the module of the proposed method were highly correlated with patient survival and may have better prognostic value. Using an integrative approach for automatic detection, the proposed algorithm generated more significant regulatory modules that are highly related to cervical cancer, which Mirsynergy and the other methods were unable to do. The regulatory modules obtained by the proposed method may be helpful for understanding the underlying etiology of cervical cancer. Nonetheless, the performance of RFCM3 is sensitive to the availability of miRNA-miRNA interaction information for different biological conditions.

Although miRNAs play crucial roles in regulating multiple target genes that are involved in different oncogenic pathways, utmost care is needed because each gene is also potentially targeted by several miRNAs. Therefore, there is a dire need for a deeper understanding of miRNA-mRNA regulatory biology, and experimental practices have an obvious role in improving the fidelity of computational results. Hence, it is concluded that utilizing the dual information of miRNA and mRNA expression levels in cancers can help to identify underlying mechanisms and shed more light on the molecular underpinnings of different cancers. The obtained results indicate that the identified miRNA-mRNA modules covered a wide range of known functions, mainly signaling pathways and biosynthesis implicated in cervical cancer. The proposed approach could be applied to other cancer data.


ACKNOWLEDGMENT

This work is partially supported by the seed grant program of the Indian Institute of Technology Jodhpur, India (grant no. I/SEED/SPU/20160010). The authors thank Prof. Nagasuma Chandra, Indian Institute of Science, Bangalore for constructive comments. The authors want to acknowledge Mr. Kuldeep Gujar, Indian Institute of Technology Madras, India for his contribution in implementing certain bioinformatics tools.

REFERENCES

[1] L. H. Ellenson and T. C. Wu, “Focus on Endometrial and Cervical Cancer,” Cancer Cell, vol. 5, no. 6, pp. 533–538, 2014.
[2] H. Yoshikawa, “Progress in the World and Challenges in Japan on HPV Vaccination for Cervical Cancer Prevention,” Gan To Kagaku Ryoho, vol. 37, no. 6, pp. 971–975, 2010.
[3] R. G. Ghebre, S. Grover, M. J. Xu, L. T. Chuang, and H. Simonds, “Cervical Cancer Control in HIV-infected Women: Past, Present and Future,” Gan To Kagaku Ryoho, vol. 21, pp. 101–108, 2017.
[4] W. P. Kloosterman and R. H. Plasterk, “The Diverse Functions of MicroRNAs in Animal Development and Disease,” Developmental Cell, vol. 11, no. 4, pp. 441–450, 2006.
[5] A. Huttenhofer and J. Vogel, “Experimental Approaches to Identify Non-coding RNAs,” Nucleic Acids Research, vol. 34, pp. 635–646, 2006.
[6] V. Ambros, “The Functions of Animal MicroRNAs,” Nature, vol. 431, no. 7006, pp. 350–355, 2017.
[7] D. P. Bartel, “MicroRNAs: Target Recognition and Regulatory Functions,” Cell, vol. 136, no. 2, pp. 215–233, 2009.
[8] H. K. Saini, A. J. Enright, and S. Griffiths-Jones, “Annotation of Mammalian Primary MicroRNAs,” BMC Genomics, vol. 9:564, 2008.
[9] R. C. Lee, R. L. Feinbaum, and V. Ambros, “The C. elegans Heterochronic Gene lin-4 Encodes Small RNAs with Antisense Complementarity to lin-14,” Cell, vol. 75, no. 5, pp. 843–854, 1993.
[10] B. Wightman, I. Ha, and G. Ruvkun, “Posttranscriptional Regulation of the Heterochronic Gene lin-14 by lin-4 Mediates Temporal Pattern Formation in C. elegans,” Cell, vol. 75, no. 5, pp. 855–862, 1993.
[11] X. Peng, Y. Li, K. Walters, E. Rosenzweig, S. Lederer, L. Aicher, S. Proll, and M. Katze, “Computational Identification of Hepatitis C Virus Associated Microrna-mrna Regulatory Modules in Human Livers,” BMC Genomics, vol. 10, no. 373, 2009.
[12] O. Kent and J. Mendell, “A Small Piece in the Cancer Puzzle: MicroRNAs as Tumor Suppressors and Oncogenes,” Oncogene, vol. 9:25, no. 46, pp. 6188–6196, 2006.
[13] L. Kannan, M. Ramos, A. Re, N. El-Hachem, Z. Safikhani, D. M. A. Gendoo, S. Davis, D. Gomez-Cabrero, R. Castelo, K. D. Hansen, V. Carey, A. C. C. M. Morgan, B. Haibe-Kains, and L. Waldron, “Public Data and Open Source Tools for Multi-assay Genomic Investigation of Disease,” Brief Bioinformatics, vol. 17, no. 4, pp. 603–615, 2015.
[14] S. Zhang, Q. Li, J. Liu, and X. Zhou, “A Novel Computational Framework for Simultaneous Integration of Multiple Types of Genomic Data to Identify Microrna-gene Regulatory Modules,” Bioinformatics, vol. 27, no. 13, pp. i401–9, 2011.
[15] T. C. G. A. R. Network, “Comprehensive Genomic Characterization Defines Human Glioblastoma Genes and Core Pathways,” Nature, vol. 455, no. 7216, pp. 1061–1068, 2008.
[16] P. S. Chen, J. L. Su, and M. C. Hung, “Dysregulation of MicroRNAs in Cancer,” Journal of Biomedical Science, vol. 19:90, no. 1, 2012.
[17] D. Jin and H. Lee, “A Computational Approach to Identifying Gene-microRNA Modules in Cancer,” PLoS Computational Biology, vol. 11, no. 1, p. e1004042, 2015.
[18] B. P. Lewis, C. B. Burge, and D. P. Bartel, “Conserved Seed Pairing, Often Flanked by Adenosines, Indicates that Thousands of Human Genes are MicroRNA Targets,” Cell, vol. 120, pp. 15–20, 2004.
[19] C. G. A. R. Network, “Integrated Genomic Analyses of Ovarian Carcinoma,” Nature, vol. 474, no. 7353, pp. 769–773, 2005.
[20] A. Krek, D. Grün, M. Poy, R. Wolf, L. Rosenberg, J. Epstein, P. MacMenamin, I. da Piedade, C. Gunsalus, M. Stoffel, and N. Rajewsky, “Combinatorial MicroRNA Target Predictions,” Nature genetics, vol. 37, pp. 495–500, 2005.
[21] D. Szklarczyk, A. Franceschini, S. Wyde, K. Forslund, D. Helle, J. Huerta-Cepas, M. Simonovic, A. Roth, A. Santo, K. P. Tsafou, M. Kuhn, P. Bork, L. J. Jensen, and C. V. Mering, “STRING v10: Protein-protein Interaction Networks, Integrated over the Tree of Life,” Nucleic Acids Research, vol. 43, no. Database Issue, pp. D447–D452, 2015.
[22] D. W. Huang, B. T. Sherman, and R. A. Lempicki, “Systematic and Integrative Analysis of Large Gene Lists using DAVID Bioinformatics Resources,” Nature Protocols, vol. 4, no. 1, pp. 44–57, 2009.
[23] L. Yue, L. Cheng, W. Ka-Chun, L. Jiawei, and Z. Zhaolei, “Mirsynergy: Detecting Synergistic miRNA Regulatory Modules by Overlapping Neighbourhood Expansion,” Bioinformatics, vol. 30, pp. 2627–2635, 2014.
[24] R. C. Friedman, K. K. Farh, C. B. Burge, and D. P. Bartel, “Most Mammalian mRNAs are Conserved Targets of MicroRNAs,” Genome Research, vol. 9, no. 1, pp. 92–105, 2009.
[25] E. Wingender, X. Chen, R. Hehl, H. Karas, I. Liebich, V. Matys, T. Meinhardt, M. Prüss, and I. Reuter, “TRANSFAC: An Integrated System for Gene Expression Regulation,” Nucleic acids research, vol. 28, no. 1, pp. 316–319, 2000.
[26] C. Stark, B. J. Breitkreutz, A. C. Aryamontri, L. Boucher, R. Oughtred, M. S. Livstone, J. Nixon, K. V. Auken, X. Wang, X. Shi, T. Reguly, J. M. Rust, A. Winter, K. Dolinski, and M. Tyers, “The BioGRID Interaction Database: 2011 update,” Nucleic acids research, vol. 39, pp. D698–704, 2011.
[27] A. Bossi and B. Lehner, “Tissue Specificity and the Human Protein Interaction Network,” Molecular Systems Biology, vol. 5, no. 260, 2009.
[28] G. Sales, A. Coppe, A. Bisognin, M. Biasiolo, S. Bortoluzzi, and C. Romualdi, “MAGIA, a Web-based tool for miRNA and Genes Integrated Analysis,” Nucleic Acids Research, vol. 38, no. Web Server issue, pp. W352–9, 2010.
[29] A. Krek, D. Grün, M. N. Poy, R. Wolf, L. Rosenberg, E. J. Epstein, P. MacMenamin, I. da Piedade, K. C. Gunsalus, M. Stoffel, and N. Rajewsky, “Combinatorial MicroRNA Target Predictions,” Nature Genetics, vol. 37, pp. 495–500, 2005.
[30] M. Kertesz, N. Iovino, U. Unnerstall, U. Gaul, and E. Segal, “The Role of Site Accessibility in MicroRNA Target Recognition,” Nature Genetics, vol. 39, pp. 1278–1284, 2007.
[31] D. Wang, J. Wang, M. Lu, F. Song, and Q. Cui, “Inferring the Human MicroRNA Functional Similarity and Functional Network Based on MicroRNA-Associated Diseases,” Bioinformatics, vol. 26, no. 13, pp. 1644–1650, 2010.
[32] P. Maji, “f-Information Measures for Efficient Selection of Discriminative Genes from Microarray Data,” IEEE Transactions on System, Man and Cybernetics, Part C, Applications and Reviews, vol. 56, no. 04, pp. 1063–1069, 2009.
[33] P. Maji and S. Paul, “Rough Sets for Selection of Molecular Descriptors to Predict Biological Activity of Molecules,” IEEE Transactions on System, Man and Cybernetics, Part C, Applications and Reviews, vol. 40, no. 06, pp. 639–648, 2010.
[34] P. Langfelder and S. Horvath, “WGCNA: An R package for Weighted Correlation Network Analysis,” BMC Bioinformatics, vol. 9, no. 559, 2009.
[35] J. Wang, M. Lu, C. Qiu, and Q. Cui, “TransmiR: A Transcription Factor-microRNA Regulation Database,” Nucleic Acids Research, vol. 38, no. Database Issue, pp. D119–22, 2010.
[36] D. W. Huang, B. T. Sherman, and R. A. Lempicki, “Bioinformatics Enrichment Tools: Paths Toward the Comprehensive Functional Analysis of Large Gene lists,” Nucleic Acids Research, vol. 37, no. 1, pp. 1–13, 2009.
[37] P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker, “Cytoscape: a Software Environment for Integrated Models of Biomolecular Interaction Networks,” Genome Research, vol. 13, no. 11, pp. 2498–504, 2003.
[38] W. Luo and C. Brouwer, “Pathview: an R/Bioconductor Package for Pathway-based Data Integration and Visualization,” Bioinformatics, vol. 29, no. 14, pp. 1830–1831, 2013.
[39] I. S. Vlachosa, N. Kostoulas, T. Vergoulis, G. Georgakilas, M. Reczko, M. Maragkakis, M. D. Paraskevopoulou, K. Prionidis, T. Dalamagas, and A. G. Hatzigeorgiou, “DIANA miRPath v.2.0: Investigating The Combinatorial Effect of MicroRNAs in Pathways,” Nucleic Acids Research, vol. 40, no. Web Server issue, pp. W498–504, 2012.
[40] R. A. Gamboa, H. Gomez-Rueda, E. Martínez-Ledesma, A. Martínez-Torteya, R. Chacolla-Huaringa, A. Rodriguez-Barrientos, J. G. Tamez-Pena, and V. Trevino, “SurvExpress: An Online Biomarker Validation Tool and Database for Cancer Gene Expression Data Using Survival Analysis,” PLoS One, vol. 8, no. 9, p. e74250, 2013.
[41] M. Holcomb, Y. H. Ding, D. Dai, R. J. McDonald, J. S. McDonald, D. F. Kallmes, and R. Kadirvel, “RNA-Sequencing Analysis of Messenger RNA/MicroRNA in Rabbit Aneurysm Model Identifies Pathways and Genes of Interest,” American Journal of Neuroradiology, vol. 36, no. 9, pp. 1710–1715, 2015.
[42] J. Yang, M. E. Brown, H. Zhang, M. Martinez, Z. Zhao, S. Bhutani, S. Yin, D. Trac, J. Jeff, and M. E. Davis, “High-throughput Screening Identifies MicroRNAs that Target Nox2 and Improve Function after Acute Myocardial Infarction,” American Journal of Physiology-Heart and Circulatory Physiology, vol. 312, no. 5, pp. H1002–H1012, 2017.
[43] R. M. Spengler, X. Zhang, C. Cheng, J. M. McLendon, J. M. Skeie, F. L. Johnson, B. L. Davidson, and R. L. Boudreau, “Elucidation of Transcriptome-wide MicroRNA Binding Sites in Human Cardiac Tissues by Ago2 HITS-CLIP,” Nucleic Acids Research, vol. 44, no. 15, pp. 7120–31, 2016.
[44] S. Memczak, M. Jens, A. Elefsinioti, F. Torti, J. Krueger, A. Rybak, L. Maier, S. D. Mackowiak, L. H. Gregersen, M. Munschauer, A. Loewer, U. Ziebold, M. Landthaler, C. Kocks, F. le Noble, and N. Rajewsky, “Circular RNAs are a Large Class of Animal RNAs with Regulatory Potency,” Nature, vol. 495, no. 7441, pp. 333–8, 2013.

Sushmita Paul received the BSc, MSc, and PhD degrees from University of Rajasthan, Banasthali Vidyapith, and University of Calcutta, respectively. After obtaining her PhD degree in January 2014, she was associated with Indian Statistical Institute, Kolkata as Visiting Scientist (January 2014 to July 2014). Later, she joined the University Hospital Erlangen, Germany as a Postdoctoral Research Fellow (August 2014 to December 2015). She also worked as Scientist at TCS Innovation Labs Kolkata, India (December 2015 to May 2016). Currently, she is an assistant professor in the Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, India. Her research interests include computational biology and bioinformatics, pattern recognition, soft computing, and so forth. She has published more than 30 papers in international journals and conferences, and 2 book chapters. She is a co-author of a book published by Springer-Verlag, London, and also a reviewer of many international journals. Dr. Paul was a recipient of the 2017 Early Career Research Award from the Science and Engineering Research Board, Department of Science and Technology, Government of India, the 2017 Bioclues Innovation, Research and Development (BIRD) award, and the 2009 Best Paper Award of the International Conference on Information Technology from the Orissa Information Technology Society, India.

[45] A. W. Whisnant, H. P. Bogerd, O. Flores, P. Ho, J. G. Powers,
N. Sharova, M. Stevenson, C. H. Chen, and B. R. Cullen, “In-depth
Analysis of the Interaction of HIV-1 with Cellular MicroRNA Bio-
genesis and Effector Mechanisms,” mBio, vol. 4, no. 2, p. e000193, Madhumita Madhumita received the BSc in
2013. Botany, Zoology and Chemistry and MSc in
[46] A. Helwak, G. Kudla, T. Dudnakova, and D. T. D, “Mapping Life Sciences with specialization in Bioinfor-
the Human miRNA Interactome by CLASH Reveals Frequent matics from Regional Institute of Education,
Noncanonical Binding,” Cell, vol. 153, no. 3, pp. 654–65, 2013. Bhubaneswar, India and Central University of
[47] M. Hafner, M. Landthaler, L. Burger, M. Khorshid, J. Hausser, Punjab, Bathinda, India, respectively. Currently,
P. Berninger, A. Rothballer, M. J. Ascano, A. C. Jungkamp, M. Mun- she is a PhD scholar at Department of Bio-
schauer, A. Ulrich, G. S. Wardle, S. Dewell, M. Zavolan, and science and Bioengineering, Indian Institute of
T. Tuschl, “Transcriptome-wide Identification of RNA-binding Technology Jodhpur, India. Her research inter-
Protein and MicroRNA Target Sites by PAR-CLIP,” Cell, vol. 141, ests include computational biology and bioinfor-
no. 1, pp. 129–41, 2010. matics, pattern recognition, soft computing, and
[48] J. Ji, L. Zhao, A. Budhu, M. Forgues, H. L. Jia, L. X. Qin, Q. H. Ye, so forth. Madhumita is a recipient of Gold Medal from Central University
J. Yu, X. Shi, Z. Y. Tang, and X. W. Wang, “Let-7g Targets Collagen of Punjab, Bathinda, India for outstanding performance during her MSc
Type I Alpha2 and Inhibits Cell Migration in Hepatocellular in the year 2017.
Carcinoma,” Journal of Hepatology, vol. 52, no. 5, pp. 690–7, 2010.
[49] R. A. Yacoub, I. O. Fawzy, R. A. Assal, K. A. Hosny, A. N. Zekri,
G. Esmat, H. M. E. Tayebi, and A. I. Abdelaziz, “miR-34a: Multiple
Opposing Targets and One Destiny in Hepatocellular Carcinoma,”
Journal of Clinical and Translational Hepatology, vol. 4, no. 4, pp. 300–
305, 2016.
[50] E. Gottwein, D. L. Corcoran, N. Mukherjee, R. L. Skalsky,
M. Hafner, J. D. Nusbaum, P. Shamulailatpam, C. L. Love, S. S.
Dave, T. Tuschl, U. Ohler, and B. R. Cullen, “Viral MicroRNA
Targetome of KSHV-infected Primary Effusion Lymphoma Cell
Lines,” Cell Host Microbe, vol. 10, no. 5, pp. 515–26, 2011.
[51] I. Koturbash, F. J. Zemp, I. Pogribny, and O. lgaKovalchuk, “Small
Molecules with Big Effects: The Role of the MicroRNAome in
Cancer and Carcinogenesis,” Mutation Research, vol. 722, no. 2, pp.
94–105, 2011.