J Mol Model DOI 10.1007/s00894-011-1018-3

ORIGINAL PAPER

Homology modeling, molecular dynamics, e-pharmacophore mapping and docking study of Chikungunya virus nsP2 protease
Kh. Dhanachandra Singh & Palani Kirubakaran & Shanthi Nagarajan & Sugunadevi Sakkiah & Karthikeyan Muthusamy & Devadasan Velmurgan & Jeyaraman Jeyakanthan

Received: 3 November 2010 / Accepted: 9 February 2011 © Springer-Verlag 2011

Abstract To date, no suitable vaccine or specific antiviral drug is available to treat Chikungunya viral (CHIKV) fever. Hence, it is essential to identify drug candidates that could potentially impede CHIKV infection. Here, we present the development of a homology model of the nsP2 protein based on the crystal structure of the nsP2 protein of Venezuelan equine encephalitis virus (VEEV). The modeled protein was optimized using molecular dynamics simulation; the junction peptides of the nonstructural protein complex were then docked in order to investigate the possible protein–protein interactions between nsP2 and the proteins cleaved by nsP2. The modeling studies shed light on the binding modes, and the critical interactions with the peptides provide insight into the chemical features needed to inhibit CHIKV infection. Energy-optimized pharmacophore mapping was performed using the junction peptides. Based on the results, we propose the pharmacophore features that must be present in an inhibitor of nsP2 protease. The resulting pharmacophore model contained an aromatic ring, a hydrophobic group and three hydrogen-bond donor sites. Using these pharmacophore features, we screened large public compound libraries (Asinex, Maybridge, TOSLab, Binding Database) to find potential ligands that could inhibit the nsP2 protein. The compounds that yielded a fitness score of more than 1.0 were further subjected to Glide HTVS and Glide XP. Here, we report the best four compounds based on their docking scores; these compounds have the IDs 27943, 21362, ASN 01107557 and ASN 01541696. We propose that these compounds could bind to the active site of nsP2 protease and inhibit this enzyme. Furthermore, the backbone structural scaffolds of these four lead compounds could serve as building blocks when designing drug-like molecules for the treatment of Chikungunya viral fever.

Keywords Chikungunya virus · Homology modeling · Desmond · Nonstructural proteins (nsP) · Molecular dynamics simulation (MDS)

Kh. Dhanachandra Singh, Palani Kirubakaran, and Shanthi Nagarajan contributed equally to this work.
Electronic supplementary material The online version of this article (doi:10.1007/s00894-011-1018-3) contains supplementary material, which is available to authorized users.
K. D. Singh · P. Kirubakaran · K. Muthusamy · J. Jeyakanthan (*) Department of Bioinformatics, Alagappa University, Karaikudi 63003, India e-mail: jjkanthan@gmail.com
S. Nagarajan Bioinformatics Centre, Pondicherry University, Pondicherry 605 014, India
S. Sakkiah Bioinformatics Centre and Computational Biology Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore 641 046, India
D. Velmurgan CAS, Biophysics and Crystallography, University of Madras, Chennai 600025, India e-mail: d_velu@yahoo.com

Introduction

Chikungunya virus (CHIKV) is a member of the genus Alphavirus of the family Togaviridae. While CHIKV is transmitted to humans by several species of mosquitoes, Aedes aegypti and Aedes albopictus are the two main vectors [1]. In Africa and Asia, two outbreaks of Chikungunya virus have occurred, with time intervals of 7–8 years to 20 years between consecutive epidemics [2]. In recent years, emerging and re-emerging tropical infectious diseases have been shown to have high social and economic impacts [3]. The symptoms of CHIKV infection are fever, headache, nausea, vomiting, myalgia, rash, and arthralgia [4]. Infected patients are observed to have painful, puffy feet and ankles, and they experience predominantly chronic polyarthralgia, typically a rheumatoid arthritis-like illness [5]. Recurrent episodes can occur in some patients, generally in the form of chronic and persistent arthralgia. The CHIKV genome consists of a linear, positive-sense, single-stranded RNA molecule of about 11.8 kb [6]. Alphaviruses produce two mRNAs after infection: genomic (49S) RNA, which is translated into nonstructural (replicase) proteins, and subgenomic (26S) RNA, which serves as mRNA for virion structural proteins. The nonstructural proteins (nsP) nsP1, nsP2, nsP3 and nsP4 are formed by proteolytic cleavage of a long polyprotein that includes the individual nonstructural proteins [7]. Alphaviruses are enveloped particles, and their genome consists of a single-stranded positive-sense RNA molecule of approximately 12,000 nucleotides. The genome of CHIKV is considered to have the following gene arrangement: 5′ cap-nsP1-nsP2-nsP3-nsP4-(junction region)-C-E3-E2-6K-E1-poly(A) 3′ [8]. Coding sequences of CHIKV from Indian Ocean patients consisted of two open reading frames (ORFs) encoding the nonstructural proteins (consisting of nsP1, 2, 3 and 4) in the form of two polyprotein precursors in the 5′ two-thirds of the genome, and the other one-third containing one ORF encoding the structural polyprotein (consisting of C, E3, E2 and E1) [9]. The alphavirus nsP2 protease domain belongs to the papain superfamily of cysteine proteases.
The proteolytic function of Semliki Forest virus (an alphavirus) nsP2 has been mapped to its C-terminal domain [10]. The nsP2 contains the protease domain responsible for nsP maturation [11]; hence, it is clear that nsP2 is essential for alphavirus replication and exhibits some degree of sequence specificity among alphaviruses [12], making it an attractive target for broad-spectrum antiviral inhibitor development. The nsP2 (799 amino acids) protein from Semliki Forest virus and Sindbis virus (SINV) specifically cleaves the β,γ-triphosphate bond at the 5′ end of RNA. This activity is restricted to the N-terminal domain, and the C-terminal domain has no RNA triphosphatase activity [13]. It is necessary to understand the structural aspects of CHIKV nsP2 and other alphavirus nsP2 proteins in order to develop broad-spectrum viral inhibitors. In this report, we explain the structural features of the nsP2 protein, which is modeled based on the crystal structure of the nsP2 protease of Venezuelan equine encephalitis virus (VEEV, PDB ID: 2HWK).

Molecular dynamics (MD) simulations provide a tool for studying biological problems that is complementary to experimental techniques [14]. The modeled nsP2 protein structures were refined using MD techniques, and exhaustive docking analysis was performed using induced fit docking [15, 16], which is a robust automated docking method that predicts the conformations of flexible ligands bound to macromolecular targets. Three peptides, (A) Gly1-Ala2-Gly3-Ile4, (B) Gly1-Cys2-Ala3-Pro4 and (C) Gly1-Gly2-Trp3-Ile4, were docked. The plausible docking poses obtained provide a great deal of information about the protein-cleaving mechanism of nsP2 protease. Docking three different peptides allowed us to identify the critical residues in nsP2 that are responsible for nonstructural protein cleavage. We propose five different pharmacophore sites based on the receptor–peptide interactions. The structural and pharmacophore information can be effectively used for broad-spectrum inhibitor identification, which, Chikungunya virus aside, should prove useful considering that alphavirus family candidates are being used as bioweapons.

Materials and methods

All computational analysis was carried out on a Red Hat 5.3 Linux platform running on a Lenovo PC with an Intel Core 2 Duo processor and 2 GB of RAM.

Homology model development

Homology modeling is a theoretical method that is used to predict the structure of a sequence with an accuracy that is comparable to the best results achieved experimentally. The quality of the modeled protein is extremely dependent on the identity between the target and template proteins. The nonstructural polyprotein (nsP) of Chikungunya virus (strain S27, African prototype) has 2,470 amino acids and contains four chains, namely nsP1 (1–535), nsP2 (536–1333), nsP3 (1334–1863), and nsP4 (1864–2474). The nsP2 protease is an essential protein whose proteolytic activity is critical for virus replication. However, no experimentally derived structure of this protein is available, so we decided to model the Chikungunya virus nsP2 protease using the structure of its closest homolog that has been solved by crystallography. The nsP2 protein sequence was collected from the Swiss-Prot Protein Database (accession number: Q8JUX6) [8, 17]. A similarity search for nsP2 protease in the Protein Data Bank (http://www.rcsb.org) was performed using the BLAST server [18]. The protein similarity search identified a very similar protein structure belonging to the nsP2 protease of the alphavirus Venezuelan equine encephalitis

(VEE) [19] (PDB ID: 2HWK), which has 40% sequence identity with CHIKV nsP2, so this structure was used as a template to generate the model. This was the only template available to model the nsP2 protein. The model was generated using Prime (Schrödinger, LLC, New York, USA) [20], and then the energy was minimized using the OPLS (optimized potentials for liquid simulations) 2005 force-field [21].

Model validation

The structure model obtained from Prime was validated by inspecting the phi/psi Ramachandran plot obtained from PROCHECK analysis [22]. The PROSA [23] test was applied to the final model to check energy criteria against the potential of mean force derived from a large set of known protein structures. To assess the reliability of the model, the root mean square deviation (RMSD) between the main-chain atoms of the model and the template was calculated in PyMol [24] by superimposing the structure of the template (2HWK) on the predicted structure of CHIKV nsP2 protease.

Molecular dynamics simulations

All molecular dynamics (MD) simulations were performed using the program Desmond [15], which uses a particular neutral territory method called the midpoint method [25] to efficiently exploit a high degree of computational parallelism. The OPLS 2005 force-field [21] was used to model the amino acid interactions in the protein, and the SPC (simple point charge) method [26] was used for the water model. Equilibration of the system was carried out using the default protocol provided in Desmond, which consists of a series of restrained minimizations and molecular dynamics simulations that are designed to slowly relax the system without deviating substantially from the initial protein coordinates. The initial coordinates for the MD calculations were taken from the modeled protein.
The SPC water molecules were then added (the orthorhombic dimensions of each water box were approximately 10 × 10 × 10 Å, which ensured that the whole surfaces of the complexes were covered), and the system was neutralized by adding Cl⁻ counterions to balance the net charge of the system. After the construction of the solvent environment, each complex system was composed of about 34,942 atoms. Before equilibration and the long production MD simulations, the systems were minimized and pre-equilibrated using the default relaxation routine implemented in Desmond. The whole system was simulated at 300 K for 5 ns, and the final conformations of the modeled protein are presented below. The structural changes and dynamic behavior of the protein were analyzed by calculating the RMSD and energy.
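
The RMSD analysis described above can be illustrated with a minimal Python sketch. It is not the Desmond analysis used in this work: it assumes that each frame has already been superimposed on the reference structure and that coordinates are given in Å, and the function names, toy coordinates and the plateau start time passed to `plateau_stats` are hypothetical.

```python
import math


def rmsd(ref, frame):
    """RMSD between two pre-superimposed coordinate lists of (x, y, z) tuples."""
    if len(ref) != len(frame) or not ref:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Sum of squared displacements over all atoms and all three axes.
    sq = sum((a - b) ** 2 for p, q in zip(ref, frame) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))


def plateau_stats(times_ps, rmsd_values, start_ps):
    """Mean and spread (max - min) of the RMSD for frames at or after start_ps."""
    tail = [r for t, r in zip(times_ps, rmsd_values) if t >= start_ps]
    if not tail:
        raise ValueError("no frames at or after start_ps")
    return sum(tail) / len(tail), max(tail) - min(tail)


# Hypothetical three-atom reference and a frame shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(rmsd(reference, frame))
```

Applying `plateau_stats` to the per-frame RMSD values gives a simple numerical check of whether the trajectory has settled, which complements visual inspection of the RMSD plot.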

Active site predictions

The active site of the modeled protein was investigated using the SiteMap program [27]. This software generates information on the binding site's characteristics using novel search and analytical facilities: a SiteMap calculation begins with an initial search step that identifies or characterizes, through the use of grid points, one or more regions on the protein surface that may be suitable for binding ligands to the receptor. Contour maps are then generated, producing hydrophobic and hydrophilic maps [28]. The hydrophilic maps are further divided into donor, acceptor, and metal-binding regions. The evaluation stage, which concludes the calculation, involves assessing each site by calculating various properties: the number of site points, a measure of the size of the site; exposure/enclosure, two properties providing different measures of how available the site is to the solvent; contact, which measures how strongly the average site point interacts with the surrounding receptor via van der Waals nonbonding interactions; donor/acceptor character, a property related to the sizes and intensities of H-donor and H-acceptor regions; and SiteScore, an overall property based on the previous properties, constructed and calibrated so that the average SiteScore for a promising binding site is 1.0 [28].

Induced fit docking simulation of peptides

To keep the receptor flexible in the docking protocol, we used a mixed molecular docking/dynamics protocol called induced fit docking (IFD) [29], as developed by Schrödinger, LLC (http://www.schrodinger.com/). The IFD protocol used in this study was carried out in three consecutive steps [30]. First, the ligand was docked into a rigid receptor model with scaled-down van der Waals (vdW) radii. A vdW scaling of 0.5 was used for both the protein and ligand nonpolar atoms.
A constrained energy minimization was carried out on the protein structure, keeping it close to the original crystal structure while removing bad steric contacts. Energy minimization was carried out using the OPLS 2005 force-field with an implicit solvation model until default criteria were met. The active site predicted in SiteMap was used to define the location of the binding site and the dimension of the energy grids for initial docking. The Glide XP mode was used for the initial docking, and 20 ligand poses were retained for protein structural refinement. In the second step, Prime was used to generate the induced-fit protein–ligand complexes. Each of the 20 structures from the previous step was subjected to side-chain and backbone refinements [30]. All residues with at least one atom located within 4.0 Å of each corresponding ligand pose were included in the Prime refinement. The refined complexes were ranked by Prime energy, and the receptor structures

within 30 kcal mol⁻¹ of the minimum energy structure were put through to a final round of Glide docking and scoring. In the final step, each ligand was redocked into every refined low-energy receptor structure produced in the second step using Glide XP at default settings. The binding modes of three different peptides that are believed to be cleaved by nsP2 were theoretically identified by docking calculations. The primary function of the nsP2 protease is to cleave the nonstructural protein complex into individual active proteins. No crystal structures or appropriate templates are available to model the nsP1 and nsP4 proteins, so we made use of junction peptide models to explain the protein–protein interactions. The nsP2 protease cleavage sites are reported in the Swiss-Prot database; the cleavage site information was obtained by means of similarity analysis carried out across alphavirus family members. In the database, two peptides are reported for every junction of nsP1–nsP2, nsP2–nsP3, and nsP3–nsP4. In the molecular docking studies, we used the junction peptides as substrates; moreover, we considered that the flanking residues provide access to the real system.

Energy-optimized pharmacophore mapping

Energy-optimized pharmacophores (e-pharmacophores) are obtained by mapping the energetic terms from the Glide XP scoring function onto atom centers. The ligand is docked with Glide XP and the pose is refined. The Glide XP scoring terms are computed, and the energies are mapped onto atoms. Next, pharmacophore sites are generated, and the Glide XP energies from the atoms that comprise each pharmacophore site are summed. The sites are then ranked based on these energies, and the most favorable sites are selected for the pharmacophore hypothesis. These pharmacophores are then used as queries for virtual screening [31].
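
The summing and ranking step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Phase or Glide implementation: the function name, the site labels and the per-atom energies are hypothetical, and more negative energies are assumed to be more favorable.

```python
def rank_sites(site_atom_energies, n_keep):
    """Sum per-atom energies (kcal/mol) for each site and keep the n_keep best.

    site_atom_energies maps a site label to the list of energy contributions
    of the atoms that make up that site.
    """
    totals = {label: sum(energies) for label, energies in site_atom_energies.items()}
    # More negative totals are treated as more favorable, so sort ascending.
    ranked = sorted(totals.items(), key=lambda item: item[1])
    return ranked[:n_keep]


# Hypothetical sites: one donor, one hydrophobe, one aromatic ring, one acceptor.
sites = {
    "D1": [-1.2, -0.8],
    "H1": [-0.3],
    "R1": [-0.5, -0.5, -0.4],
    "A1": [-0.1],
}
for label, energy in rank_sites(sites, 3):
    print(label, round(energy, 2))
```

Keeping the per-site totals alongside the labels makes it straightforward to apply an energy cutoff to the ranked list afterwards.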
Pharmacophore sites were automatically generated from the protein–ligand docked complex with Phase (Phase, v.3.0, Schrödinger, LLC) using the default set of six chemical features: hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobe (H), negative ionizable (N), positive ionizable (P), and aromatic ring (R). Phase treats most cationic groups as being exclusively positive ionizable. Hydrogen-bond acceptor sites were represented as vectors along the hydrogen bond axis in accordance with the hybridization of the acceptor atom. Hydrogen-bond donors were represented as projected points, located at the corresponding hydrogen-bond acceptor positions in the binding site. Projected points allow the possibility of structurally dissimilar active compounds forming hydrogen bonds to the same location, regardless of their point of origin and directionality [31]. Each pharmacophore feature site was first assigned an energetic value equal to the sum of the Glide XP

contributions of the atoms comprising the site. This allows sites to be quantified and ranked on the basis of these energetic terms. Glide XP descriptors include terms for hydrophobic enclosure, hydrophobically packed correlated hydrogen bonds, electrostatic rewards, π–π stacking, π–cation, and other interactions [32]. Sites where less than half of the heavy atoms contribute to the pharmacophore feature were excluded from the final hypothesis. Thus, if only two heavy atoms in a six-membered ring exhibit energetic interactions, the ring is not considered a pharmacophore feature [31].

e-Pharmacophore database screening

For the e-pharmacophore approach, explicit matching was required for the most energetically favorable site, provided that it scored better than −1.0 kcal mol⁻¹. Multiple sites were included in cases where more than one site had the top score. Screening molecules were required to match a minimum of 3 sites for a hypothesis with 3 or 4 sites and a minimum of 4 sites for a hypothesis with 5 or more sites. The distance matching tolerance was set to 2.0 Å as a balance between stringent and loose-fitting alignment. Screening of compounds was performed against Asinex (http://www.asinex.com), Maybridge (http://www.maybridge.com), TOSLab (http://www.toslab.com), Binding Database (http://www.bindingdb.org/bind/index.jsp), etc. Database hits were ranked in order of fitness score, a measure of how well the aligned ligand conformer matches the hypothesis based on RMSD site matching, vector alignments, and volume terms. The fitness scoring function is an equally weighted composite of these three terms and ranges from 0 to 3, as implemented in the default database screening of Phase. The ligands were selected based on the fitness score. The ligands with the best fitness scores were docked into the binding sites of the modeled protein [31].

ADME screening

The QikProp program [33] was used to obtain the ADME properties of the compounds. This predicts both physically significant descriptors and pharmaceutically relevant properties. All of the compounds were neutralized before being used by QikProp. The neutralizing step is essential, as QikProp is unable to neutralize a structure and no properties will be generated in the normal mode. The program was run in normal mode, and predicted 44 properties for the molecules, consisting of principal descriptors and physicochemical properties, along with a detailed analysis of log P (octanol/water), QP%, and log HERG. It also evaluated the acceptability of the compounds based on Lipinski's rule of five [34], which is essential for rational drug design.
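
As an illustration of the rule-of-five check mentioned above, the following minimal Python sketch counts violations using the commonly cited thresholds (molecular weight above 500, log P above 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). It is not the QikProp implementation; the function name and the descriptor values in the example are hypothetical.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of the four rule-of-five criteria a compound violates."""
    checks = [
        mol_weight > 500,   # molecular weight in g/mol
        log_p > 5,          # octanol/water partition coefficient
        h_donors > 5,       # number of hydrogen-bond donors
        h_acceptors > 10,   # number of hydrogen-bond acceptors
    ]
    return sum(checks)


# Hypothetical descriptor values for two illustrative compounds.
print(lipinski_violations(342.2, 0.5, 3, 9))
print(lipinski_violations(650.0, 6.2, 6, 12))
```

A compound is often regarded as drug-like when it shows at most one violation, so the returned count can be used directly as a filter criterion.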

Results and discussion

nsP2 model generation

The model was generated based on the nsP2 protease of Venezuelan equine encephalitis alphavirus (VEEV); this model has similar structural features to the template protein (Fig. 1). The RMSD between the minimized protein model and the template structure was found to be 1.1 Å (Fig. 2). The modeled protein was energy minimized using the OPLS 2005 force-field. Both domains are composed of α-helices and β-strands (Fig. 3). The N-terminus is dominated by an α-helix. The C-terminal domain contains helices and strands. The central β-sheets are flanked by α-helices. However, the function of the C-terminal domain of nsP2 is not clear [19]. The three-dimensional structure provides valuable insight into molecular function and also enables the protein–protein interaction to be analyzed.

Validation of the predicted structure

The overall stereochemical quality of the model was assessed by PROCHECK. The Ramachandran plot showed 81.6% of the residues in the most favorable region, 17.4% in the allowed region, 0.7% in the generously allowed region and 0.3% in the disallowed region; the corresponding values for the 2HWK template were 88.6%, 11.0%, 0.0% and 0.4%, respectively. These results revealed that the majority of the amino acids are in a phi-psi distribution that is consistent with a right-handed α-helix, and the model is reliable and of good quality (see Fig. 1 of the Electronic supplementary material, ESM). The G-factors, indicating the quality of the covalent, dihedral and overall bond angles, were 0.10 for dihedrals, 0.42 for covalent, and 0.11 overall. The overall main-chain and side-chain parameters, as evaluated by PROCHECK, are all very favorable. The Ramachandran plot characteristics and G-factors confirm the quality of the predicted model. In order to investigate whether the interaction energy of each residue with the remainder of the protein is negative, a second test was done to apply energy criteria using a ProSA energy plot. The ProSA analysis of the model showed that almost all of the residues had negative interaction energies, with very few residues displaying positive interaction energies, as shown in Fig. 2 of the ESM.

Fig. 1 Structure-based sequence alignment of VEEA_nsP2 (PDB ID: 2HWK) from Venezuelan equine encephalitis alphavirus nsP2 protease with the modeled Chikungunya virus nsP2 protease (CHIKV_nsP2). The secondary structural elements are shown above the alignment. Active site and peptide interaction residues are highlighted with black triangles

Fig. 2 Superimposition of template (magenta) and model protein (cyan)

Molecular dynamics simulations

Molecular dynamics can be used to explain protein structure–function problems, such as folding, conformational flexibility and structural stability. In the simulations, we monitored the backbone atoms and the C-α-helix of the modeled protein. The RMSD values of the modeled structure's backbone atoms were plotted as a time-dependent function of the MD simulation. The results support our modeled structure, as they show a stable RMSD throughout the whole simulation process. Graphs of potential energy, temperature, pressure and volume are shown in Fig. 3 of the ESM. The time dependence of the RMSD (Å) of the backbone atoms of the modeled protein during a 5 ns simulation is shown in Fig. 4. The graph clearly indicates that there is a change in the RMSD from 1.0 to 3.0 Å in the nsP2 homology model during the first 1500 ps, but after that it reaches a plateau. The RMSD values of the backbone atoms in the system tend to converge after 2000 ps, showing fluctuations of around 1 Å. The low RMSD and the simulation time indicate that, as expected, the 3D structural model of nsP2 represents a stable folding conformation.

Fig. 3 Ribbon diagram of the modeled Chikungunya virus nsP2 protease (CHIKV_nsP2) showing the N- and C-terminal domains. α-Helices, β-strands and loops are colored red, yellow and green, respectively

Fig. 4 RMSD of the backbone atoms of the modeled protein over a time period of 5 ns

Induced fit docking (IFD) results for peptides

Molecular docking simulation helps us to understand the plausible binding modes and interactions. In IFD, we are able to make the receptor flexible, which helps when refining the active sites. The active site of the protein lies in the C-terminal domain (Fig. 5). All three peptides interact with the C-terminal domain, except for the glycine of GAGI and the isoleucine of GGYI.

Gly534-Ala535-Gly536-Ile537 (GAGI) interaction

In total, GAGI forms six hydrogen bonds to the active site of the nsP2 protein (Fig. 6a). The Gly534 backbone nitrogen forms a trifurcated hydrogen bond with the oxygens of Ser1293, Glu1296 and Glu1157. However, in reality, the NH2 at the starting position can form only two interactions, as it is attached to the preceding residue and hence should be a backbone NH. Ala535 does not interact with the protein residues. The Gly536 backbone oxygen forms a hydrogen bond with the nitrogen of Gln1039. The Ile537 side-chain oxygen and nitrogen form hydrogen bonds with the oxygen and nitrogen of His1222 and Lys1239.

Gly1332-Cys1333-Ala1334-Pro1335 (GCAP) interaction

GCAP also forms six hydrogen bonds (Fig. 6b), and all of the peptide's residues form hydrogen bonds with the nsP2 protein. The NH2 of Gly1332 forms a trifurcated H-bond with the oxygens of Ser1293, Glu1157 and Glu1297. However, in reality it can form only two bonds, as explained above. The backbone oxygen and nitrogen of Cys1333 and Ala1334 form hydrogen bonds with the nitrogen and oxygen side chains of His1222. The side-chain oxygen of Pro1335 forms a hydrogen bond with the nitrogen of Lys1045.

Gly1862-Gly1863-Tyr1864-Ile1865 (GGYI) interaction

GGYI forms seven hydrogen bonds with the nsP2 protein (Fig. 6c). Gly1862 forms a trifurcated H-bond with the same residues, but only two H-bonds are possible in reality. The Tyr1864 backbone nitrogen and oxygen form hydrogen bonds with the oxygen of His1222 and nitrogen of Lys1239. The phenolic group of tyrosine forms a hydrogen bond with the oxygen of Gly1176. The backbone nitrogen of Ile1865 forms a hydrogen bond with the nitrogen of Lys1045. The Glide scores, Glide energies and IFD scores are shown in Table 1.

e-Pharmacophore development and screening of the database

The e-pharmacophore combines aspects of structure-based and ligand-based techniques. Incorporating protein–ligand contacts into ligand-based pharmacophore approaches has been shown to produce enhanced enrichments over using ligand information alone. The method described here attempts to take a step beyond simple contact scoring by

incorporating structural and energetic information using the scoring function in Glide XP. Seven pharmacophore sites were predicted, but only five were chosen based on their scores. The final hypothesis consists of a hydrophobic group (H), an aromatic ring (R), and three H-bond donors (D); the hypothesis and the distances between its sites are shown in Fig. 7a and b, respectively. The hydrophobic site H23 lies in the COOH group of Ile537 (GAGI), while the donor D10 lies in the phenol group of Tyr1864 (GGYI). The donor D12 lies in the amino group of Gly536 (GAGI), Cys1333 (GCAP) and Tyr1864 (GGYI), and the donor D16 lies in the amino group of Gly534 (GAGI) and Gly1862 (GGYI). These energetically favorable sites encompass the specific interactions of junction peptides and the nsP2 protein, and this information should prove helpful in the development of new nsP2 inhibitors. With this pharmacophore hypothesis, compound screening was performed against Asinex, Maybridge, Binding Database, TOSLab, etc.: a total of more than 300,000 compounds. To further enrich the screening, excluded volumes were added to the hypotheses. Receptor-based excluded volumes were included in order to help reduce false positives by eliminating inactive compounds that cannot simultaneously match the hypothesis and avoid clashing with the receptor. Compounds with fitness scores of more than 1.0 were subjected to Glide high-throughput virtual screening (HTVS).

Fig. 5 Electrostatic potential surface of the CHIKV_nsP2 and its active site pocket. The positively and negatively charged surface regions are shown in blue and red, respectively

Fig. 6 Binding modes of the junction peptides Gly534-Ala535-Gly536-Ile537 (a), Gly1332-Cys1333-Ala1334-Pro1335 (b) and Gly1862-Gly1863-Tyr1864-Ile1865 (c), based on induced fit docking results

Glide extra precision docking

Molecular docking is a computational technique that samples conformations of small compounds at protein binding sites; scoring functions are used to assess which of these conformations best complement the protein binding site. There are two main aspects to assessing the quality of docking methods: (i) docking accuracy, which recognizes the true binding mode of the ligand to the target protein, and (ii) screening enrichment, which measures how much better a docking method is at identifying true binding ligands than random screening. To speed up the screening of a large set of compound databases, we took the refined active site from IFD and kept this active site rigid while screening in HTVS and Glide XP docking. Twenty compounds were selected from the HTVS for further Glide XP docking study based on the Glide score. Here, we report the four compounds from this Glide XP docking study with the best Glide scores (−7.5 to −9.8) and

Table 1 The peptides used in the docking simulations and their corresponding Glide scores and Glide energies

Peptide                                  Junction    Glide score   Glide energy (kcal/mol)   IFD score a
Gly534-Ala535-Gly536-Ile537 (GAGI)       nsP1–nsP2   −7.538        −48.743                   −631.584
Gly1332-Cys1333-Ala1334-Pro1335 (GCAP)   nsP2–nsP3   −7.399        −46.500                   −631.359
Gly1862-Gly1863-Tyr1864-Ile1865 (GGYI)   nsP3–nsP4   −12.045       −59.956                   −635.896

Peptide   Interaction (D–H···A) b   H-bond length (Å)
GAGI      N–H···O(Ser1293)          1.830
GAGI      N–H···O(Glu1296)          1.833
GAGI      N–H···O(Glu1157)          1.649
GAGI      N–H(Gln1039)···O          2.008
GAGI      N–H(Lys1239)···O          2.073
GAGI      N–H···O(His1222)          1.828
GCAP      N–H···O(Ser1293)          1.983
GCAP      N–H···O(Glu1157)          1.643
GCAP      N–H···O(Glu1296)          2.037
GCAP      N–H(His1222)···O          1.914
GCAP      N–H···O(His1222)          1.712
GCAP      N–H(Lys1045)···O          2.107
GGYI      N–H···O(Ser1293)          2.454
GGYI      N–H···O(Glu1157)          1.699
GGYI      N–H···O(Glu1296)          2.138
GGYI      N–H···O(His1222)          1.820
GGYI      O–H(Gly1176)···O          1.931
GGYI      N–H(Lys1239)···O          2.228
GGYI      N–H(Lys1045)···N          2.039

a IFD score: induced fit docking score
b D donor, H hydrogen, A acceptor

Fig. 7 Common pharmacophore hypothesis (DDDHR) based on the alignment of junction peptides (a) and the distance between the pharmacophore sites (b). Geometry of the pharmacophore. Orange torus aromatic ring feature, green sphere hydrophobic feature, light blue sphere donor feature

Fig. 8 Structures of the four lead molecules along with their compound IDs and database names

Fig. 9 Binding modes of the four potential ligands to the active site of nsP2 protease. The compound database IDs of the lead molecules are as follows: 27943: Binding Database (a), 21362: TOSLab (b), ASN 01107557: Asinex (c), ASN 01541696: Asinex (d)

Glide energies (−29 to −49 kcal mol⁻¹), which suggest strong enzyme–ligand interactions. The chemical names of the four lead compounds and their corresponding database identity (ID) numbers are: [5-(5-fluoro-2,4-dioxopyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methyl phosphate (27943: Binding Database); 4-[hydroxy-(4-methylphenyl)methylidene]-5-(3-hydroxyphenyl)-1-(2-hydroxypropyl)pyrrolidine-2,3-dione (21362: TOSLab); N-(2-methyl-4-nitrophenyl)-2-[[5-(6-oxocyclohexa-2,4-dien-1-ylidene)-1,2-dihydro-1,2,4-triazol-3-yl]sulfanyl]acetamide (ASN 01107557: Asinex); and N-(5-ethyl-1,3,4-thiadiazol-2-yl)-2-[[5-(6-oxocyclohexa-2,4-dien-1-ylidene)-1,2-dihydro-1,2,4-triazol-3-yl]sulfanyl]acetamide (ASN 01541696: Asinex). The chemical structures of these lead molecules are illustrated in Fig. 8, and the binding modes of these four lead molecules and their interacting residues are shown in Fig. 9a–d and Table 2. The residues Lys1045, Gly1176, His1222 and Lys1239 are involved in ligand interactions and are also important residues for the peptide interaction. The nsP2 protease sequences available in UniProt were retrieved (UniProt IDs: Q8JUX6, D7R997, D7R977, D7R966, D7R942, D7R973, A6MH22, D7R926, D7R999, D7R979, A0FJ31, D7R944, D7R9A1, Q1H8W7, Q1EL94, C7G0W2, A6MH12, B4YIR3, A9LMA5, A5Y7Y4, D2KBP9, C7AE71, Q1W368, D2KBQ5, D2CY36 and C7ADY7) and multiple sequence alignment was performed in ClustalW. We found no changes in the active-site residues that we report. The multiple sequence alignment of the reported nsP2 protease sequences is shown in Fig. 4 of the ESM. Hence, the reported active site appears to be conserved across the available sequences.
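The conservation check described above can be illustrated with a minimal Python sketch. It assumes the alignment has already been produced (for example by ClustalW) and is available as equal-length gapped strings. The toy sequences and column indices below are hypothetical placeholders, not the real nsP2 sequences or active-site positions.

```python
def conserved_columns(aligned_seqs, columns):
    """Report, for each 0-based alignment column, whether every sequence
    carries the same residue there (a column containing a gap counts as
    not conserved)."""
    lengths = {len(seq) for seq in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = {}
    for col in columns:
        residues = {seq[col] for seq in aligned_seqs}
        result[col] = len(residues) == 1 and "-" not in residues
    return result


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "MKGHLA-K",
    "MKGHLS-K",
    "MRGHLA-K",
]
# Hypothetical columns standing in for active-site positions.
toy_site_columns = [1, 2, 3, 7]

print(conserved_columns(toy_alignment, toy_site_columns))
```

In this toy case, column 1 varies (K/K/R) while columns 2, 3 and 7 are identical in every sequence; applying the same test to the real alignment columns would correspond to the observation reported above.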

Predicted ADME properties

We analyzed 44 physically significant descriptors [33] and pharmaceutically relevant properties of the four lead compounds, including molecular weight, H-bond donors, H-bond acceptors, log P (octanol/water), P MDCK, log Kp (skin permeability), human oral absorption, and their positions according to Lipinski's rule of five (Table 3 and Table 4). Lipinski's rule of five is a rule of thumb to evaluate drug likeness; in other words, to determine if a

Table 2 Pharmacophore results for the best hit compounds, and Glide XP docking results

Lead molecule (a)   Align score   Fitness score   Glide score   Glide energy (kcal/mol)
27943               1.015         1.029           -9.601        -42.576
21362               0.927         1.046           -9.279        -48.690
ASN 01107557        0.981         1.170           -8.57         -32.436
ASN 01541696        1.240         1.009           -7.640        -29.999

Interaction (D-H...A) (b)   H-bond length (Å)
O-H...O(His1222)            1.982
O-H...O(His1222)            1.932
N-H(Lys1239)...O            1.766
O-H...O(Gly1176)            1.974
N-H(His1222)...O            2.161
O-H...O(His1222)            1.621
N-H(Lys1239)...O            1.706
N-H(Lys1045)...O            1.893
O-H...O(Leu1203)            2.156
N-H(Lys1045)...O            1.833
N-H...O(Leu1203)            1.812
N-H...O(Lys1239)            2.956
N-H...O(Leu1202)            1.848
N-H(Lys1045)...O            1.790
N-H(Lys1045)...O            2.707

(a) Ligand IDs are: 43077: Binding Database; 21362: TOSLab; ASN 01107557 and ASN 01541696: Asinex database
(b) D donor, H hydrogen, A acceptor

Table 3 Principal descriptors calculated by QikProp simulation

Lead molecule (a)   Molecular weight (b) (g/mol)   Molecular volume (c) (Å³)   PSA (d)    HB donors (e)   HB acceptors (f)   Rotatable bonds (g)
27943               342.174                        851.553                     192.165    5.000           13.600             3
21362               367.401                        1145.733                    118.162    2.000           7.200              4
ASN 01107557        385.397                        1154.389                    139.73     3.000           6.750              4
ASN 01541696        362.424                        1099.505                    130.068    3.000           7.750              5

(a) Ligand IDs: 43077: Binding Database; 21362: TOSLab; ASN 01107557 and ASN 01541696: Asinex database
(b) Molecular weight of the molecule
(c) Total solvent-accessible volume in cubic angstroms using a probe with a radius of 1.4 Å
(d) Van der Waals surface areas of polar nitrogen and oxygen atoms
(e) Estimated number of hydrogen bonds that would be donated by the solute to water molecules in an aqueous solution. Values are averages taken over a number of configurations, so they can be non-integer
(f) Estimated number of hydrogen bonds that would be accepted by the solute from water molecules in an aqueous solution. Values are averages taken over a number of configurations, so they can be non-integer
(g) Number of rotatable bonds

chemical compound with a certain pharmacological or biological activity has properties that would likely make it an orally active drug in humans. The rule describes molecular properties that are important in the drug's pharmacokinetics in the human body, including its ADME. However, the rule does not predict whether a compound is pharmacologically active. The four selected compounds were in the acceptable range of Lipinski's rule of five. For the four lead compounds, the partition coefficient (QP log P(o/w)) and the water solubility (QP log S), which are crucial when estimating the absorption and distribution of drugs within the body, ranged from 1.690 to 1.727 and from −1.582 to −4.691, respectively, while the cell permeability (QP PCaco), a key factor governing drug metabolism and its access to biological membranes, ranged from 0.345 to 95. Overall, the percentage human oral absorptions for the compounds ranged from 25% to 100%.
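As an illustration of how the rule of five is applied, the following Python sketch counts violations for the four lead compounds using the values printed in Table 3 and Table 4. Several points are assumptions of this sketch rather than statements from the paper: the thresholds are the commonly quoted ones (molecular weight ≤ 500, H-bond donors ≤ 5, H-bond acceptors ≤ 10, log P ≤ 5), the QikProp averaged H-bond estimates are used in place of simple atom counts, and the log P values are taken as printed.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float    # g/mol, Table 3
    hb_donors: float     # QikProp estimate, Table 3
    hb_acceptors: float  # QikProp estimate, Table 3
    log_p: float         # QP log P(o/w) as printed in Table 4


LEADS = {
    "27943": Descriptors(342.174, 5.000, 13.600, 1.690),
    "21362": Descriptors(367.401, 2.000, 7.200, 2.238),
    "ASN 01107557": Descriptors(385.397, 3.000, 6.750, 2.228),
    "ASN 01541696": Descriptors(362.424, 3.000, 7.750, 1.727),
}


def lipinski_violations(d):
    """Return the list of rule-of-five criteria that the compound violates."""
    checks = {
        "MW > 500": d.mol_weight > 500,
        "HB donors > 5": d.hb_donors > 5,
        "HB acceptors > 10": d.hb_acceptors > 10,
        "log P > 5": d.log_p > 5,
    }
    return [name for name, violated in checks.items() if violated]


for ligand_id, desc in LEADS.items():
    print(ligand_id, lipinski_violations(desc))
```

With these inputs, the only flag raised is the acceptor estimate of 13.600 for compound 27943. A single violation is often still regarded as acceptable under the usual reading of the rule, although that tolerance is a convention assumed here and not something stated in the text.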

All of these pharmacokinetic parameters are within the acceptable range defined for human use, thereby indicating their potential for use as drug-like molecules.
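To make the comparison against reference ranges concrete, the sketch below bands the predicted Caco-2 and MDCK permeabilities from Table 4 using the thresholds given in that table's footnotes (<25 nm/s poor, >500 nm/s great). The label "intermediate" for values between the two thresholds is a name chosen for this sketch and does not come from the paper.

```python
def permeability_band(value_nm_per_s, poor_below=25.0, great_above=500.0):
    """Classify an apparent permeability (nm/s) with the Table 4 thresholds."""
    if value_nm_per_s < poor_below:
        return "poor"
    if value_nm_per_s > great_above:
        return "great"
    return "intermediate"  # label assumed for this sketch


# (QP PCaco, QP PMDCK) in nm/s, as listed in Table 4.
PERMEABILITY = {
    "27943": (0.345, 0.269),
    "21362": (112.378, 46.585),
    "ASN 01107557": (54.703, 31.997),
    "ASN 01541696": (95.051, 90.432),
}

for ligand_id, (caco, mdck) in PERMEABILITY.items():
    print(ligand_id, permeability_band(caco), permeability_band(mdck))
```

The same pattern extends to any descriptor for which a recommended range is available, by swapping in the corresponding thresholds.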

Conclusions

The main objective of this work was to identify the residues involved in the cleavage mechanism through theoretical calculations. The identification of inhibitors for Chikungunya virus has been hampered by a lack of structural insight into any of its proteins. Therefore, we chose to model the nsP2 protein, which plays a vital role in activating the nonstructural protein complex by cleaving the proteins into the subunits nsP1, nsP2, nsP3 and nsP4. The model was further validated by molecular dynamics simulation and various validation

Table 4 Physicochemical descriptors calculated by QikProp simulation

Lead molecule (a)   QP log P(o/w) (b)   QP log S (c)   QP PCaco (d)   QP log HERG (e)   QP PMDCK (f)   % Human oral absorption (g)
27943               1.690               -1.582         0.345          -0.090            0.269          1
21362               2.238               -4.109         112.378        -5.323            46.585         3
ASN 01107557        2.228               -5.106         54.703         -6.284            31.997         3
ASN 01541696        1.727               -4.691         95.051         -5.903            90.432         3

(a) Ligand IDs: 43077: Binding Database; 21362: TOSLab; ASN 01107557 and ASN 01541696: Asinex database
(b) QP log P for octanol/water (-2.0, 6.5)
(c) Predicted aqueous solubility, log S. S in mol dm⁻³ is the concentration of the solute in a saturated solution that is in equilibrium with the crystalline solid (-6.5, 0.5)
(d) Apparent Caco-2 permeability (nm/s) (<25 poor, >500 great)
(e) log HERG, HERG K+ channel blockage (concern below -5)
(f) Apparent MDCK permeability (nm/s) (<25 poor, >500 great)
(g) % Human oral absorption in GI (±20%) (<25% is poor)


tools. The model was then subjected to flexible peptide docking, followed by e-pharmacophore mapping. Ligands that had a fitness score of more than 1.0 were subjected to a rigid docking study. As per our docking analysis, the residues Gln1039, Lys1045, Glu1157, Gly1176, His1222, Lys1239, Ser1293, Glu1296 and Met1297 show crucial interactions with the nonstructural protein complex to be cleaved, and were considered an individual functional unit. Chikungunya virus replication and propagation depend on the nsP2 protein, so a chemical compound that inhibits this protein by targeting the key residues specified above could be therapeutically useful. Based on the docking results, we report four chemical compounds that may be potential inhibitors of nsP2 protease. Furthermore, the backbone structural scaffolds of these four lead compounds could serve as building blocks in the design of drug-like molecules for the treatment of Chikungunya viral fever. Besides targeting the Chikungunya virus, the inhibitors may act against other members of the Alphavirus genus owing to the high sequence similarity among alphavirus proteins, which provides a potential path towards the identification of broad-spectrum drugs.
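The screening logic summarized above (keep ligands with a fitness score above 1.0, then rank the survivors by docking score) can be sketched in a few lines of Python. The fitness and Glide scores for the four leads are taken from Table 2, with the Glide scores written as negative numbers on the assumption that the usual Glide convention applies (more negative means better). The extra entry "hypothetical-low-fit" is invented purely to show the filter rejecting a ligand. The sketch does not reproduce the actual HTVS and XP docking workflow.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    ligand_id: str
    fitness: float      # pharmacophore fitness score
    glide_score: float  # assumed negative; lower is better


EXAMPLE_POOL = [
    Hit("ASN 01541696", 1.009, -7.640),
    Hit("27943", 1.029, -9.601),
    Hit("hypothetical-low-fit", 0.85, -10.2),  # invented entry, not from the paper
    Hit("ASN 01107557", 1.170, -8.57),
    Hit("21362", 1.046, -9.279),
]


def shortlist(hits, min_fitness=1.0):
    """Keep hits whose fitness exceeds the cutoff, best Glide score first."""
    passing = [hit for hit in hits if hit.fitness > min_fitness]
    return sorted(passing, key=lambda hit: hit.glide_score)


for hit in shortlist(EXAMPLE_POOL):
    print(hit.ligand_id, hit.fitness, hit.glide_score)
```

Under these assumptions the ranking reproduces the order in which the four leads are listed in Table 2, and the invented low-fitness entry is dropped despite its favorable docking score.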
Acknowledgments The authors would like to thank the Department of Bioinformatics, Alagappa University, Karaikudi, India, for its support and for providing the facilities for this work.

References
1. Sourisseau M, Schilte C, Casartelli N, Trouillet C, Guivel-Benhassine F, Rudnicka D, Sol-Foulon N, Roux KL, Prevost M-C, Fsihi H, Frenkiel M-P, Blanchet F, Afonso PV, Ceccaldi P-E, Ozden S, Gessain A, Schuffenecker I, Verhasselt B, Zamborlini A, Saib A, Rey FA, Arenzana-Seisdedos F, Despres P, Michault A, Albert ML, Schwartz O (2007) Characterization of reemerging Chikungunya virus. PLoS Pathog 3:804–817
2. Schuffenecker I, Iteman I, Michault A, Murri S, Frangeul L, Vaney M-C, Lavenir R, Pardigon N, Reynes J-M, Pettinelli F, Biscornet L, Diancourt L, Michel S, Duquerroy S, Guigon G, Frenkiel M-P, Bréhin A-C, Cubito N, Desprès P, Kunst F, Rey F, Zeller H, Brisse S (2006) Genome microevolution of chikungunya viruses causing the Indian Ocean outbreak. PLoS Med 3:1058–1070
3. Ng LFP, Chow A, Sun YJ, Kwek DJC, Lim PL, Dimatatac F, Ng LC, Ooi EE, Choo KH, Her Z, Kourilsky P, Leo YS (2009) IL-1β, IL-6, and RANTES as biomarkers of Chikungunya severity. PLoS ONE 4:e4261
4. Vanlandingham DL, Hong C, Klingler K, Tsetsarkin K, McElroy KL, Powers AM, Lehane MJ, Higgs S (2005) Differential infectivities of O'nyong-nyong and Chikungunya virus isolates in Anopheles gambiae and Aedes aegypti mosquitoes. Am J Trop Med Hyg 72:616–621
5. Chopra A, Anuradha V, Lagoo-Joshi V, Kunjir V, Salvi S, Saluja M (2008) Chikungunya virus aches and pains: an emerging challenge. Arthritis Rheum 58:2921–2922
6. Pardigon N (2009) The biology of chikungunya: a brief review of what we still do not know. Pathol Biol 57:127–132
7. Takkinen K (1986) Complete nucleotide sequence of the nonstructural protein genes of Semliki Forest virus. Nucleic Acids Res 14:5667–5682
8. Khan AH, Morita K, MdC P, Hasebe F, Mathenge EGM, Igarashi A (2002) Complete nucleotide sequence of Chikungunya virus and evidence for an internal polyadenylation site. J Gen Virol 83:3075–3084
9. Strauss EGS, Strauss JH (1986) Structure and replication of the alphavirus genome. The Togaviridae and Flaviviridae. Plenum, New York, pp 35–90
10. Lulla A, Lulla V, Tints K, Ahola T, Merits A (2006) Molecular determinants of substrate specificity for Semliki Forest virus nonstructural protease. J Virol 80:5413–5422
11. Perri S, Driver DA, Gardner JP, Sherrill S, Belli BA, Dubensky TW Jr, Polo JM (2000) Replicon vectors derived from Sindbis virus and Semliki Forest virus that establish persistent replication in host cells. J Virol 74:9802–9807
12. Zhang D, Tözsér J, Waugh DS (2009) Molecular cloning, overproduction, purification and biochemical characterization of the p39 nsp2 protease domains encoded by three alphaviruses. Protein Expr Purif 64:89–97
13. Vasiljeva L, Merits A, Auvinen P, Kaariainen L (2000) Identification of a novel function of the alphavirus capping apparatus. J Biol Chem 275:17281–17287
14. Li L, Darden T, Hiskey R, Pedersen L (1996) Homology modeling and molecular dynamics simulations of the Gla domains of human coagulation factor IX and its G[12]A mutant. J Phys Chem 100:2475–2479
15. Kevin JB, Edmond C, Huafeng X, Ron OD, Michael PE, Brent AG, John LK, Istvan K, Mark AM, Federico DS, John KS, Yibing S, David ES (2006) Scalable algorithms for molecular dynamics simulations on commodity clusters. In: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing. ACM, Tampa
16. Shaw DE (2005) A fast, scalable method for the parallel evaluation of distance-limited pairwise particle interactions. J Comput Chem 26:1318–1328
17. Bairoch A, Boeckmann B, Ferro S, Gasteiger E (2004) SwissProt: juggling between evolution and stability. Brief Bioinform 5:39–55
18. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
19. Russo AT, White MA, Watowich SJ (2006) The crystal structure of the Venezuelan equine encephalitis alphavirus nsP2 protease. Structure 14:1449–1458
20. Schrödinger, LLC (2009) Prime, version 2.1. Schrödinger, LLC, New York
21. Jorgensen WL, Maxwell DS, Tirado-Rives J (1996) Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids. J Am Chem Soc 118:11225–11236
22. Laskowski RA, MacArthur MW, Moss DS, Thornton JM (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr 26:283–291
23. Sippl MJ (1993) Recognition of errors in three-dimensional structures of proteins. Proteins Struct Funct Bioinf 17:355–362
24. PyMOL (2010) PyMOL molecular graphics system website. http://www.pymol.org, accessed 2010
25. Bowers KJ, Dror RO, Shaw DE (2006) The midpoint method for parallelization of particle simulations. J Chem Phys 124:184109–184111
26. Kaminski GA, Friesner RA, Tirado-Rives J, Jorgensen WL (2001) Evaluation and reparametrization of the OPLS-AA force field for proteins via comparison with accurate quantum chemical calculations on peptides. J Phys Chem B 105:6474–6487
27. Jørgensen AM, Topiol S (2008) Driving forces for ligand migration in the leucine transporter. Chem Biol Drug Des 72:265–272
28. Lauria A, Ippolito M, Almerico AM (2009) Inside the Hsp90 inhibitors binding mode through induced fit docking. J Mol Graph Model 27:712–722
29. Wang H, Aslanian R, Madison VS (2008) Induced-fit docking of mometasone furoate and further evidence for glucocorticoid receptor 17α pocket flexibility. J Mol Graph Model 27:512–521
30. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, Repasky MP, Knoll EH, Shelley M, Perry JK, Shaw DE, Francis P, Shenkin PS (2004) Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem 47:1739–1749
31. Salam NK, Nuti R, Sherman W (2009) Novel method for generating structure-based pharmacophores using energetic analysis. J Chem Inf Model 49:2356–2368
32. Nabuurs SB, Wagener M, de Vlieg J (2007) A flexible approach to induced fit docking. J Med Chem 50:6507–6518
33. Duffy EM, Jorgensen WL (2000) Prediction of properties from simulations: free energies of solvation in hexadecane, octanol, and water. J Am Chem Soc 122:2878–2888
34. Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (1997) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 23:3–25
