Doctoral Fellow Animal Biotechnology Center Veterinary Physiology & Biochemistry College of Veterinary & Animal Sciences Pantnagar, INDIA - 263145.
OUTLINE
Introduction
Approaches & Principles
Tools & Methods
Case Study
Introduction
By April 2010, there were more than 6,800,000 protein sequences in the non-redundant protein sequence database at NCBI and fewer than 50,000 protein structures in the Protein Data Bank.
(At present, an estimated 9 million proteins lack a solved structure.)
The only way to bridge the ever-growing gap between protein sequence and structure is computational structure modeling.
- Kryshtafovych & Fidelis, 2010, Drug Discov Today, 14: 386-393
An amino acid sequence carries all the information needed to guide protein folding into a specific spatial shape of a protein.
- Sela et al., 1957, Science, 125: 691-692
2 approaches:
Template-Based Approach
o Homology Modeling
o Threading
Free Modeling / De novo Approach
o Ab initio Modeling
Homology Modeling
(Comparative Modeling)
Protein structure is much more evolutionarily conserved than sequence, and therefore similar sequences normally yield similar 3D structures.
- Chothia & Lesk, 1986, EMBO J, 5: 823-826
New families are being discovered at a rate that is linear with the addition of new sequences.
- Yooseph et al., 2007, PLOS Biol, 5: e16
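Since homology modeling rests on sequence similarity between query and template, a quick check of percent identity over an existing pairwise alignment can be sketched as below. This is a minimal illustration, not any server's method: the function name and the two aligned sequences are made up for the example, and identity is counted only over columns where neither sequence has a gap (one of several conventions in use).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments (gaps shown as "-"); not real proteins.
QUERY = "MKT-AYIAKQR"
TEMPLATE = "MKSLAYLAK-R"

if __name__ == "__main__":
    print(round(percent_identity(QUERY, TEMPLATE), 1))  # 77.8
```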
Threading
(Fold Recognition)
Even if a template cannot be identified using sequence similarity, suitable templates may still exist, as Nature tolerates only a limited number of folds.
Threading scans the query sequence against a database of solved structures; a scoring function assesses the compatibility of the sequence with each structure.
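The idea of scoring sequence-to-structure compatibility can be illustrated with a deliberately simplified toy. The sketch below assumes each template is reduced to a string of burial labels (B = buried, E = exposed), rewards hydrophobic residues at buried positions and polar residues at exposed ones, and slides the query along each template without gaps. The residue grouping, the scoring values, and the template library are all assumptions made for illustration; real threading programs use far richer energy functions and gapped alignments.

```python
# Toy hydrophobic set; an assumption for this sketch, not a standard scale.
HYDROPHOBIC = set("AVLIMFWC")


def compatibility(sequence: str, environment: str) -> int:
    """Score an ungapped placement of a sequence on B/E environment labels."""
    score = 0
    for residue, env in zip(sequence, environment):
        buried = env == "B"
        hydrophobic = residue in HYDROPHOBIC
        # +1 when residue type suits the environment, -1 otherwise
        score += 1 if buried == hydrophobic else -1
    return score


def best_threading(sequence: str, environment: str):
    """Slide the sequence along one template; return (best score, offset)."""
    n, m = len(sequence), len(environment)
    if n > m:
        raise ValueError("template shorter than query")
    return max(
        (compatibility(sequence, environment[o:o + n]), o)
        for o in range(m - n + 1)
    )


def rank_templates(sequence: str, library: dict):
    """Return (score, template name, offset) tuples, best first."""
    results = []
    for name, env in library.items():
        if len(env) < len(sequence):
            continue
        score, offset = best_threading(sequence, env)
        results.append((score, name, offset))
    return sorted(results, reverse=True)


# Hypothetical query and fold library.
QUERY = "LKVEIDAL"
LIBRARY = {
    "fold_A": "EEBEBEBEBBEE",
    "fold_B": "BBBBEEEEBBBB",
}

if __name__ == "__main__":
    for score, name, offset in rank_templates(QUERY, LIBRARY):
        print(name, score, offset)
```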
Ab initio Modeling
Ab initio modeling is based on the general principles that govern protein folding energetics and/or on the statistical tendencies of conformational features that native structures acquire.
Good News!
Currently available template-based methods reliably generate accurate models, comparable in quality to the structures solved by X-ray crystallography. The level of detail is sufficient for DRUG DESIGN, detecting PROTEIN INTERACTIONS, understanding REACTION MECHANISMS, interpreting MUTATIONS, and molecular replacement in solving CRYSTAL STRUCTURES.
- Baker & Sali, 2001, Science, 294: 93-96
The performance gap between the best servers and the best human-expert groups is narrowing over time.
- Kryshtafovych et al., 2007, Proteins, 69: 194-207
Getting Started:
Primary Structure?
o Amino-acid sequence (your own or from a database)
o In silico translation of a nucleotide sequence (your own or from a database)
Primary databases: NCBI, EMBL, DDBJ
Protein databases: UniProt (Swiss-Prot/TrEMBL), PIR, IPI, NCBI (Protein)
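In silico translation of a nucleotide sequence, mentioned above, can be sketched in a few lines. The example assumes the standard genetic code, reading frame 1, and a coding strand given 5' to 3'; the function name and the test sequence are illustrative only, and unknown codons are reported as "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(nucleotides: str, to_stop: bool = True) -> str:
    """Translate frame 1 of a DNA/RNA coding sequence into one-letter amino acids."""
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # "X" for ambiguous codons
        if aa == "*" and to_stop:
            break  # stop at the first stop codon
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))  # MAIVMGR
```

Passing `to_stop=False` keeps reading through stop codons and marks them with `*`, which can help when checking which frame is open.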
Template Search:
BLASTp (NCBI)
PSI-BLAST against PDB
Select the best template(s) and note the PDB ID [low E-value, high similarity, high coverage]
Retrieve the template structure (.pdb file format)
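The template search steps above can be sketched as a small ranking routine. Everything specific here is an assumption for illustration: the hit values and PDB IDs are placeholders rather than real search results, the E-value and coverage cut-offs are arbitrary, and the download address follows the pattern used by the RCSB file server at the time of writing, which may change.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    pdb_id: str
    evalue: float      # lower is better
    identity: float    # percent identity to the query
    coverage: float    # percent of the query covered


def select_templates(hits, max_evalue=1e-5, min_coverage=70.0):
    """Filter hits by hypothetical cut-offs, then rank: low E-value,
    high identity, high coverage."""
    kept = [h for h in hits if h.evalue <= max_evalue and h.coverage >= min_coverage]
    return sorted(kept, key=lambda h: (h.evalue, -h.identity, -h.coverage))


def pdb_download_url(pdb_id: str) -> str:
    """Build a download address for a .pdb file (assumed RCSB URL pattern)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


# Placeholder hits; not output from a real PSI-BLAST run.
HITS = [
    Hit("1ABC", 3e-40, 52.0, 95.0),
    Hit("2XYZ", 1e-62, 61.0, 98.0),
    Hit("3DEF", 2e-3, 30.0, 40.0),   # fails both cut-offs
    Hit("4GHI", 1e-62, 58.0, 90.0),
]

if __name__ == "__main__":
    ranked = select_templates(HITS)
    for hit in ranked:
        print(hit.pdb_id, hit.evalue, hit.identity, hit.coverage)
    print(pdb_download_url(ranked[0].pdb_id))
```

The resulting address can be fetched with `urllib.request.urlretrieve` or a browser; the sketch stops short of downloading so that it runs offline.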
Modeling:
Provide the query sequence and the template PDB ID / template file
Many servers do not require template files
PROTEIN MODEL
Servers: TASSER, SPARKS-X, PEP-FOLD, SWISS-MODEL, Rosetta Design, ModWeb, QUARK, HHpred, Phyre, RaptorX, ESyPred3D, Jigsaw, RosettaAntibody, Bhageerath & many others
Indices of Quality:
o Steric Clashes & Bumps
o Packing Density
o Ramachandran plot
o Deviation from template
o Conservation of secondary structure
o G-score
o Q-value
o Z-value
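A Ramachandran plot is built from the backbone torsion angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). As a minimal sketch of the underlying calculation, the function below computes a signed dihedral angle from four points in plain Python. The coordinates in the example are idealized test points, not atoms from a real model, and the function name is illustrative.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3) -> float:
    """Signed dihedral angle in degrees defined by four points.

    For phi pass (C[i-1], N[i], CA[i], C[i]); for psi pass
    (N[i], CA[i], C[i], N[i+1]).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Idealized points: central bond along z, outer points in known positions.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))  # cis, 0 degrees
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))  # 90 degrees
```

Collecting phi and psi for every residue and comparing them with the allowed regions is what tools such as PROCHECK automate.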
QA/ QC Completed
o Active residue prediction (ConSurf, NCBI-CDD, InterProScan)
o Active site prediction (POCKET FINDER, 3D2GO, fPOCKET, LIGSITE, MetaPocket, 3D LigandSite, LigPlot, NCBI-CDD, InterProScan)
o Antigenic profiling (IEDB, IMTECH)
o Protein Dynamics (CAVITY, CASTp, SLITHER, NCBI-CDD, InterProScan)
o Drug design / Ligand docking (Hex, PATCH DOCK, Z DOCK, HADDOCK, Docking Server, GRAMM-X, Flex Pep Dock)
o Function prediction (ConFunc, ProKnow, KPFP, InterProScan, Pfam, NCBI-CDD, KAAS)
o Physical & Chemical Profiling (ProtParam, molbiol-tools.ca, ProSAL, Scratch Protein Predictor)
o Protein-protein interactions (Hex, KB Dock)
o Toxicity profiling (OSIRIS Actelion Property Explorer, Molsoft Drug Likeness Explorer)
(Again, not all of these require a 3D structure of your protein as the query, though many do.)
Template Selection and Retrieval of Template Structure (PSI-BLAST against PDB)
Generation & Refinement of buPAG-2 Model (MODELLER 9v10, Swiss-PdbViewer, WHAT IF Web Interface)
Quality Assessment (ERRAT, PROCHECK, MUSTANG-RMSD)
Structural & Functional Annotation (NetOGlyc, NetNGlyc, YinOYang, ProtParam, NCBI-CDD, InterProScan, Pfam, KPFP, ProKnow)
buPAG-2 MODEL
QUALITY
Only 1 of the 367 residues fell in the disallowed region of the Ramachandran plot (PROCHECK). The model had an ERRAT quality factor of 83.143%, an RMSD of 0.447 over 353 residues against the template (MUSTANG), and a G-factor of -0.16.
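The quality figures above can be put in proportion with a little arithmetic, and the RMSD itself is simple to compute once two structures have been superposed. The sketch below uses only the counts reported on this slide (1 of 367 residues disallowed, 353 of 367 residues aligned); the `rmsd` helper assumes the two coordinate lists are already superposed and paired one-to-one, and the coordinates in the test are made up.

```python
import math


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation between two paired, pre-superposed
    lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Counts taken from the slide: 1 disallowed residue, 353 aligned, 367 total.
    print(round(percent(1, 367), 2))    # 0.27 (% of residues disallowed)
    print(round(percent(353, 367), 1))  # 96.2 (% of residues in the RMSD)
```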
OmpH (Pasteurella multocida)
Cyclooxygenase 2 (Canis lupus familiaris)
Lipoxygenase 2 (Canis lupus familiaris)
SELECTED REFERENCES
- Kryshtafovych A, Fidelis K. 2010. Protein structure prediction and model quality assessment. Drug Discov Today, 14: 386-393.
- Sela M, et al. 1957. Reductive cleavage of disulfide bridges in ribonuclease. Science, 125: 691-692.
- Chothia C, Lesk AM. 1986. The relation between the divergence of sequence and structure in proteins. EMBO J, 5: 823-826.
- Yooseph S, et al. 2007. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLOS Biol, 5: e16.
- Baker D, Sali A. 2001. Protein structure prediction and structural genomics. Science, 294: 93-96.
- Raimondo D, et al. 2007. Automatic procedure for using models of proteins in molecular replacement. Proteins, 66: 689-696.
- Zhang Y. 2008. Progress and challenges in protein structure prediction. Curr Opin Struct Biol, 18: 342-348.
- Kryshtafovych A, et al. 2007. Progress from CASP6 to CASP7. Proteins, 69: 194-207.
- Ganguly B, Prasad S. 2012. Homology modeling and functional annotation of bubaline pregnancy associated glycoprotein 2. J Ani Sci Biotech, 3: 13.
THANK YOU