
Metabolic database

S.Vaidehi
Assistant Professor
D.G.Vaishnav college
Arumbakkam
Chennai-106
MetaCyc
 MetaCyc is a curated database of experimentally elucidated
metabolic pathways from all domains of life. MetaCyc contains
2609 pathways from 2914 different organisms.
 MetaCyc contains pathways involved in both primary and
secondary metabolism, as well as associated metabolites,
reactions, enzymes, and genes. The goal of MetaCyc is to catalog
the universe of metabolism by storing a representative sample of
each experimentally elucidated pathway.
 MetaCyc applications include:
 Online encyclopedia of metabolism
 Predict metabolic pathways in sequenced genomes
 Support metabolic engineering via enzyme database
 Metabolite database aids metabolomics research
 MetaCyc is a member of the BioCyc collection of
Pathway/Genome Databases (PGDBs). Within that collection, MetaCyc:
 is a multiorganism DB;
 contains experimentally elucidated pathways only;
 is suited to cross-organism questions such as “what are all the pathways for
arginine degradation in microbes,” or “what cofactor biosynthesis pathways are
known in bacteria?”
For questions that require information about the complete
genome, proteome, or metabolic network of an organism,
instead consult the organism-specific PGDB. For example,
MetaCyc contains 14 pathways that have been experimentally
studied in Staphylococcus aureus, and 36 enzymes that
participate in these pathways. In contrast, the BioCyc
Staphylococcus aureus RF122 PGDB contains 189 pathways (most
of which are computationally predicted), plus the entire genome
and proteome of that strain.
 MetaCyc is a database of non-redundant, experimentally elucidated metabolic
pathways and enzymes. It also contains reactions, chemical compounds, and
genes. It stores predominantly qualitative information rather than quantitative
data, although it does contain some quantitative data such as enzyme kinetics
data. “MetaCyc” is pronounced “met-a-sike”. It sounds like “encyclopedia”.
 A unique property of MetaCyc is that it is curated from the scientific
experimental literature according to an extensive process [4], such that:
 2,740 different organisms are represented, with the majority of pathways
occurring in microorganisms and plants
 2,411 metabolic pathways are stored, with 13,074 enzymatic reactions and 47,838
associated literature citations
 MetaCyc stores all enzyme-catalyzed reactions that have been assigned EC
numbers by the Nomenclature Committee of the International Union of
Biochemistry and Molecular Biology (NC-IUBMB)
 MetaCyc also stores thousands of additional enzyme-catalyzed reactions that
have not yet been assigned an EC number
 MetaCyc stores pathways involved in primary metabolism
and secondary metabolism.
 MetaCyc also stores metabolites, enzymes, enzyme
complexes, and genes associated with these pathways.
 MetaCyc is extensively linked to other biological databases
[8] containing protein and nucleic-acid sequence data,
bibliographic data and protein structures.
 Unlike EcoCyc, MetaCyc provides little genomic data.
MetaCyc does contain objects for the genes that encode
most of the enzymes within the DB, but MetaCyc contains
no sequence data. It does contain links to external
sequence databases.
 MetaCyc data can be browsed and queried in several different ways. For
pathways, proteins, reactions and compounds, the MetaCyc site supports:
 Text-based searching, when trying to find information without knowing
exactly how an object is named
 Browsing using ontologies, when one wants to search by proceeding from
general categories to specific instances
 Direct queries, when an identifier is known
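The three query styles above can be illustrated with a small in-memory sketch. This is not the MetaCyc software or its API: the record identifiers, names, and class paths below are hypothetical toy data, and the matching rules (case-insensitive substring search, prefix match on a class path, dictionary lookup) are assumptions chosen only to show how the three styles differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    id: str                # hypothetical identifier, not a real MetaCyc ID
    name: str
    synonyms: tuple = ()
    classes: tuple = ()    # ontology path, from general to specific


# Toy records for illustration only.
RECORDS = [
    Record("EX-PWY-1", "arginine degradation", ("arginase route",),
           ("Pathways", "Degradation", "Amino acid degradation")),
    Record("EX-PWY-2", "biotin biosynthesis", (),
           ("Pathways", "Biosynthesis", "Cofactor biosynthesis")),
    Record("EX-PWY-3", "arginine biosynthesis", (),
           ("Pathways", "Biosynthesis", "Amino acid biosynthesis")),
]


def text_search(term):
    """Text-based search: match a term against names and synonyms."""
    t = term.lower()
    return [r.id for r in RECORDS
            if t in r.name.lower() or any(t in s.lower() for s in r.synonyms)]


def browse(*path):
    """Ontology browsing: list records filed under a class path."""
    return [r.id for r in RECORDS if r.classes[:len(path)] == path]


def direct(record_id):
    """Direct query: fetch one record when its identifier is known."""
    by_id = {r.id: r for r in RECORDS}
    return by_id.get(record_id)
```

For instance, `text_search("arginine")` finds both arginine records without knowing their exact names, `browse("Pathways", "Biosynthesis")` narrows from a general category, and `direct("EX-PWY-2")` returns a single record.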
 Comparison features combine MetaCyc with other BioCyc databases to provide
additional ways of viewing data. Examples of cross-species comparisons
include:
 Comparing specific pathways between two or more organisms
 Comparing the genomic maps of two or more organisms
 Additionally, a desktop version of the software provides substantially more
powerful capabilities. When installed locally with multiple organism-specific
databases, the desktop version enables analyses such as:
 Comparing the overall metabolic networks of different organisms
 Curation is the process of manually refining and updating a bioinformatics
database. The MetaCyc project uses a literature-based curation approach in
which database contents are extracted in a step-wise manner from evidence in
the experimental literature.
 The curation procedures that MetaCyc curators follow are described in the
Curator’s Guide to Pathway/Genome Databases.
 MetaCyc data are derived from primary literature, from reviews, and from
external databases.
 For certain organisms, some of the data within MetaCyc have been directly
imported from other databases which we consider to be the authoritative
sources of data on those organisms:
 Arabidopsis thaliana: AraCyc
 Escherichia coli K-12: EcoCyc
 Saccharomyces cerevisiae S288C: Yeast Biochemical Pathways
 Streptomyces coelicolor A3(2): ScoCyc
BioCyc
 The BioCyc collection of Pathway/Genome Databases (PGDBs) provides a reference on
the genomes and metabolic pathways of thousands of sequenced organisms. BioCyc
PGDBs are generated by software that predicts the metabolic pathways of completely
sequenced organisms, predicts which genes code for missing enzymes in metabolic
pathways, and predicts operons. BioCyc also integrates information from other
bioinformatics databases, such as protein feature and Gene Ontology information from
UniProt. The BioCyc website provides a suite of software tools for database searching and
visualization, for omics data analysis, and for comparative genomics and comparative
pathway questions.
 BioCyc databases (DBs) are organized into tiers according to the amount of manual
updating they have received.
 Tier 1 PGDBs received intensive manual efforts, and are updated frequently.
 Tier 2 PGDBs were computationally generated by the PathoLogic program, and have
received moderate manual updating.
 Tier 3 PGDBs were computationally generated by PathoLogic, and have received no
manual updates.
 PGDBs have been created by scientists outside SRI for organisms not present in BioCyc.
EcoCyc
 EcoCyc E. coli Database
 EcoCyc is a scientific database for the bacterium
Escherichia coli K-12 MG1655. The EcoCyc project
performs literature-based curation of the entire
genome, and of transcriptional regulation,
transporters, and metabolic pathways.
EMP
 The Enzymes and Metabolic Pathways database (EMP)
is an encoding of the contents of over 10 000 original
publications on the topics of enzymology and
metabolism. This large body of information has been
transformed into a queryable database. An extraction
of over 1800 pictorial representations of metabolic
pathways from this collection is freely available on the
World Wide Web. We believe that this collection will
play an important role in the interpretation of genetic
sequence data, as well as offering a meaningful
framework for the integration of many other forms of
biological data.
 The curation of the encoded data and of the pictorial
representations of pathways is an ongoing project centered
at the Laboratory of Mathematical Simulation of
Multienzyme Systems at the Institute of Theoretical and
Experimental Biophysics of the Russian Academy of
Sciences, in Pushchino, Russia. The effort to build EMP was
initiated in 1984.
 EMP can be accessed in two ways:
1. It can be browsed in the context of the World Wide Web
application PUMA, which can be reached via the following
URL:
http://www.mcs.anl.gov/home/compbio/PUMA/Production/puma.html
2. It can be acquired via anonymous FTP
from the BioBase server
 Table 1. Number of charts for organism and taxonomic
groups

Number of Charts    Organism/Taxonomic Group
667                 Escherichia coli
531                 Haemophilus influenzae
418                 Homo sapiens
379                 Mammalia
135                 Rattus norvegicus
397                 Rodentia
115                 Saccharomyces cerevisiae
316                 Aves
101                 Ascomycotina
57                  Salmonella typhimurium
56                  Oryctolagus cuniculus
52                  Bos taurus
51                  Embryophyta
47                  Pseudomonas
Systematic naming
 The systematic name contains the initial substrates,
final products, the function of the pathway,
coenzymes, and cellular location of the pathway
enzymes. Every metabolic pathway record includes
this characteristic systematic pathway name. In
addition, each record includes a shorter, but still
unequivocal, recommended pathway name. Finally, a
set of common names for the pathway are also
encoded.
Example
 The systematic name follows the template
Substrates–Products_Function_(Coenzymes)_(Locations)_[Comment]
 One of the versions of the Entner–Doudoroff pathway
encoded in the database is characterized by the systematic name
D-glucose–pyruvate_catabolism_(ATP,_NADP(’+),_NAD(’+),_ADP)_(cytosol)
 The recommended name would be
Glucose–pyruvate_catabolism_[via D–glucono–1,5–lactone_6–phosphate]
 The common name would be Entner–Doudoroff pathway.
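The naming template can be sketched as a small function that assembles a name from its parts. This is an illustration, not EMP's own code: the function name `systematic_name` and its parameters are hypothetical, and the use of ",_" to separate multiple substrates or products is an assumption extrapolated from the way the coenzyme list is written in the example.

```python
EN_DASH = "\u2013"  # the dash that separates substrates from products


def systematic_name(substrates, products, function,
                    coenzymes=(), locations=(), comment=None):
    """Assemble Substrates–Products_Function_(Coenzymes)_(Locations)_[Comment].

    Optional parts are omitted when empty. Joining list items with ",_"
    mirrors the coenzyme list in the example and is an assumption for
    the other lists.
    """
    sep = ",_"
    parts = [sep.join(substrates) + EN_DASH + sep.join(products), function]
    if coenzymes:
        parts.append("(" + sep.join(coenzymes) + ")")
    if locations:
        parts.append("(" + sep.join(locations) + ")")
    if comment:
        parts.append("[" + comment + "]")
    return "_".join(parts)


if __name__ == "__main__":
    # Rebuilds the systematic name of the Entner–Doudoroff variant.
    print(systematic_name(["D-glucose"], ["pyruvate"], "catabolism",
                          ["ATP", "NADP(’+)", "NAD(’+)", "ADP"], ["cytosol"]))
```

Leaving out the coenzymes and locations and passing only a comment gives the shorter form used for the recommended name.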
Rebase DB
 REBASE: the Restriction Enzyme Database (Restriction
Enzyme data BASE)
It is a collection of information about restriction enzymes and
related proteins. It contains published and unpublished
references, recognition and cleavage sites, isoschizomers,
commercial availability, methylation sensitivity, crystal,
genome, and sequence data. DNA methyltransferases,
homing endonucleases, nicking enzymes, specificity
subunits, control proteins, and helicase domain proteins
are also included. Putative DNA methyltransferases and
restriction enzymes, as predicted from analysis of genomic
sequences, are also listed. REBASE is updated daily and is
constantly expanding.
 REBASE search options:
 search enzyme names - list enzyme names or partial names.
No spaces within enzyme names (i.e. "FokI", not "Fok I").
 recognition sequence - list recognition sequences (or partial)
 author name - sorted by date - list last names
 author name - sorted by name - list last names.
For author searches, "Roberts" will match Roberts, Robertson, ...
Add a comma after Roberts ("Roberts,") to leave out Robertson.
 journal name - find out if REBASE has any refs from a given journal.
"NAR" will find "Nucleic Acids Res."
 journal citations - list all records from a given journal
 reference number - list ref nums (every REBASE ref has a unique id)
 enzyme number - list enz nums (every REBASE enzyme has a unique id)
 genbank number - list genbank numbers (or partial)
 organism name - list organisms (or partial)
 organism source - list sources of organisms (or partial)
 search titles - newest first - enter meaningful keywords
 search titles - sort by author - enter meaningful keywords
 abstracts and titles - newest first - enter meaningful keywords
 abstracts and titles - sort by author - enter meaningful keywords
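The enzyme-name and author-name hints can be mimicked with a short sketch. It does not reproduce REBASE's actual search code. It assumes that author entries are stored as "Lastname, Initials" and that the search is a case-insensitive prefix match, which is one simple rule consistent with the behaviour described. The author list and both function names are hypothetical.

```python
def normalize_enzyme_name(query):
    """Remove whitespace so that "Fok I" becomes "FokI"."""
    return "".join(query.split())


def match_authors(query, authors):
    """Return authors whose entry starts with the query (case-insensitive).

    Assumes entries look like "Lastname, Initials", so a trailing comma
    in the query pins down the exact last name.
    """
    q = query.strip().lower()
    return [a for a in authors if a.lower().startswith(q)]


if __name__ == "__main__":
    authors = ["Roberts, R.", "Robertson, A.", "Smith, J."]  # hypothetical entries
    print(normalize_enzyme_name("Fok I"))
    print(match_authors("Roberts", authors))    # matches both Roberts and Robertson
    print(match_authors("Roberts,", authors))   # the comma leaves out Robertson
```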
