presented by
Rituparna Addy
Department of Biotechnology
Haldia Institute of Technology
Gene:
• A sequence of nucleotides coding for protein.
Central Dogma:
• Proposed in 1958 by Francis Crick.
• He postulated that not all conceivable transfers of
sequence information between DNA, RNA and protein
actually occur.
• He restated the idea in a paper published in 1970.
CODONS:
• Discovered by Sydney Brenner and Francis Crick in
1961.
• Each codon is a triplet of nucleotides that codes for
one amino acid in a protein (or signals a stop).
• With four bases, 4^3 = 64 possible codons.
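The codon count can be checked by enumeration. The sketch below is illustrative only: it assumes the standard genetic code written in the DNA alphabet, and the names `BASES`, `AMINO_ACIDS` and `CODON_TABLE` are introduced here, not taken from the slides.

```python
from itertools import product

# Bases in the conventional T, C, A, G order of the codon table.
BASES = "TCAG"
# One-letter amino acids of the standard genetic code in that order;
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Every ordered triplet of the four bases: 4 * 4 * 4 = 64 codons.
CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))
STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")

print(len(CODONS))         # 64
print(STOP_CODONS)         # ['TAA', 'TAG', 'TGA']
print(CODON_TABLE["ATG"])  # M (methionine, the usual start codon)
```

Of the 64 codons, 61 code for the 20 amino acids and 3 are stop codons, so most amino acids have more than one codon.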
DNA --(1)--> RNA --(2)--> PROTEIN --(3)--> PHENOTYPE
RNA --(4)--> cDNA
1. TRANSCRIPTION
2. TRANSLATION
3. GENE EXPRESSION
4. REVERSE TRANSCRIPTION
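Steps 1 and 4 can be illustrated on strings. This is a minimal sketch that assumes the input is the coding (sense) strand written 5' to 3'; the function names `transcribe` and `reverse_transcribe` are hypothetical and ignore all biological detail such as promoters or primers.

```python
# Complement table for DNA bases.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def transcribe(coding_strand):
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def reverse_transcribe(mrna):
    """First-strand cDNA is the reverse complement of the mRNA, as DNA."""
    dna = mrna.upper().replace("U", "T")
    return dna.translate(DNA_COMPLEMENT)[::-1]

mrna = transcribe("ATGGCCTAA")
cdna = reverse_transcribe(mrna)
print(mrna)  # AUGGCCUAA
print(cdna)  # TTAGGCCAT
```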
DEFINITION
• Gene prediction is a prerequisite for detailed
functional annotation of genes and genomes.
• It can detect the location of ORFs (Open Reading
Frames) and the structures of introns and exons.
• Its goal is to describe all genes computationally
with near 100% accuracy.
• It can reduce the amount of experimental
verification work required.
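ORF detection can be sketched as a scan of the three forward reading frames for a start codon followed in-frame by a stop codon. The code below is a simplified illustration, not a real gene finder: it ignores the reverse strand and introns, and `find_orfs` and `min_codons` are names assumed here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is the index of the A in ATG, end is one past the stop codon,
    and min_codons counts the codons before the stop (including ATG).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open an ORF at the first in-frame ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning
    return orfs

orfs = find_orfs("CCATGAAATGA")
print(orfs)  # [(2, 11, 2)]
```

In the toy sequence the only ORF is `ATGAAATGA` in frame 2. In prokaryotes, where genes are not interrupted, this kind of scan already gives useful candidates; eukaryotic genes also need the intron and exon structure.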
TYPES
• Ab initio-based
• Homology-based
• Exons: internal, final, single
• Sequence signals: start codon, stop codon
[Figure: genomic DNA undergoes transcription and splicing; each intron lies between exons, starts with GT and ends with AG]
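The GT...AG signal can be used to list candidate introns in a sequence. The sketch below simply pairs every GT (donor) with every downstream AG (acceptor); real ab initio programs score these sites statistically. The function name `candidate_introns` and the `min_len` threshold are assumptions made for illustration.

```python
def candidate_introns(seq, min_len=4):
    """Return (start, end) spans that begin with GT and end with AG.

    start is the index of the G in GT; end is one past the G in AG.
    Spans shorter than min_len bases are discarded.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(d, a + 2) for d in donors for a in acceptors
            if a + 2 - d >= min_len]

introns = candidate_introns("AAGTCCCAGTT")
print(introns)  # [(2, 9)]
```

Here the single candidate is `GTCCCAG`. Because GT and AG are so short, most such candidates in real DNA are false positives, which is why sequence signals are combined with statistical models of coding content.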
CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC
GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG
Probabilistic models
• Statistical description of a gene.
• Markov Models & Hidden Markov Models.
• Used to distinguish oligonucleotide distributions in
coding regions from those in non-coding regions.
• In a Markov model of order k, the probability of a
nucleotide at a given position depends on the k
preceding nucleotides.
• Types of order: zero, first and second.
• The higher the order, the more accurately a gene can
be predicted.
• ZERO: each base occurs independently with a given
probability.
• FIRST: the occurrence of a base depends on the base
preceding it.
• SECOND: the two preceding bases determine which base
follows.
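The idea of an order-k model can be made concrete with a small sketch. It estimates P(base | k preceding bases) from training sequences and scores a query by the log-likelihood ratio between a coding and a non-coding model. The function names, the add-one pseudocount and the toy training data are assumptions for illustration; real gene finders train on large annotated sets and typically use higher orders and HMMs.

```python
import math
from collections import defaultdict

def train_markov(seqs, k, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(base | previous k bases) from sequences, with smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        s = s.upper()
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1

    def prob(context, base):
        ctx = counts.get(context, {})
        total = sum(ctx.values()) + pseudocount * len(alphabet)
        return (ctx.get(base, 0.0) + pseudocount) / total

    return prob

def log_likelihood(seq, prob, k):
    """Sum of log P(base | context) over positions that have k predecessors."""
    seq = seq.upper()
    return sum(math.log(prob(seq[i - k:i], seq[i])) for i in range(k, len(seq)))

def coding_score(seq, coding_prob, noncoding_prob, k):
    """Positive when the coding model explains seq better than the non-coding one."""
    return log_likelihood(seq, coding_prob, k) - log_likelihood(seq, noncoding_prob, k)

# Toy training data (hypothetical): k = 1 gives a first-order model.
coding = train_markov(["ATGATGATGATG"], k=1)
noncoding = train_markov(["AAAAAAAAAAAA"], k=1)
score = coding_score("ATGATG", coding, noncoding, k=1)
print(score > 0)  # True
```

Setting `k=0` gives the zero-order model, in which each base is drawn independently; `k=2` gives the second-order model.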
[Figure: non-coding sequences vs. coding sequence (typical and atypical)]
[Screenshot: program parameter "Extent of Splice Signal Window"]
CC = (Sn + Sp) / 2, where Sn is sensitivity and Sp is
specificity
ME = proportion of missed exons and missed genes
WE = proportion of wrongly predicted exons and wrong
genes
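These measures can be computed from simple counts. The slides do not define Sn and Sp, so the sketch below assumes the usual gene-prediction convention, Sn = TP / (TP + FN) and Sp = TP / (TP + FP), and takes CC as the mean of the two, as written above. ME is taken relative to the actual exons and WE relative to the predicted exons; all function names and the example counts are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of true features that were predicted (assumed definition)."""
    return tp / (tp + fn)

def specificity(tp, fp):
    """Fraction of predictions that are correct (assumed definition)."""
    return tp / (tp + fp)

def cc(sn, sp):
    """CC as given on the slide: the mean of Sn and Sp."""
    return (sn + sp) / 2

def missed_exons(n_missed, n_actual):
    """ME: actual exons with no overlapping prediction, as a proportion."""
    return n_missed / n_actual

def wrong_exons(n_wrong, n_predicted):
    """WE: predicted exons overlapping no actual exon, as a proportion."""
    return n_wrong / n_predicted

# Hypothetical counts: 80 correct exons, 20 missed, 20 wrongly predicted.
sn = sensitivity(tp=80, fn=20)
sp = specificity(tp=80, fp=20)
print(sn, sp, cc(sn, sp))      # 0.8 0.8 0.8
print(missed_exons(20, 100))   # 0.2
print(wrong_exons(20, 100))    # 0.2
```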
CONCLUSION
• Computational gene prediction is one of the most
important processes in genome and sequence analysis.
• Prediction is easier for prokaryotes because their
genes are not interrupted. HMM-based predictions
provide the best accuracy.
• Current algorithms are categorized as ab initio,
homology-based and consensus-based. Combining
statistical and homology information improves
gene-finding performance.
• With further advances in computational techniques,
gene prediction will become more feasible.
THANK YOU