
MOLECULAR BIOLOGY - Experimental – 2015/2

Class 8
Prof. Isabel Cristina Braga Rodrigues
isabelcbraga@ufsj.edu.br
CAP Block 2, Room 218

INTRODUCTION TO BIOINFORMATICS

1. Find the 16S rRNA gene sequences of the microorganisms listed below.

To do this, we will use the following databases:

http://www.ncbi.nlm.nih.gov/nuccore GenBank: the American database of DNA and protein sequences.

http://www.genome.jp/kegg/ KEGG: another database containing sequences from many organisms. It is better suited to searching for genes involved in metabolic pathways.

a. Acidithiobacillus ferrooxidans

Acidithiobacillus ferrooxidans ATCC 23270, complete genome


NCBI Reference Sequence: NC_011761.1
GenBank Graphics
>gi|218665024:2769785-2771330

TTCTTTAAAGGAGGTGATCCAGCCGCAGGTTCCCCTACGGCTACCTTGTTAC
GACTTCACCCCAGTCATGAACCATACCGTGGTAACCGCCCTCCCGAAGGTTA
GGCTAGCTGCTTCTGGTACAATCCACTCCCATGGTGTGACGGGCGGTGTGTA
CAAGGCCCGGGAACGTATTCACCGCGGCATGCTGATCCGCGATTACTAGCG
ATTCCGACTTCATGCAGTCGAGTTGCAGACTGCAATCCGAACTACGACGCG
CTTTCTGGGGTCTGCTCCACCTCGCGGCTTGGCTTCCCTCTGTACGCGCCATT
GTAGCACGTGTGTAGCCCTGGACATAAAGGCCATGAGGACTTGACGTCATC
CCCACCTTCCTCCGGTTTGTCACCGGCAGTCTCCCTAGAGTGCCCGGCCGAA
CCGCTGGCAACTAAGGACAAGGGTTGCGCTCGTTGCGGGACTTAACCCAAC
ATCTCACGACACGAGCTGACGACAGCCATGCAGCACCTGTGTTCCGATTCCC
CGAAGGGCACTTCCGCATCTCTGCAGAATTCCGGACATGTCAAGCCCAGGT
AAGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGG
GCCCCCGTCAATTCCTTTGAGTTTTAACCTTGCGGCCGTACTTCCCAGGCGG
AATACTTATCGCGTTAGCTACGACACTCAGTACGCTAGGCACCAAACATCTA
GTATTCATCGTTTAGGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCC
CCACGCTTTCGTGCCTCAGCGTCAGTATTGGGCCAGGTGGCCGCCTTCGCCA
CTGATGTTCCTCCAGATCTCTACGCATTTCACCGCTACACCTGGAATTCCAC
CACCCTCTCCCATACTCTAGTACACCGGTTTCCACCGCCATTCCCAGGTTGA
GCCCGGGGATTTCACGACAGACCTAACGTACCGCCTACGCACCCTTTACGCC
CAGTGATTCCGATTAACGCTTGCACCCCCCGTATTACCGCGGCTGCTGGCAC
GGAGTTAGCCGGTGCTTCTTCTTGGATTCACGTCAATAGCAGATTGTATTAG
AACCCACCTTTTCGTCCTCCACGAAAGGACTTTACAACCCGAAGGCCTTCTT
CATCCACGCGGCATTGCTTCGTCAGGGTTGCCCCCATTGCGAAAAATTCCCC
ACTGCTGCCTCCCGTAGGAGTCTGGGCCGTGTCTCAGTCCCAGTGTGGCTGG
TCGTCCTCTCAGACCAGCTACCGATCGTCGCCTTGGTGGGCCTTTACCCCGC
CAACTAGCTAATCGGACGTAGGCTCCTCTCTTAGCGCGAGGTCCGAAGATC
CCCCGCTTTCCCCCTCAGGGCTCATGCGGTATTAGCCCAAGTTTCCCTGGGT
TGTCCCCCACTAAAAGACAGATTCCTACGCATTACTCACCCGTCCGCCACTC
GTCAGCATCCGAAGACCTGTTACCGTTCGACTTGCATGTGTTAGGCATGCCG
CCAGCGTTCAATCTGAGCCAGGATCAAACTCTTAAGTTCAATC

b. Acidithiobacillus caldus

Acidithiobacillus caldus SM-1, complete genome


NCBI Reference Sequence: NC_015850.1
GenBank Graphics
>gi|340780744:2528355-2529900

TTTAAAGGAGGTGATCCAGCCGCAGGTTCCCCTACGGCTACCTTGTTACGAC
TTCACCCCAGTCATGAACCATACCGTGGTCGTCGCCCCCCCCGAAGGTTAGG
CTAACGGCTTCTGGTACCATCCACTCCCATGGTGTGACGGGCGGTGTGTACA
AGGCCCGGGAACGTATTCACCGCGGCATGCTGATCCGCGATTACTAGCGAT
TCCGACTTCATGCAGTCGAGTTGCAGACTGCAATCCGAACTACGACGCGCTT
TCTGGGGTCTGCTCCACCTCGCGGCTTGGCTTCCCTCTGTACGCGCCATTGT
AGCACGTGTGTAGCCCTGGACATAAAGGCCATGAGGACTTGACGTCATCCC
CACCTTCCTCCGGTTTGTCACCGGCAGTCTCCCTAGAGTGCCCGGCCGAACC
GCTGGCAACTAGGAACAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACAT
CTCACGACACGAGCTGACGACAGCCATGCAGCACCTGTGTTCCGATTCCCC
GAAGGGCACCCCCACATCTCTGCAGGGTTCCGGACATGTCAAGCCCAGGTA
AGGTTCTTCGCGTTGCATCGAATTAAACCACATGCTCCACCGCTTGTGCGGG
CCCCCGTCAATTCCTTTGAGTTTTAACCTTGCGGCCGTACTTCCCAGGCGGG
ATACTTATCGCGTTAGCTACGACACTCAGCACCTAAGGCGCCAAACATCCA
GTATCCATCGTTTAGGGCGTGGACTACCAGGGTATCTAATCCTGTTTGCTCC
CCACGCTTTCGCGCCTCAGCGTCAGTATTGGGCCAGGTGACCGCCTTCGCCA
CTGGTGTTCCTCCAGATCTCTACGCATTTCACCGCTACACCTGGAATTCCAT
CACCCTCTCCCATACTCCAGTCAGCCCGTTTCCACCGCCATTCCCAGGTTGA
GCCCGGGGATTTCACGGCAGACGTAACCCACCGCCTACGCGCCCTTTACG
CCCAGTAATTCCGATTAACGCTCGCACCCTCCGTATTACCGCGGCTGCTGGC
ACGGAGTTAGCCGGTGCTTCTTCTTGGGTTCACGTCAACAGCAGACCGTATT
CGGATCCGCCTTTTCGTCCCCCACGAAAGGACTTTACAACCCGAAGGCCTTC
TTCATCCACGCGGCATTGCTTCGTCAGGGTTGCCCCCATTGCGAAAAATTCC
CCACTGCTGCCTCCCGTAGGAGTCTGGGCCGTGTCTCAGTCCCAGTGTGGCT
GGTCGTCCTCTCAGACCAGCTACCGATCGTCGCCTTGGTAGGCCTTTACCCC
ACCAACTAGCTAATCGGACGTAGGCCCCTCCTTCAGCACGAGGTCCGAAGA
TCCCCCGCTTTCCCCCTCAGGGCTTATGCGGTATTAGCCCAAGTTTCCCTGG
GTTGTCCCCCACAAAAGGATAGGTTCCTACGCATTACTCACCCGTCCGCCAC
TCGCCAGCATCCCGAAGGACCTGCTGCCGTACGACTTGCATGTGTTAGGCAT
GCCGCCAGCGTTCAATCTGAGCCAGGATCAAACTCTCGCGTTCAATC

c. Leptospirillum ferrooxidans

Leptospirillum ferrooxidans C2-3 DNA, complete genome


NCBI Reference Sequence: NC_017094.1
GenBank Graphics
>gi|383783295:1058571-1060143

ACAAAATGAATCTGGAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGC
GTGCCTAACACATGCAAGTCCAACGTGAAAGGGGAGCAATCCCCCGGTAGG
GTGGCAAACGGGTGAGTAAGACATGGGTGATCTACCTTAGGGATGGGGATA
TCCTTCCGAAAGGAGGGGCAATACCGAATATTGTCCGGGACCATGAAGGGT
TCCGGGGAAAGGGAGGCCTCTGATACAAGCTTTCGCCTTAAGATGAGCCCA
TGGCCCATCAGCTAGTTGGTAGGGTAAAGGCCTACCAAGGCTACGACGGGT
CGCTGGTCTGAGAGGACGACCAGCCACACTGGCACTGAGATACGGGCCAGA
CTCCTACGGGAGGCAGCAGTGAGGAATATTGCGCAATGGGGGAAACCCTGA
CGCAGCAACGCCGCGTGTGGGAAGAAGGCCTTCGGGTCGTAAACCACTTTT
ACTCGGGACGAAAAAGGGATATCAAATAAATATCCCCGATGACGGTACCGT
GAGAATAAGCCACGGCTAACTCTGTGCCAGCAGCCGCGGTAAGACAGAGG
TGGCAAGCGTTGTTCGGAATTACTGGGCGTAAAGAGTCTGTAGGTGGTTTGT
CAAGTCTTTGGTGAAAGGCCGTAGCTTAACTATGGGAATGCCAAAGAGACT
GGCAGGCTGGAGGCTGGGAGAGGGAAGCGGAATTTCTGGTGTAGCGGTGA
AATGCGTAGATATCAGAAGGAAGGCCGGTGGCGAAGGCGGCTTCCTGGAAC
AGTCCTGACACTGAGAGACGAAAGCGTGGGGAGCAAACAGGATTAGATAC
CCTGGTAGTCCACGCCCTAAACGATGGGTACTAAGTGTGGAGGGGTTAAAC
CCTCCGTGCCGCAGCAAACGCAGTAAGTACCCCGCCTGGGGAGTACGGCCG
CAAGGTTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGTGC
ATGTGGTTTAATTCGACGCAACGCGAGGAACCTTACCTAGGCTTGACATGTG
GTCAGTAGCGAACCGAAAGGGGAGCGACCCGTCAAATCGGGCAATCACAC
AGGTGCTGCATGGCTGTCGTCAGCTCGTGCCGTGAGGTGTTGGGTTCAGTCC
CGCAACGAGCGCAACCCTCGCCCTTTGTTGCCATCGGGTAAAGCCGGGCAC
TCTAAGGGGACTGCCAGCGACAAGTTGGAGGAAGGAGAGGATGACGTCAA
GTCATCATGGCCTTTATGCCTAGGGCCACACACGTGCAACAATGGCCGGTA
CAGACGGAGGCAATGCCGAGAGGCGGAGCAAACCCGAGAAAACCGGTCCC
AGTTCGGATTGAGGTCTGCAACTCGACCTCATGAAGTCGGAATCGCTAGTA
ATCGCATATCAGAACGATGCGGTGAATACGTTCCCGGGCCTTGTACACACC
GCCCGTCACACCACGAAAGTTTGTTGTACCCGAAGTCGGTGCCTTAACCTCG
CAAGAGGAGAGAGCCGCCCAAGGTATGGCCGATGATTGGGGTGAAGTCGT
AACAAGGTAGCCGTAGGGGAACCTGCGGCTGGATCACCTCCTTTCTA

d. Sulfobacillus thermosulfidooxidans

Sulfobacillus thermosulfidooxidans strain N19-50-01 16S ribosomal RNA gene, partial sequence

GenBank: EU499919.1
GenBank Graphics
>gi|187234288|gb|EU499919.1

GTTTGATCATGGCTCAGGACGAACGCTGGCGGCGTGCGTAATACATGCAAG
TCGAGCGGACCTTCGGGTCAGCGGCGGACGGGTGAGGAACACGTGAGTGAT
CGGGCTGTGAGTGGGGGATATCGGGCCGAAAGGCGCGGCAATCCCGCATAC
GTTCCGGGAAACCGGAAGAAAGCTTGGCAACAGGCGCTCACAGGGGAGCT
CGCGGCCCATTAGCTAGTTGGGGGGGTAACGGCCTCCCAAGGCGACGATGG
GTAGCCGGCCTGAGAGGGTGAACGGCCACACTGGGACTGAGACACGGCCC
AGACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCACAATGGGCGCAAGCC
TGATGGAGCAACGCCGCGTGAGTGAAGACGGCCTTCGGGTTGTAAAGCTCT
GTCTGTCGGGACGAAGACCGGCCCGGAAGGGCCGGGGAGCCGGTACCGAC
GGAGGAAGCCCCTGCAAACTACGTGCCAGCAGCCGCGGTAAGACGTAGGG
GGCAAGCGTTGTCCGGAATTACTGGGCGTAAAGGGCGTGTAGGCGGTGCGA
TACGTAGCGGTTTTAAGCCTCCGGCTCACCCGGAGGAGGGCGGCTAAACGG
TCGCGCTAGAGGGCAGGAGAGGTGCGTGGAATTCCTGGTGGAGCGGTGAAA
TGCGTAGAGATCAGGAAGAACACCCGTGGCGAAGGCGGCGCACTGGCCTG
GCCCTGACGCTGAGGCGCGACAGCGTGGGGAGCGAACGGGATTAGATACCC
CGGTAGTCCACGCCGTAAACGATGGGTACTAGGTGTCGCCCGGGTCCACCG
GGCGGTGCCGGAGCTAACGCACTAAGTACCCCGCCTGGGGAGTACGGCCGC
AAGGTTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCAGTGGAGCA
TGTGGTTTAATTCGACGCAACGCGCAGAACCTTACCAGGACTGGACACGCT
CGTGAGCGCCGCGAAAGCGGCGGGCCCTTCGGGGAGCGAGCGCAGGTGCT
GCATGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACG
AGCGCAACCCTTGTCGTGTGTTGCCAGCGGTTCGGCCGGGCACTCACACGA
GACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAATCCGCAT
GGCCTTGATGTTCTGGGCTACACACGTGCTACAATGGTCCCGACAACGGGA
TGCGACGGCGCGAGCCGGAGCCAATCCTTCAAACGGGATCTCAGTTCGGAT
TGCAGGCTGCAACTCGCCTGCATGAAGCCGGAATTGCTAGTAATCGCCCAT
CAGCATGGGGCGGTGAATTCGTTCCCGGGCCTTGTACACACCGCCCGTCAC
ACCACGAGAGTCGGCCACACCCGAAGCCGGGCGATCCAACCGCATCCGCGG
AGGGTCCCGTCGACGGTGGGGTCGGTGATTGGGGTGAAGTCGTAACAAG
GTAGCCGTA

e. Acidithiobacillus thiooxidans

Acidithiobacillus thiooxidans strain JJU-1 16S ribosomal RNA gene, partial sequence

GenBank: KM101109.1
GenBank Graphics
>gi|693261107|gb|KM101109.1|

TGCAGTCGAACGGTAACAGCGTCTTCGGATGCTGACGAGTGGCGGACGGGT
GAGTAATGCGTAGGAATCTGTCTTTGAGTGGGGGACAACCCAGGGAAACTT
GGGCTAATACCGCATAAGCCCTGAGGGGGAAAGCGGGGGATCTTCGGACCT
CGCGCTGGAAGAGGAGCCTACGTCTGATTAGCTAGTTGGTAGGGTAAAGGC
CTACCAAGGCGACGATCGGTAGCTGGTCTGAGAGGACGACCAGCCACACTG
GGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATTTTT
CGCAATGGGGGCAACCCTGACGAAGCAATGCCGCGTGAATGAAGAAGGCC
TTCGGGTTGTAAAGTTCTTTCGTGGAGGACGAAAAGGTGGGTGCTAATAAC
GCCTGCTGTTGACGTGAATCCAAGAAGAAGCACCGGCTAACTCCGTGCCAG
CAGCCGCGGTAATACGGGGGGTGCAAGCGTTAATCGGAATCACTGGGCGTA
AAGGGTGCGTAGGCGGTGCATTAGGTCTGTCGTGAAATCCCCGGGCTCAAC
CTGGGAATGGCGGTGGAAACCGGTGTACTAGAGTATGGGAGAGGGTGGTG
GAATTCCAGGTGTAGCGGTGAAATGCGTAGAGATCTGGAGGAACATCAGTG
GCGAAGGCGGCCACCTGGCCCAATACTGACGCTGAGGCACGAAAGCGTGG
GGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCCTAAACGATGAATA
CTAGATGTTTGGTGCCAAGCGTACTGAGTGTCGTAGCTAACGCGATAAGTAT
TCCGCCTGGGAAGTACGGCCGCAAGGTTAAAACTCAAAGGAATTGACGGGG
GCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAAC
CTTACCTGGGCTTGACATGTCTGGAATCCTGCAGAGATGCGGGAGTGCCCTT
CGGGGAATCAGAACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGA
GATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTCCTTAGTTGCCAGC
GGTTCGGCCGGCCACTCTAGGGAGACTGCCGGTGACAAACCGGAGGAAG
GTGGGGATGACGTCAAGTCCTCATGGCCTTTATGTCCAGGGCTACACACGTG
CTACAATGGCGCGTACAGAGGGAAGCCAAGCCGCGAGGTGGAGCAGACCC
CAGAAAGCGCGTCGTAGTTCGGATTGCAGTCTGCAACTCGACTGCATGAAG
TCGGAATCGCTAGTAATCGCGGATCAGCATGCCGCGGTGAATACGTTCCCG
GGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGATTGTACCAGAAGC
CGTTAGCCTAACCTTCGGGAGGGCGATGACCACGGTATGGTTCATGA

f. Leptospirillum ferriphilum

Leptospirillum ferriphilum strain UBK03 16S ribosomal RNA gene, partial sequence

GenBank: DQ534052.1
GenBank Graphics
>gi|106635983|gb|DQ534052.1|

AGAGTTTGATCGTGGCTCAGAACGAACGCTGGCGGCGTGCCTAACACATGC
AAGTCCGACGTGAAAGGGGAGCAATCCCCCGGTAGGGTGGCAAACGGGTG
AGTAAGACATGGGTGATCTGCCCTGGAGATGGGGATATCCCTCCGAAAGGG
GGGGCAATACCGAATAGTATCCGGTTCCGTGAAGGGGGCCGGGGAAAGGG
AGGCCTCTGGTACAAGCTTCCGCTCCTGGATGAGCCCATGGCCCATCAGCTA
GTTGGTAGGGTAAAGGCCTACCAAGGCGACGACGGGTAGCTGGTCTGAGAG
GACAACCAGCCACACTGGCACTGAGACACGGGCCAGACTCCTACGGGAGGC
AGCAGTGAGGAATATTGCGCAATGGGGGCAACCCTGACGCAGCAACGCCGC
GTGTGGGAAGAAGGCTTTCGGGTTGTAAACCACTTTTGCCCGGGACGAAAG
GGGGGACCTGAATAAGGTTGCCCGATGACGGTACCGGGAGAATAAGCCAC
GGCTAACTCTGTGCCAGCAGCCGCGGTAAGACAGAGGTGGCAAGCGTTGTT
CGGAGTTACTGGGCGTAAAGAGTCTGTAGGTGGTCTGTCAAGTCTTTGGTGA
AAGGCCGTGGCTTAACCATGGGAATGCCAAAGAGACTGGCAGACTGGAGG
CTGGGAGAGGGAAGCGGAATTTCTGGTGTAGCGGTGAAATGCGTAGATATC
AGAAGGAAGGCCGGTGGCGAAGGCGGCTTCCTGGAACAGACCTGACACTG
AGAGACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCAC
GCCCTAAACGATGGGTACTAAGTGTGGGAGGGTTAAACCTCCCGTGCCGCA
GCCAACGCAGTAAGTACCCCGCCTGGGGAGTACGGCCGCAAGGTTGAAACT
CAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGTGCATGTGGTTTAATTC
GACGCAACGCGAAGAACCTTACCTGGGCTTGACATGCCGCGAGTAGGAAAC
CGAAAGGGGACCGACCGGTTCAGTCCGGAAGCGGAACAGGTGCTGCATGG
CTGTCGTCAGCTCGTGCCGTGAGGTGTTGGGTTCAGTCCCGCAACGAGCGCA
ACCCTCGCCCTCTGTTGCCACCGGGTCATGCCGGGCACTCTGAGGGGACTGC
CAGCGACAAGTTGGAGGAAGGAGAGGATGACGTCAAGTCATCATGGCCCTT
ATGCCCAGGGCCACACACGTGCAACAATGGCCGGTACAGACGGAAGCAAG
ACCGAGAGGTGGAGCAAATCCGAGAAAGCCGGTCCCAGTTCGGATTGAGGT
CTGCAACTCGACCTCATGAAGTCGGAATCGCTAGTAATCGCGTATCAGCAC
GACGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCACG
AAAGTCTGTTGTACCTGAAGTCGGTGCCCCAACCGGAAACGGAGGGAGCCG
CCCAAGGTATGGCCGGTAATTGGGGTGAAGTCGTAACAAGGTACCCG

g. Desulfovibrio vulgaris

Desulfovibrio vulgaris strain RL 16S ribosomal RNA gene, partial sequence

GenBank: KC462187.1
GenBank Graphics
>gi|478246069|gb|KC462187.1|

CGGGTGCTATACATGCAGTCGAGCGAATGGATTAAGAGCTTGCTCTTATGA
AGTTAGCGGCGGACGGGTGAGTAACACGTGGGTAACCTGCCCATAAGACTG
GGATAACTCCGGGAAACCGGGGCTAATACCGGATAACATTTTGAACCGCAT
GGTTCGAAATTGAAAGGCGGCTTCGGCTGTCACTTATGGATGGACCCGCGT
CGCATTAGCTAGTTGGTGAGGTAACGGCTCACCAAGGCAACGATGCGTAGC
CGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTC
CTACGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGACGAAAGTCTGACGG
AGCAACGCCGCGTGAGTGATGAAGGCTTTCGGGTCGTAAAACTCTGTTGTT
AGGGAAGAACAAGTGCTAGTTGAATAAGCTGGCACCTTGACGGTACCTAAC
CAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGG
CAAGCGTTATCCGGAATTATTGGGCGTAAAGCGCGCGCAGGTGGTTTCTTA
AGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACTGG
GAGACTTGAGTGCAGAAGAGGAAAGTGGAATTCCATGTGTAGCGGTGAAAT
GCGTAGAGATATGGAGGAACACCAGTGGCGAAGGCGACTTTCTGGTCTGTA
ACTGACACTGAGGCGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCT
GGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGAGGGTTTCCGCCC
TTTAGTGCTGAAGTTAACGCATTAAGCACTCCGCCTGGGGAGTACGGCCGC
AAGGCTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCA
TGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGGTCTTGACATCCTC
TGACAACCCTAGAGATAGGGCTTCTCCTTCGGGAGCAGAGTGACAGGTGGT
GCATGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACG
AGCGCAACCCTTGATCTTAGTTGCCATCATTTAGTTGGGCACTCTAAGGTGA
CTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAATCATCATGCC
CCTTATGACCTGGGCTACACACGTGCTACAATGGACGGTACAAAGAGCTGC
AAGACCGCGAGGTGGAGCTAATCTCATAAAACCGTTCTCAGTTCGGATTGT
AGGCTGCAACTCGCCTACATGAAGCTGGAATCGCTAGTAATCGCGGATCAG
CATGCCGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCA
CGAGAGTTTGTAACACCCGAAGTCGGTGGGGTAACCTTTTGGAGCCAGCCG
CATAAGGTGACAGAGGGGG

h. Acidimicrobium ferrooxidans

Acidimicrobium ferrooxidans strain DSM 10331 16S ribosomal RNA gene, complete
sequence

NCBI Reference Sequence: NR_027584.1


GenBank Graphics
>gi|228719712|ref|NR_027584.1|

AGAGTTTGATCATGGCTCAGGACGAACGCTGGCGGCGTGCCTAACACATGC
AAGTCGTACGCGGTGGCTTGCCACCGAGTGGCGAACGGGTGCGTAACACGT
GAGGAACCCACCCCGACGTGGGGGATAACACCGGGAAACCGGTGCTAATA
CCGCATGTGCTCCCCTGACCGCATGGTCGAGGGAGCAAAGCCTTCGGGCGC
GACGGGACGGCCTCGCGGCCTATCAGCTTGTTGGTGGGGTAACGGCCCACC
AAGGCGACACGGGTAGCTGGTCTGAGAGGACGATCAGCCACACTGGGACTG
AGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCGCAAT
GGGCGAAAGCCTGACGCAGGAACGCCGCGTGGAGGACGAAGGCCTTCGGG
TTGTAAACTCCTTTCAGCAGGGACGAAACTGACGGTACCTGCAGAAGAAGC
CCCGGCTAACTACGTGCCAGCAACCGCGGTAAGACGTAGGGGGCGAGCGTT
GTCCGGATTTACTGGGCGTAAAGAGCTCGTAGGCGGCTTGGCAAGTCGGAT
GTGAAATCACCAGGCTCAACCTGGTGTCGCCATCCGATACTCCATGGCTTGA
GTCCGGTAGAGGATCGTGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATA
TCAGGAGGAACACCAATGTCGAAGGCAGCGATCTGGGCCGGTACTGACGCT
GAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCA
CGCCCTAAACGTTGGGCACTAGGTGTGGGGCCTCATTCGACGGGCTCCGTG
CCGACGCTAACGCATTAAGTCCCCGCCTGGGGAGTACGGCCGCAAGGCTAA
AACTCAAAGGAATTGACGGGGGCCCGCACAAACGGCGGAGCATCGGCTTA
ATTCGATGCAACGCGAAGAACCTCACCTGGGCTTGACATGGAGGGAAAAGC
CGCAGAGATGCGGTGTCCTTCGGGTCCCTTGCACAGGTGGTGCATGGCTGTC
GTCAGCTCGAGTCGTGAGATGTTGGGTAAGTCCCGCAACGAGCGCAACCCT
TGCCCTATGTTGCCACGGGTCATGCCGGGGACTCGTAGGGGACTGCCGGAG
TTAATTCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGCCCCTTACGTC
CAGGGCTGCACGCATGCTACAATGGCCGGTACAAAGGGTCGCCAACCCGCG
AGGGGGAGCCAATCCCAAAAACCGGTCTCAGTTCGGATCGCAGTCTGCAAC
TCGACTGCGTGAAGTCGGAGTCGCTAGTAATCCCGGATCAGCACGCCGGGG
TGAATACGTTCCCGGGCTAGTACACACCGCCCGTCACACCACGAAAGTCGG
CAACACCCGAAGCGGTGGCCCAACCCGCAAGGGAGGGAGCCGTCGAAGGT
GGGGTCGGCGATTGGGGTGAAGTCGTAACAAGGTAGCCGT
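
The records in item 1 can also be requested programmatically. The sketch below only builds the request URL for the NCBI E-utilities `efetch` endpoint and does not contact the server. The helper name `efetch_fasta_url` is hypothetical; the coordinates are those shown in the FASTA header of entry a.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for fetching records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, start=None, stop=None):
    """Build an efetch URL that returns a nuccore record as FASTA text.

    start/stop are optional 1-based coordinates for a sub-region.
    (Hypothetical helper, written for illustration only.)
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    if start is not None and stop is not None:
        params["seq_start"] = start
        params["seq_stop"] = stop
    return EFETCH + "?" + urlencode(params)


# Region of the A. ferrooxidans genome listed under entry a.
url = efetch_fasta_url("NC_011761.1", 2769785, 2771330)
print(url)
```

Opening the printed URL in a browser, or passing it to `urllib.request.urlopen`, should return the same region in FASTA format.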

2. Align the genes found and build the phylogenetic tree using the program: http://www.ebi.ac.uk/Tools/msa/clustalw2/
Briefly discuss what information this tool provides (maximum 2000 characters, including spaces).

This tool aligns the sequences and, from the alignment, provides information that is essential in molecular biology, including the phylogeny of the organisms, which lets us relate them to one another and identify their genetic similarities.
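
One point worth checking before alignment is strand orientation. Entries a and b in item 1 were cut from genome coordinates and end with the reverse complement of the conserved 16S 5' motif that entries c, f and h start with, so an aligner that does not reorient its input would compare them on the wrong strand. The sketch below is one possible check; the motif choice and the function names are assumptions made for illustration.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Conserved motif near the 5' end of the 16S gene, as seen at the
# start of entries c, f and h (assumed here as an orientation marker).
FORWARD_MOTIF = "AGAGTTTGATC"


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orient_16s(seq):
    """Return seq in the orientation that contains FORWARD_MOTIF.

    If the motif is found on neither strand (for example in a partial
    sequence that starts downstream of it), seq is returned unchanged.
    """
    if FORWARD_MOTIF in seq:
        return seq
    rc = reverse_complement(seq)
    if FORWARD_MOTIF in rc:
        return rc
    return seq


# Last line of entry a and first line of entry c, copied from item 1.
tail_a = "CCAGCGTTCAATCTGAGCCAGGATCAAACTCTTAAGTTCAATC"
head_c = "ACAAAATGAATCTGGAGAGTTTGATCCTGGCTCAGAACGAACGCTGGCGGC"

print(FORWARD_MOTIF in tail_a)              # motif absent as written
print(FORWARD_MOTIF in orient_16s(tail_a))  # present after reorienting
print(orient_16s(head_c) == head_c)         # already forward
```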

3. Given the unknown sequences below, use the program http://blast.ncbi.nlm.nih.gov/Blast.cgi to find the most likely organism and gene.

SEQUENCE 1:
ATGAAACCTGTGAAAACGGGAACGGTTCATCCCGTTCCTTCAGCTGCGAAA
CAATCAGGCTGGCGAGATCTGTTTTATTCAAAAAAAGCGGCGCCCTATCTGT
TTACAGCGCCATTCGTTTTATCCTTTCTCGTATTTTTTCTATACCCCATCATTA
GTGTCTTCATCATGAGCTTCCAAAGAATTTTGCCGGGAGAGGTGTCCTTTGT
CGGATTGTCTAATTATACAGCGCTAAACAACCCGACGTTCTATACCGCCCTT
TGGAATACGCTGGAATACACCTTTTGGACGCTGATCGTGCTGATTCCTGTTC
CATTGCTTCTGGCCATATTCCTGAATTCAAAGCTGGTCAAATTTAGAAATAT
ATTTAAATCAGCATTATTTATCCCGGCATTGACCTCAACCATTGTCGCGGGG
ATCATTTTTCGGCTGATCTTCGGAGAAATGGAAACGTCTCTGGCCAATTCCA
TCCTACTTAAACTCGGCTTTTCACCTCAGAACTGGATGAACAATGAACATAC
CGGCATGTTTTTGATGGTGCTGCTTGCTTCATGGAAATGGATGGGAATCAAC
ATCCTTTACTTTTTAGCAGGTTTGCAAAATGTGCCGAAAGAGCTGTACGAAG
CCGCTGATATAGACGGCGCGAATACAATGAAAAAATTTCTGCACATCACGC
TGCCGTTTCTCAAGCCTGTAACCGTATATGTGCTGACCATCAGCATCATCGG
CGGCTTCAGGATGTTTGAGGAAAGCTACGTCCTTTGGCAGAATAATTCCCCG
GGTAATATTGGTCTGACGCTTGTCGGATATTTGTATCAGCAGGGACTTGCCT

B.subtilis DNA for araABDLMNPQ-abfA operon

0.0 100%

SEQUENCE 2:
GAAACATTAACAAATCTAAAACAGTCTTAATTCTATCTTGAGAAAGTATTGG
TAATAATATTATTGTCGATAACGCGAGCATAATAAACGGCTCTGATTAAATT
CTGAAGTTTGTTAGATACAATGATTTCGTTCGAAGGAACTACAAAATAAATT
ATAAGGAGGCACTCAAAATGAGTACAAAAGATTTTAACTTGGATTTGGTAT
CTGTTTCGAAGAAAGATTCAGGTGCATCACCACGCATTACAAGTATTTCGCT
ATGTACACCCGGTTGTAAAACAGGAGCTCTGATGGGTTGTAACATGAAAAC
AGCAACTTGTCATTGTAGTATTCACGTAAGCAAATAACCAAATCAAAGGAT
AGTATTTTGTTAGTTCAGACATGGATACTATCCTATTTTTATAAGTTATTTAG
GGTTGCTAAATAGCTTATAAAAATAAAGAGAGGAAAAAACATGATAAAAA
GTTCATTTAAAGCTCAACCGTTTTTAGTAAGAAATACAATTTTATCTCCAAA
CGATAAACGGAGTTTTACTGAATATACTCAAGTCATTGAGACTGTAAGTAA
AAATAAAGTTTTTTTGGAACAGTTACTACTAGCTAATCCTAAACTCTATGAT
GTTATGCAGAAATATAATGCTGGTCTGTTAAAGAAGAAAAGGGTTAAAAAA
TTATTTGAATCTATTTACAAGTATTATAAGAGAAGTTATTTACGATCAACTC
CATTTGGATTATTTAGTGAAACTTCAATTGGTGTTTTTTCGAAAAGTTCACA
GTACAAGTTAATGGGAAAGACTACAAAGGGTATAAGATTGGATACTCAGTG
GTTGATTCGCCTAGTTCATAAAATGGAAGTAGATTTCTCAAAAAAGTTATCA
TTTACTAGAAATAATGCAAATTATAAGTTTGGAGATCGAGTTTTTCAAGTTT
ATACCATAAATAGTAGTGAGCTTGAAGAAGTAAATATTAAATATACGAATG

Lactococcus lactis subsp. lactis CV56, complete genome


0.0 100%

SEQUENCE 3:
ATGGCACAAGTCATTAATACAAACAGCCTGTCGCTGTTGACCCAGAATAAC
CTGAACAAATCCCAGTCCGCTCTGGGCACCGCTATCGAGCGTCTGTCTTCCG
GTCTGCGTATCAACAGCGCGAAAGACGATGCGGCAGGTCAGGCAATTGCTA
ACCGTTTCACCGCGAACATCAAAGGTCTGACTCAGGCTTCCCGTAACGCTAA
CGACGGTATCTCCATTGCGCAGACCACTGAAGGCGCGCTGAACGAAATCAA
CAACAACCTGCAGCGTGTGCGTGAACTGGCGGTTCAGTCTGCTAACAGCAC
CAACTCCCAGTCTGACCTCGACTCCATCCAGGCTGAAATCACCCAGCGCCTG
AACGAAATCGACCGTGTATCCGGTCAGACTCAGTTCAACGGCGTGAAAGTC
CTGGCGCAGGACAACACCCTGACCATCCAGGTTGGTGCCAACAACGGTGAA
ACCATTGATATCGATCTGAAACAGATCAACTCTCAGACCCTGGGTCTGGATA
CGCTGAATGTGCAGAAAAAATATGATGTGAAGAGCGAAGCGGTCACGCCTT
CGGCTACATTAAGCACTACTGCACTTGATGGTGCTGGCCTCAAAACCGGAA
CCGGTTCTACAACTGATACTGGTTCAATTAAGGATGGTAAGGTTTACTATAA
CAGCACCTCTAAAAATTATTATGTTGAAGTAGAATTTACCGATGCGACCGAT
CAAACCAACAAAGGCGGATTCTATAAAGTTAATGTTGCTGATGATGGTGC
AGTCACAATGACTGCGGCTACCACCAAAGAGGCTACAACTCCTACAGGTAT
TACTGAAGTTACTCAAGTCCAAAAACCTGTGGCTGCTCCAGCTGCTATCCAG
GCTCAGTTGACTGCTGCCCATGTGACCGGCGCTGATACTGCTGAAATGGTTA
AGATGTCTTATACGGATAAAAACGGTAAGACTATTGATGGCGGTTTCGGTG
TTAAAGTTGGGGCTGATATTTATGCTGCAACAAAAAATAAAGATGGATCGT
TCAGCATTAACACCACTGAATATACCGATAAAGACGGCAACACTAAAACTG
CACTAAACCAACTGGGTGGCGCAGACGGTAAAACTGAAGTTGTTTCTATCG
ACGGTAAAACCTACAATGCCAGCAAAGCCGCTGGTCACAACTTTAAAGCAC
AGCCAGAGCTGGCTGAAGCGGCTGCTGCAACCACCGAAAACCCGCTGGCTA
AAATTGATGCCGCGCTGGCGCAGGTTGATGCGCTGCGTTCTGACTTGGGTGC
GGTTCAGAACCGTTTCAACTCCGCTATCACCAACCTGGGCAATACCGTAAAT
AACCTGTCTTCTGCCCGTAGCCGTATCGAAGATTCCGACTACGCGACCGAAG
TTTCCAACATGTCTCGCGCGCAGATCCTGCAGCAGGCCGGTACCTCCGTTCT
GGCGCAGGCGAACCAGGTTCCGCAAAACGTCCTCTCTTTACTGCGTTAA

Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150, complete
genome
0.0 100%

SEQUENCE 4:
CTTCGGATGCTGACGAGTGGCGGACGGGTGAGTAATGCGTAGGAATCTGTC
TTTTAGTGGGGGACAACCCAGGGAAACTTGGGCTAATACCGCATGAGCCCT
GAGGGGGAAAGCGGGGGATCTTCGGACCTCGCGCTAAGAGAGGAGCCTAC
GTCCGATTAGCTAGTTGGCGGGGTAAAGGCCCACCAAGGCGACGATCGGTA
GCTGGTCTGAGAGGACGACCAGCCACACTGGGACTGAGACACGGCCCAGAC
TCCTACGGGAGGCAGCAGTGGGGAATTTTTCGCAATGGGGGCAACCCTGAC
GAAGCAATGCCGCGTGGATGAAGAAGGCCTTCGGGTTGTAAAGTCCTTTCG
TGGAGGACGAAAAGGTGGGTTCTAATACAATCTGCTATTGACGTGAATCCA
AGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGGGGGT
GCAAGCGTTAATCGGAATCACTGGGCGTAAAG

Acidithiobacillus sp. LLS-1 16S ribosomal RNA gene, partial sequence


0.0 100%
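
The searches in item 3 were run through the BLAST web form. As a rough sketch, the same kind of query could be prepared for the NCBI BLAST URL API, which takes a `CMD=Put` request and later a `CMD=Get` request with the returned request ID. The code only builds the requests and sends nothing; the function names and the default database `nt` are assumptions for illustration.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def blast_put_request(sequence, program="blastn", database="nt"):
    """Return (url, body) for submitting a query sequence.

    Whitespace and line breaks from a pasted sequence are removed.
    (Hypothetical helper; it does not send the request.)
    """
    query = "".join(sequence.split()).upper()
    body = urlencode({
        "CMD": "Put",
        "PROGRAM": program,
        "DATABASE": database,
        "QUERY": query,
    })
    return BLAST_URL, body.encode("ascii")


def blast_get_url(rid, format_type="Text"):
    """Return the URL for retrieving results of a submitted request ID."""
    return BLAST_URL + "?" + urlencode({
        "CMD": "Get",
        "RID": rid,
        "FORMAT_TYPE": format_type,
    })


# First bases of SEQUENCE 4, pasted with a line break as in the report.
put_url, put_body = blast_put_request("CTTCGGATGCTGACGAGTGG\nCGGACGGGTGAG")
print(put_url)
print(put_body.decode("ascii"))
```

The request ID passed to `blast_get_url` comes from the server's reply to the `Put` request, so it cannot be known in advance.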

4. Choose one of the sequences from item 1 and design a primer pair using the program: http://simgene.com/Primer3. Do not forget to define the ideal primer parameters: annealing temperature, primer length, size of the fragment to be amplified, and GC content. Check the primers with the program: http://blast.ncbi.nlm.nih.gov/Blast.cgi.

The primer pair below was designed with Primer3, which is simpler but still able to provide the necessary information. The primer parameters were set to a length between 18 and 25 nucleotides and a GC content between 20 and 80%. Submitting the job returned the following primer pair:

BEST: running

LEFT PRIMER TCTGAAACAGATCAACTCTC


RIGHT PRIMER AATTGAACCAGTATCAGTTG
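
The two primers can be checked against the stated limits (18 to 25 nucleotides, 20 to 80% GC) with a few lines of code. The melting temperature below uses the simple Wallace rule, 2 °C per A/T and 4 °C per G/C. This is only a rough approximation chosen for illustration and is not the calculation Primer3 performs; the function names are hypothetical.

```python
def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature (deg C) by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def within_limits(primer, min_len=18, max_len=25, min_gc=20.0, max_gc=80.0):
    """Check the length and GC limits stated in the answer to item 4."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_percent(primer) <= max_gc)


left = "TCTGAAACAGATCAACTCTC"
right = "AATTGAACCAGTATCAGTTG"

for name, primer in (("LEFT", left), ("RIGHT", right)):
    print(name, len(primer), gc_percent(primer),
          wallace_tm(primer), within_limits(primer))
```

Both primers are 20 nucleotides long, with 40% and 35% GC, so they fall inside the limits given above.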

5. Following the list below, go to http://www.ncbi.nlm.nih.gov/nuccore and obtain the gene sequence of one of the listed operons in FASTA format. Identify which microorganism the sequence belongs to and whether it is a Gram-negative or Gram-positive prokaryote or a eukaryote.

a. X66059 – Group 4

Klebsiella pneumoniae – Gram-negative bacterium

>gi|43937|emb|X66059.1| K.pneumoniae genes sorC, sorD, sorF, sorB, sorA, sorM and sorE

CTCGCACCTGCGACGGCTCTGGCGGTTTAACCCCTGCACAAGGGTGATGCG
GAAAACGAACGGCGCTTCTTTCTTTAACCTTACATTTTTTGGTCACGGTCCC
GAGAATCGGTACCCCCGCGCCCATGCCGCGAATGAACTCATCGGTCACCGG
CTTATCGACCGTGACCAGATATTCTTTTCATGGTGCTTGCCGGCGCGCAGGA
TCTTGTTCACCAGATCGCCATGGTTGGTGAGGAAGATCAGCCCCTGAGAATC
TTTATCCAGACGGCCAATAGGGAAGATACGCTTGCTGTGGTTGACGAAATC
GACAATGTTGTCGCGCTCGCCGTCTTCCGTGGTGCTCACAATGCCGACCGGC
TTATTCAGCGCAATCAGCACCAGATCATCGCTTCACGCGGCTCAATCAACCG
TCCGTTAACTTTGACTAAATCGCCAGGCTCACCTGATCGGCAATGGTGGCGG
CTTGCCATTGATAAAGAGCTTGCCTTGTTCAATAAAGCGGTCGCTTCGCGAC
GCGAGGCGATAGCGCTTCGCTGTGTATTTATTTAATCGGGTTGATGAGTCGG
GCAGCATAGATTCTCCTGTAAAAGCGAAATATACCCTACCTTGTGGGCGAG
AAAAAAGATGTGCCTCTGCTCCGGCGCAAGGTGTGATCGACTACGCCTTTTT
TCACAAAAGTGTCCCGCTCAACGCCTGGCAACTTTGCATGATGAAATATCAC
AGAATCAGCGATTTAGCGTGCAATGCACAAAGGTGCCAACAACCCG
CTATTGATAACGCAAATGTGCAAGATTATTCGCCAGATTAGGGTGATGAAA
TAAAAGAAAATCGCTTTAAACGGTGGCTCTAATAGGGTTATATCAGATCAA
CAGCACAAACCCGCGATTTGCACAAATGTGCAGGAGCCTGCAGATGACCAT
GGAAAACAGTGACGATATCCGGTTGATTGTCAAAATCGCCCAGCTCTATTA
CGAGCAGGATATGACTCAGGCGCAAATTGCTCGCGAGCTGGGGATCTATCG
CACCACCATCAGTCGCCTGCTGAAGCGCGGCCGCGAGCAGGGCATTGTTAC
CATCGCCATCAACTATGACTACAATGAAAACCTGTGGCTCGAGCAACAGCT
CAAGCAAAAATTTGGCCTGAAAGAGGCGGTGGTTGCCAGCAGCGATGGGCT
GCTGGAAGAAGAACAGCTGAGTGCGATGGGCCAGCATGGCGCCCTGCTTGT
CGATCGGTTGCTGGAGCCAGGCGATATCATCGGTTTTTCATGGGGCCGCGCC
GTGCGTTCGCTGGTGGAGAACCTGCCGCAGCGCAGCCAGTCACGCCAGG
TGATCTGCGTCCCCATCATCGGTGGACCTTCCGGTAAACTGGAGAGCCGCTA
CCATGTGAACACCTTAACCTACGGCGCGGCAGCCAGACTGAAAGCGGAATC
CCACCTTGCCGATTTTCCAGCCCTGCTGGATAACCCGCTGATCCGCAACGGC
ATCATGCAGTCCCAGCACTTTAAAACCATCTCATCCTACTGGGACAGCCTGG
ATGTGGCGCTGGTGGGTATTGGTTCACCGGCCATTCGCGACGGCGCAAACT
GGCACGCCTTCTACGGCAGCGAAGAGAGCGACGATCTCAACGCCCGCCACG
TCGCCGGGGATATCTGTTCGCGTTTCTACGATATTAACGGCGGGTTAGTCGA
TACCAATATGAGCGAAAAAACCCTGTCGATCGAAATGGCGAAGCTGCGCCA
GGCTCGCTATTCCATCGGCATCGCCATGGGGGAAGAGAAATACTCTGGCAT
TCTTGGCGCATTGCACGGACGCTATATTAATTGTCTGGTGACAAACAGAGA
AACGGCTGAGTTATTACTGAAATAACACACAGGATATGATTTCACGCAGCA
CCCGCTGCGGGGATCCCTTTATCTAAAGAAATGGGAGTGAATAATGAATAC
CTGGTTAAATTTAAAAGATAACGTCATTATCGTGACCGGCGGCGCCTCGGG
AATTGGGCTGGCCATTGTCGATGAATTATTATCACAAGGTGCTCATGTCCAG
ATGATTGATATTCATGGCGGCGATCGTCATCACAATGGCGATAATTATCAC
TTCTGGTCGACGGATATTTCCAGCGCGACAGAGGTGCAACAGACTATCGAT
GCCATTATTCAGCGCTGGTCGCGTATTGATGGTTTGGTCAATAACGCTGGCG
TGAATTTCCCGCGTTTATTAGTCGACGAAAAAGCACCGGCCGGCCGCTATG
AATTAAACGAAGCCGCTTTTGAAAAAATGGTCAATATCAACCAGAAAGGGG
TGTTTTTCATGTCGCAGGCGGTGGCGCGTCAAATGGTCAAACAGCGCGCCG
GTGTGATTGTCAATGTCTCTTCGGAGAGCGGCCTGGAAGGCTCTGAAGGTC
AAAGCTGCTACGCCGCGACCAAGGCCGCGCTCAACAGCTTTACCCGCTCCT
GGTCCAAAGAATTGGGTAAATATGGGATCCGCGTGGTCGGCGTTGCGCCGG
GGATCCTCGAAAAAACCGGTCTGCGGACACCGGAATATGAAGAGGCGCTGG
CCTGGACGCGCAATATCACCGTCGAGCAACTTCGCGAGGGATATACCAAAA
ACGCCATTCCCATCGGGCGGGCAGGAAAACTCTCAGAAGTCGCTGATTTTG
TTTGCTATCTCTTGTCAGCGCGCGCCAGCTACATCACCGGAGTCACCACTAA
CATTGCCGGCGGAAAAACGCGCGGTTAAGGAGGCAGTATGGTTCACGCTAT
CTTCTGCGCCCACGGCCAGCTGGCCGGGGCCATGCTTGATTCGGTATGCATG
GTCTACGGCGAGGTTAACGTCAGCGCCGTCGCGTTTGTCCCCGGCGAAAAC
GCGGCGGATATCGCCATTAACCTGGAAAAGTTAGTAAGCGCCCACACCGAT
GAGGAGTGGGTAATCGCGGTAGATTTGCAGTGCGGAAGCCCATGGAATGCC
GCAGCCGGGCTGGCAATGCGTCACCCGCAGATCCGGGTGATTAGCGGCCTG
TCGCTGCCGCTGGCGCTCGAGCTGGTGGATAACCAGCATACCCTGAGCGCC
GATGACTTATGCCAGCATCTGCAGGCCATCGCCAGTCAGTGCTGTGTGGTCT
GGCAGCAGCCAGAAACCGTTGAGGAGGAGTTCTGATGCAAATTACCCTCGC
CCGTATTGATGACCGACTGATTCATGGCCAGGTCACCACCGTGTGGTCAAA
AGTCGCCAACGCCCAGCGGATAATTATCTGCAATGACGATGTATTTAACGA
TGAGGTTCGCCGGACCCTGTTGCGCCAGGCGGCTCCGCCAGGCATGAAGGT
AAACGTTGTCAGTCTGGAAAAAGCGGTTGCGGTCTATCATAACCCGCAATA
TCAGGACGAGACCGTCTTTTATTTATTTACCAATCCACACGATGTTTTAAC
GATGGTGCGCCAAGGCGTGCAGATCGCCACGTTAAATATTGGTGGCATGGC
CTGGCGACCCGGTAAAAAACAGCTAACCAAAGCCGTTTCTTTGGATCCGCA
GGATATTCAGGCATTCCGTGAACTCGATAAACTGGGCGTAAAACTCGATTT
ACGCGTGGTCGCATCAGATCCGTCAGTCAATATTCTCGACAAAATTAACGA
AACAGCTTTCTGCGAATAAAAAATAGCGCCTGTTGTGATATGCCGTAGCAG
GCAGGATGACTCACACCTTAAAGGTGCATAAATTATGGAAATTAGTACCCT
ACAGATAATAGCCATATTTATTTTTTCCTGTATTGCCGGAATGGGCAGCGTG
CTGGATGAATTTCAGACCCATCGGCCCCTTATCGCCTGTACCGTGATTGGCC
TCATCCTGGGCGATTTAAAAACCGGGGTTATGCTCGGCGGTACGCTGGAGC
TGATCGCCCTCGGCTGGATGAACGTGGGGGCAGCGCAGTCGCCAGATTCGG
CGCTGGCCAGCATTATCTCCGCCATTCTGGTGATTGTGGGCCACCAGAGCAT
TGCCATTGGTATCGCCATTGCTCTGCCGGTGGCCGCCGCCGGGCAGGTGCTG
ACCGTTTTCGCCCGTACCATTACCGTGGTGTTCCAGCACGCCGCGGACAAAG
CGGCCGAGGAGGCGCGCTTTCGCACCATCGACCTGCTGCATGTCTCCGCGCT
GGGGGTGCAGGGCCTGCGCGTGGCAATCCCGGCGCTGGTGGTCTCGCTGTT
CGTTAGCGCGGATATGGTCAGCAGTATGCTCAGCGCGATCCCGGAATTCGT
CACCCGCGGCCTGCAGATTGCCGGTGGTTTCATTGTCGTCGTGGGCTACGCG
ATGGTGCTGCGAATGATGGGCGTGAAATACCTGATGCCCTTCTTTTTCCTCG
GTTTTCTCGCCGGGGGTTATCTCGACTTCAGCCTGCTGGCCTTCGGCGGCGT
GGGCGTCATCATCGCGCTGATCTACATCCAGCTCAATCCACAGTGGCGTAA
GGCTGAACCCGCCGCCTCCACTGCCCCCTCTGCCCCCGCCCTTGACCAGCTT
GACGACTAATGGAGCCGAAAATGGAACAGAAAAAAATCACGCAAGGCGAC
CTGGTGAGCATGTTTCTCCGCTCCAACCTCCAGCAGGCCTCCTTTAACTTCG
AACGTATTCATGGCCTGGGGTTTTGCTACGACATGATCCCGGCGATCAAACG
CCTGTATCCGCTCAAAGCCGATCAGGTCGCGGCGCTGAAGCGTCATCTGGT
GTTCTTTAATACCACGCCGGCGGTGTGTGGCCCGGTGATCGCCGTCACCGCC
GCCATGGAGGAGGCCCGGGCTAACGGCGCGGCCATTGACGATGGCGCTATC
AACGGCATCAAAGTGGGTCTGATGGGCCCGCTGGCCGGCGTCGGCGACCCG
CTGGTCTGGGGCACGCTGCGGCCGATTACTGCGGCCCTTGGCGCCTCGCTGG
CCCTCTCCGGCAACATTCTGGGACCTTTGTTATTTTTCTTTATTTTCAATGCA
GTGCGGCTGGCAATGAAGTGGTACGGCCTGCAACTGGGCTTCCGTAAAGGG
GTCAATATCGTCAGCGATATGGGCGGTAACCTGCTGCAGAAGCTGACCGAA
GGCGCCTCCATTCTCGGCCTGTTTGTCATGGGGGTGCTGGTGACCAAATGGA
CCACCATCAATGTGCCGCTGGTGGTTTCCCAAACGCCCGGTGCAGACGGCG
CCACCGTCACGATGACCGTCCAGAACATCCTCGATCAGCTCTGCCCCGGCCT
GCTGGCGCTGGGCCTGACGCTACTGATGGTGCGTCTGCTGAACAAGAAAGT
GAATCCGGTCTGGCTGATTTTCGCCCTTTTCGGCTTAGGCATTATCGGCAAC
GCCCTCGGCTTTCTGTCCTGATTATTCCGCCCGGCGATGCCGGGGTGACGTC
AACAACATAGCCCCGAAAGATGAAGGGGATGAGGTGGTTTATGCAAACAA
CAACGGCCCTGCGCCTGTATGGCAAACGAGACCTGCGCCTGGAAACCTTTA
CCCTCCCGGCGATGCAGGACGATGAGATCCTCGCCCGGGTGGTCACGGACA
GCCTGTGTCTTTCCTCATGGAAAGAGGCCAATCAGGGTGCCGATCATAAAA
AGGTGCCAGATGATGTGGCCACCAGGCCCATTATTATCGGTCATGAATTCTG
CGGCGAAATCCTTGCCGTCGGTAAAAAGTGGCAGCATAAGTTTCAGCCAGG
GCAGCGCTACGTGATCCAGGCCAACCTGCAGTTGCCCGACCGACCCGACTG
CCCCGGCTACTCATTCCCATGGATCGGTGGCGAAGCCACCCATGTGGTGATC
CCCAATGAGGTGATGGCGCAGGATTGCCTGCTCACCTGGGAGGGGGATACC
TGGTTTGAAGGATCGCTGGTGGAGCCGCTCTCCTGCGTCATTGGCGCTTTCA
ACGCCAATTATCATCTGCAGGAGGGGAGTTACAACCACGTGATGGGGATCC
GTCCGCAGGGACACACTCTGATCCTGGGCGGGACGGGGCCGATGGGGCTGC
TGGCTATCGACTATGCGCTGCACGGCCCCATCAATCCTTCACTACTGGTAGT
GACCGATACCAATAAGCCGAAGCTCAGCTACGCCCGCCGTCATTACCCCTCT
GAGCCGCAGACGCTGATCCACTACCTCGACGGCCATGAGGCCAGTCGCGAT
ACGCTGCTGGCGCTCAGCGGCGGCCATGGCTTCGACGATATCTTCGTGTTTG
TGCCAAACGAACAGCTGATCACCCTGGCCTCCTCGTTGCTGGCTCCGGACGG
CTGCCTGAATTTCTTTGCTGGCCCCCAGGATAAGCAATTCAGCGCCCCGATC
AACTTCTACGACGTGCACTACGCCTTCACCCACTACGTCGGCACCTCCGGCG
GTAATACTGACGATATGCGCGCCGCGGTGGCGCTGATGCAGGCGAAAAAGG
TCCAGACGGCGAAAGTGGTCACCCATATTCTCGGCCTGAACGCCGCGGGCG
AAACCACCCTCGATCTGCCTGCCGTCGGCGGGGGAAAAAAGCTGGTGTATA
CCGGAAAAGCCTTCCCGCTCACGCCGCTGGGCGAGATCGCCGATCCCGAAC
TGGCAGCGATTGTGGCGCGTCACCATGGGATCTGGTCCCAGGAGGCTGAAG
CGTATCTGCTGGCCCACGCGGAGGATATTACGCATGATTAACCGCGATACG
CTGCTGTGCATCTCCCTGGCGGGTCGCCCCGGCAACTTCGGCACCCGCTTTC
ATAACTATCTGTATGACAAGCTGGGATTGAACTACCTCTACAAAGCCTTTAC
GACTAAGGATATTGAGGCGGCGGTAAAAGGGGTTCGCGCATTGGGTATCCG
CGGCTGTGCGGTCTCCATGCCGTTTAAA

b. X79837 – Group 4

Escherichia coli – Gram-negative bacterium

>gi|599737|emb|X79837.1| E.coli (EC3132) gat genes Y, Z, A, B, C, D and R'

CTGCAGGCGGTGAAACTGACCTCAGCGATGCAGTGCGTACTGCGGTTATGA
ACAAACGCGCTGGCGGAATGGGGCTGATTCTTGGACGTAAGGCGTTCAAGA
AATCGATGGCTGACGGCGTGAAACTGATTAACGCCGTGCAGGACGTTTATC
TCGATAGCAAAATTACTATCGCCTGATACGCCTTATACGAACAGCACTACCC
CGTATGTCGGAAAAGGCCGTTTACGTCGCATCCGGCATAAAAACACGCGCA
CTTTGCTACGGCTTCCCTATCGGGAGGCCGTTTTTTGCCTGTCATTCTGATTA
ATTTTCGAATTGTCGTTTTTGTGATCGTTATCTCGACATTTAAAACAAATAAT
TTCATTATATTTTGAAATCGAAAATAAACGACAGGATATGAAAATGTACGT
GGTATCGACAAAGCAGATGCTGAACAACGCACAGCGCGGCGGTTATGCGGT
TCCGGCATTCAATATTCACAATCTCGAAACGATGCAAGTGGTGGTAGAAAC
CGCTGCCAACCTGCATGCGCCGGTCATCATCGCCGGAACGCCTGGCACATTT
ACTCATGCTGGTACAGAAAATCTGTTGGCGCTGGTCAGCGCGATGGCGAAG
CAATATCACCATCCACTGGCAATTCATCTCGACCATCACACGAAATTTGACG
ATATCGCTCAGAACCTTCGTTCTGGCGTGCGCTCAGTCATGATTGACGCCTC
GCATTTGCCTTTTGCGCAAAATATTTCACGGGTCAAAGAGGTGGTGGATTTT
TGCCATCGCTTTGATGTCAGCGTCGAAGCGGAGCTGGGGCAACTTGGCGGC
CAGGAAGATGATGTGCAAGTCAATGAAGCCGATGCTTTTTATACCAACCCC
GCTCAGGCGCGTGAATTTGCCGAGGCAACCGGAATTGATTCCCTGGCGGTC
GCCATCGGCACGGCTCATGGTATGTATGCCAGCGCACCGGTGCTTGATTTTT
CTCGACTGGAGAACATTCGCCAGTGGGTGAACTTACCGCTGGTGCTGCATG
GCGCGTCAGGGTTATCGACTAAGGATATTCAGCAAACCATCAAACTGGGGA
TATGCAAAATCAACGTTGCAACGGAGCTGAAAAATGCCTTCTCGCAGGCGT
TAAAAAATTACCTGACCGCGCACCCCGAAGCGACCGATCCCCGGGATTATT
TGCAGTCGGCTAAATCCGCAATGCGCGATGTGGTGAGCAAAGTGATTGCCG
ATTGTGGCTGCGAGGGCAGGGCATAACGCACCTGCCATTTAACAAGGAAAA
AACATGAAAACGTTAATTGCCCGGCATAAAGCTGGTGAACATATCGGCATA
TGCTCAGTCTGTTCTGCCCATCCGTTGGTTATCGAAGCGGCGCTGGCATTTG
ATCGCAACAGCACGCGCAAAGTGCTGATTGAAGCAACGTCAAACCAGGTCA
ATCAATTTGGCGGTTATACCGGAATGACACCGGCAGACTTTCGCGAATTTGT
TTTTACGATTGCCGATAAAGTCGGATTTGCTCGTGAGCGTATTATTCTCGGC
GGCGACCATCTGGGGCCAAACTGCTGGCAGCAAGAAAATGCGGATGCGGC
GATGGAAAAATCCGTCGAGCTGGTAAAGGCATATGTTCGTGCCGGCTTCAG
TAAAATTCATCTTGATGCGTCAATGTCCTGCGCGGGGGATCCCATACCGTTA
GCACCAGAAACGGTTGCGGAACGAGCTGCTGTGCTTTGCTTTGCTGCGGAA
AGTGTGGCGACAGATTGCCAGCGTGAGCAACTGAGCTATGTCATTGGCACC
GAAGTTCCGGTTCCGGGCGGTGAGGCCAGCGCCATTCAGTCAGTACACATT
ACCCGGGTTGAAGATGCCGCCAATACTTTACGTACGCATCAAAAGGCCTTT
ATTGCCCGTGGGCTGGCAGAGGCGTTAACACGCGTGATTGCCATTGTGGTG
CAGCCAGGCGTGGAATTTGATCACAGCAATATTATCCATTATCAGCCGCAG
GAAGCACAGCCGCTGGCGCAATGGATAGAAAACACCCGAATGGTTTATGAA
GCACATTCTACCGATTACCAGACCCGGACGGCTTATTGGGAATTAGTCCGCG
ATCACTTTGCAATATTGAAAGTCGGTCCCGCATTAACCTTTGCTTTACGTGA
GGCGATATTTGCGCTGGCGCAAATTGAGCAGGAACTTATCGCCCCAGAAAA
TCGCAGCGGTTGCCTGGCGGTAATTGAAGAAGTGATGTTTGACGAACCGCA
ATACTGGAAAAAATATTATCGTACGGGTTTTAACGATTCATTACTGGATAT
TCGTTACAGCCTGTCGGATCGTATTCGTTATTACTGGCCGCATAGTCGGATT
AAAAATAGCGTCGAAACGATGATGGTGAATCTGGAAGGCATGGAAATCCTC
TGGCATGATTAGTCAGTATCTTCCTAAACAATTTGAACGCATTCAGTCCGGG
AATTATCAGCAATACCGCATCAGTTGATTATGGATAAAATTTATGATGTTTT
GCGCGCCTATCGCTACGGCTGTGCGGAATAAGGACGGTATATGACTAACCT
GTTTGTTCGTAGCGGAATTTCTTTTGTCGATCGTAGCGAAGTTTTAACCCAT
ATCGGTAATGAGATGCTCGCGAAAGGTGTGGTTCATGATACATGGCCACAG
GCATTAATTGCCAGAGAAGCAGAATTCCCTACCGGGATAATGCTTGAGCAG
CACGCCATTGCAATACCGCATTGTGAGGCGATTCATGCTAAGTCGTCAGCCA
TTTATCTGTTAAGGCCAACAAATAAAGTTCATTTTCAGCAAGCGGATGATGA
TAACGACGTGGCGGTATCGTTGGTTATTGCGTTAATTGTGGAAAATCCGCAG
CAGCAATTGAAACTTTTACGCTGTTTATTTGGCAAGTTACAACAGCCCGAGA
TCGTCGAGACACTAATCACTCTTCCTGAAACCCAGTTAAAGGAATACTTCAC
AAAGTATGTTTTAGATTCAGACGAATAAATCCCTCTGTAACAATAAAAAGG
ACTATTTATGAAACGCAAGATTATTGTCGCTTGCGGAGGCGCGGTTGCGACC
TCTACGATGGCGGCGGAAGAAATTAAAGAGTTGTGTCAGAGTCATAATATT
CCTGTTGAATTAATCCAGTGTCGGGTTAATGAAATAGAAACCTATATGGATG
GCGTGCATTTGATATGCACCACTGCCAGGGTGGATCGAAGTTTTGGCGATAT
TCCGTTAGTTCACGGCATGCCTTTTGTTTCTGGTGTCGGTATCGAAGCATTAC
AAAATAAAATTCTGACTATCTTACAGGGGTGACCTATGTTTTCAGAAGTCAT
GCGTTATATTCTCGACCTCGGCCCTACGGTGATGCTGCCAATTGTCATCATT
ATTTTTTCTAAAATATTAGGCATGAAGGCAGGCGATTGCTTTAAAGCGGGTC
TGCATATCGGGATTGGCTTTGTTGGCATTGGCCTTGTGATTGGCTTAATGC
TGGATTCCATTGGTCCGGCGGCGAAAGCGATGGCGGAAAATTTCGACCTGA
ATCTGCATGTGGTCGATGTCGGCTGGCCGGGCTCTTCACCAATGACCTGGGC
GTCGCAAATTGCGCTGGTGGCGATTCCGATTGCGATTCTGGTTAACGTGGCG
ATGCTACTGACCCGTATGACGCGGGTGGTAAATGTTGATATCTGGAATATCT
GGCATATGACCTTCACCGGCGCATTGCTGCATCTGGCAACCGGTTCATGGAT
GATAGGGATGGCGGGTGTGGTAATTCACGCGGCGTTTGTTTATAAGCTCGG
CGACTGGTTTGCCCGCGATACCCGAAATTTCTTTGAGCTGGAAGGCATTGCT
ATTCCGCACGGTACGTCGGCGTATCTGGGGCCAAATTGCGGTGCTGGTGGA
TGCTATCATCGAGAAAATCCCAGCGTTAACCGTATTAAATTTAGCGCCGACG
ATATTCAGCGCAAATTTGGTCCGTTTGGCGAGCCTGTCACCGTGGGTTTTGT
GATGGGGCTGATTATCGGCATCCTCGCGGGTTACGATGTCAAAGGTGTATTG
CAGCTGGCGGTAAAAACGGCGGCAGTTATGCTGTTGATGCCACGGGTGATT
AAACCCATCATGGATGGTTTAACGCCTATCGCTAAGCAGGCGCGTAGTCGTT
TACAGGCGAAGTTCGGCGGCCAGGAGTTCCTGATTGGCCTGGATCCCGCAT
TACTGCTGGGGCATACGGCGGTGGTATCGGCAAGCCTGATTTTTATCCCGCT
CACCATTTTAATTGCTGTTTGTGTGCCGGGTAATCAGGTGCTGCCGTTTGGC
GATCTTGCAACTATCGGCTTCTTTGTGGCGATGGCAGTTGCCGTGCATCGTG
GAAATCTGTTCCGCACCTTAATCTCGGGTGTCATCATTATGAGCATCACCCT
GTGGATCGCGACGCAAACTATTGGTTTGCACACCCAACTGGCGGCTAAT
GCTGGGGCGTTAAAAGCCGGGGGTATGGTGGCTTCAATGGATACAGGGCGG
TTCTCCCATTACCTGGTTACTGATTCAGGTTTCTCCCCGCAAATATTCCCGGT
TTCATTATTATCGGCGCAATTTATCTGACCGGTATTTTCATGACCTGGCGTA
GAGCGCGTGGTTTTATTAAACAAGAGAAAGCCGTTCTCGCAGAATAATTTTT
ACCTGAGGGGGAGGTAATCCCCCTTAACAATCAGGAGTTTTTATGAAATCA
GTGGTGAATGATACTGATGGTATCGTGCGCGTTGCAGAAAGCGTCATTCCTG
AAATTAAACATCAGGATGAGGTGCGGGTAAAAATTGCCAGCTCGGGATTAT
GTGGTTCCGATTTACCCAGAATTTTTAAAAATGGTGCACATTATTATCCAAT
AACGTTAGGCCATGAATTTAGCGGCTATATTGATGCTGTGGGATCCGGTGTT
GATGATTTACATCCTGGCGATGCGGTTGCCTGTGTGCCGTTATTACCCTGTTT
TACTTGTCCAGAGTGTTTGAAAGGGTTTTATTCCCAGTGCGCAAAATATGAT
TTTATTGGCTCGCGGCGTGATGGTGGATTTGCTGAATATATTGTCGTTAAGC
GAAAAAATGTCTTTGCTCTACCCACGGATATGCCTATTGAGGATGGGGCTTT
TATTGAACCGATTACCGTTGGCCTGCATGCTTTTCATTTAGCGCAAGGTTGT
GAGAATAAAAACGTTATTATTATTGGTGCAGGAACCATTGGCCTGCTGGCT
ATTCAGTGCGCTGTCGCGCTGGGAGCAAAGAGTGTGACGGCTATCGACATT
AGCTCAGAAAAACTGGCACTGGCAAAATCTTTCGGTGCGATGCAAACATTT
AACAGCCTTGAAATGAGCGCGCCGCAAATGCAGGGCGTTTTACGCGACGTG
CGCTTTAATCAGCTTATCCTCGAGACGGCTGGTGTCCCGCAAACCGTCGAAC
TGGCGGTAGAGATTGCCGGGCCTCATGCCCAGCTGGCGCTGGTGGGCACGT
TGCATCAGGATCTGCATTTAACATCGACAACGTTTGGCAAAATATTACGTAA
AGAGCTGACGGTTATCGGCAGCTGGATGAACTACTCCAGCCCTTGGCCGGG
GCAGGAGTGGGAAACGGCGAGCCGGTTGCTGACAGAACGTAAGTTAAGCCT
GGAGCCATTAATCGCTCACCGTGGAAGCTTTGAAAGCTTCACCCAGGTGGT
GCGTGACATCGCTCGTAATGCTATGCCGGGCAAAGTGTTGCTCATTCCCTGA
AACCGCGGGCCAGCGTGATGCTGGCCCGGTATTGTGCAAAACAGATCATTC
ACCAATGGTCCCCCTTCGTTTACACTAGCCACAATTGAATATGGGTAAATGA
CGATGAATTCATTCGAGCGAAGGAATAAGATCATCCAATTAGTGAATGAAC
AGGGAACCGTGCTTGTTCAGGATCTGGCGGGAGTATTTGCTGCCTCGGAAG
CGACAATCCGTGCCGATTTGCGCTTTCTCGAACAAAAAGGCGTGGTTACGC
GCTTTCATGGCGGTGCGGCGAAAATAATGTCCGGTAATAGTGAAACC
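
In item 5 the source organism can be read directly from the FASTA header of each record. A minimal sketch of splitting the pipe-delimited header style used in this report follows; the function name and the returned field names are assumptions for illustration.

```python
def parse_fasta_header(header):
    """Split a '>gi|number|db|accession|description' header into fields.

    Headers that do not follow this layout are returned with only the
    description filled in. (Hypothetical helper for illustration.)
    """
    header = header.lstrip(">").strip()
    fields = header.split("|", 4)
    if len(fields) == 5 and fields[0] == "gi":
        return {
            "gi": fields[1],
            "db": fields[2],
            "accession": fields[3],
            "description": fields[4].strip(),
        }
    return {"gi": None, "db": None, "accession": None,
            "description": header}


record = parse_fasta_header(
    ">gi|43937|emb|X66059.1| K.pneumoniae genes sorC, sorD, sorF, "
    "sorB, sorA, sorM and sorE"
)
print(record["accession"])
print(record["description"])
```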

6. Once you have the sequence of the chosen operon, identify its ORFs using the program: http://www.ncbi.nlm.nih.gov/projects/gorf/
Choose one of the ORFs identified by the program and check whether it corresponds to a protein. If so, which protein is the most likely?

In sequence X79837 we selected one ORF. A BLAST search identified this ORF as encoding a protein named Sugar-DH, from the MDR superfamily, with multi-domain PKR10309.
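
The ORF search in item 6 can be illustrated with a minimal scanner. It is a simplified sketch, not the algorithm of the NCBI tool: it reads only the three forward frames, treats ATG as the only start codon, and uses a hypothetical `min_codons` cutoff. The reverse strand could be covered by running it on the reverse complement of the sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Find ATG-to-stop ORFs in the three forward reading frames.

    Returns a list of (start, end, frame) tuples with 0-based start and
    exclusive end; the end includes the stop codon. min_codons counts
    the codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i          # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None       # look for the next ATG in this frame
    return orfs


# Small synthetic example: ATG, five GCT codons, then a TAA stop.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
print(find_orfs(toy, min_codons=3))
```

Each reported ORF can then be translated or submitted to BLAST, as was done for the ORF chosen from X79837.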
