Bio101 2 DnaSequencing

DNA Sequencing
INTRODUCTION

Automated DNA sequencing is a core research tool used by almost every biochemistry research lab. It is used to determine the sequence of DNA, the genetic code that serves as the blueprint of life for every organism on Earth. Nucleic acid sequencing is a relatively late arrival among the methods for sequencing biological macromolecules: until the late 1980s, protein sequencing was the primary tool for obtaining the coding information found in the molecules of life. Protein sequencing is slow and expensive, and it could easily take a year or more to sequence a protein of 500 amino acids. Today the sequence of a protein can be determined from DNA analyses in just a few days. Because the procedure is straightforward and repetitive, the sequencing itself is typically performed in centralized facilities where automated machines carry out the reactions and data analysis.

GENOMIC DNA

Since it was discovered that DNA is the material in the cell that carries our genetic information, understanding DNA has become a primary focus of genetic research. Our chromosomes, which together make up our genome, consist of neatly wound strands of DNA. All living organisms, from bacteria to human beings, contain DNA in each of their cells. Each cell contains the entire genetic code for that organism. DNA consists of just four building blocks, or nucleotides. These four building blocks, known by their abbreviations A, T, G and C, are used as the alphabet to write our genetic code. All the instructions needed to build our bodies are encoded using just these four letters. Genomes come in a variety of sizes. Viruses, which cannot reproduce without a host cell, have the smallest genomes, while higher-order organisms such as plants and animals have genomes that are billions of bases long. Like genomes, individual genes can vary greatly in size, from several hundred bases to millions of bases.
The average human gene is about 3,000 bases long, although only about 1,000 to 2,000 bases actually encode protein. These protein-encoding stretches of DNA are called exons. Introns, intervening stretches of DNA whose function is not fully understood, make up the rest of the gene. The largest human gene, which encodes dystrophin, a muscle protein implicated in muscular dystrophy, is 2.4 million bases in length. Viral and bacterial genes, which generally do not contain introns, are typically the shortest.

The dideoxy DNA sequencing procedure was invented by Frederick Sanger and his colleagues in 1977. With a few improvements, this method is still used today. This elegant procedure, which can be fully automated, allows large sequencing centers to read over 1,000 bases of DNA sequence per second, a feat which now allows scientists to sequence even large genomes within the span of years, rather than decades. First, DNA has to be extracted from the cells of the organism being studied. The sequencing reaction is then performed on the DNA, and the resulting DNA strands are sorted by size using capillary electrophoresis. Finally, the DNA code is read by a computer, which displays the data for scientists to use.

DNA PREPARATION

Before it can be sequenced, DNA needs to be purified from cells. First, the cells and their nuclei are broken open. This can be accomplished by mechanical methods, such as grinding, or by chemical methods that break apart cell membranes. The DNA floating around in this soup is still coated with protective proteins. The DNA can be selectively recovered from the soup by precipitation, and the DNA-binding proteins can then be cleaned away. Very large pieces of DNA, such as whole chromosomes or genomes, are cut into smaller pieces and stored in vectors (plasmids), which are larger pieces of DNA with the ability to be reproduced when placed in host cells such as bacteria. Bacteria containing a vector are placed in culture medium, where they multiply a million-fold or more. Each time a bacterium divides, the DNA vector inside it is also copied. In this way, the target DNA can be multiplied exponentially. Each of the copied DNA pieces is called a clone.

SEQUENCING REACTION

The sequencing reaction itself consists of four steps (strand separation, primer annealing, primer extension, and chain termination), which are covered in detail in this section.
First, the double-stranded DNA is separated into single strands, and a small starter piece of DNA called a primer binds to one of the strands, called the template strand. In the extension step, a new DNA strand is made that is complementary to the template strand. Starting at the primer, DNA polymerase uses the template strand as a guide to recreate the second DNA strand. The termination step is the key to the sequencing reaction. Strand extension is halted by the incorporation of a dye-labeled terminator nucleotide, which identifies the base at the position where strand extension stopped. When many strand termination reactions are performed together, each of the bases in a DNA strand can be identified.

STRAND SEPARATION

Double-stranded DNA needs to be denatured, or separated into single strands, before it can be sequenced. This is accomplished by heating the DNA, which disrupts the hydrogen bonds and van der Waals forces that hold the two chains of DNA together in a double helix.

PRIMER ANNEALING

Next, a small single-stranded piece of DNA about 20 bases long, called an oligonucleotide, is annealed to the denatured template strand. This oligonucleotide is needed to prime the next step, DNA extension. In theory, the two DNA strands that were separated in the preceding step could just snap back together. This is avoided by cooling the mixture rapidly, which gives the short oligonucleotides an advantage over the long DNA strands in annealing. In addition, a large excess of primer is used to ensure that the primers outcompete the complementary DNA strand for annealing to the template. The oligonucleotide primer must be complementary in sequence to the template strand in order to bind by base-pair interactions.

PRIMER EXTENSION

During the extension phase, a bacterial DNA polymerase enzyme begins assembling a new DNA chain from the individual nucleotide building blocks, or dNTPs, provided in the reaction mixture. The nucleotides are added in the order specified by the complementary bases in the template strand. DNA polymerase cannot start copying a template strand without a small piece of DNA to start the extension process. This is why the primer was added in the previous step.
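The annealing and extension steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: sequences are plain strings written 5' to 3', the primer must match its binding site perfectly, and the function names (reverse_complement, anneal, extend) are hypothetical helpers invented for this example, not part of any sequencing software.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(template, primer):
    """Return the index in the template (5' to 3') where the primer's
    binding site starts, or -1 if the primer is not complementary."""
    binding_site = reverse_complement(primer)
    return template.find(binding_site)


def extend(template, primer):
    """Extend the annealed primer to the end of the template.

    The polymerase adds bases to the 3' end of the primer, so it copies
    the part of the template that lies 5' of the binding site.
    """
    start = anneal(template, primer)
    if start == -1:
        raise ValueError("primer is not complementary to the template")
    region_to_copy = template[:start]
    return primer + reverse_complement(region_to_copy)


template = "ATGGCCATTGTAATGGGCCGC"
primer = reverse_complement(template[-6:])  # primer for the last 6 bases

print("primer:    ", primer)
print("new strand:", extend(template, primer))
```

With no terminator present, the new strand is simply the full reverse complement of the template. A real primer is about 20 bases long; six bases are used here only to keep the example readable.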
CHAIN TERMINATION

The reaction mixture also contains small amounts of each of the four dideoxynucleotides, or ddNTPs, which lack the 3'-hydroxyl group necessary for chain extension. Whenever a dideoxynucleotide is incorporated into a growing DNA chain, it terminates chain growth because another nucleotide cannot be attached to it. Each of the four ddNTPs is labeled with a different dye, which can later be detected using a laser. The DNA polymerase occasionally incorporates a labeled dideoxynucleotide into a growing DNA strand. This doesn't happen very often, because the concentration of dideoxynucleotides is much lower than the concentration of dNTPs in the reaction mixture. The ratio of dNTPs to ddNTPs is carefully balanced to get just the right number of chain termination events.

PUTTING IT ALL TOGETHER

An actual sequencing reaction mixture contains thousands of DNA template strands, which are all being sequenced simultaneously. Simply by chance, some annealed primers will be extended by only a few nucleotides before chain extension is terminated by the addition of a ddNTP. Other primers will form a long chain of DNA before a ddNTP is incorporated. Thus, there will be a population of DNA strands in the reaction, some very short, some very long, and every possible length in between. To further increase the yield of sequenced strands, the sequencing reaction is performed in a thermal cycler, which cycles through the heating and cooling steps dozens of times, in effect repeating the sequencing reaction many times in one experiment.

CAPILLARY ELECTROPHORESIS

The newly synthesized DNA strands, each labeled with one of four dyes, are now sorted by length using capillary electrophoresis. First, the reaction mixture is heated to keep the newly synthesized single strands from annealing with the template DNA strand. The dye-labeled single strands are loaded into a tiny capillary tube containing a viscous, gel-like material. An electrical current pulls the negatively charged DNA strands through the capillary.
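Why the strands loaded into the capillary come in every length, and why the dNTP to ddNTP ratio matters, can be illustrated with a small simulation. This is a rough sketch under a strong assumption: at every position the polymerase incorporates a terminator with the same fixed probability p_ddntp, independent of the base. Real polymerases discriminate between dNTPs and ddNTPs, so p_ddntp should not be read as a literal concentration ratio. The function names are hypothetical.

```python
import random
from collections import Counter


def stop_probability(k, p_ddntp):
    """Chance that a strand is terminated exactly at position k:
    k - 1 normal nucleotides followed by one dideoxynucleotide."""
    return (1 - p_ddntp) ** (k - 1) * p_ddntp


def simulate_terminations(read_len, p_ddntp, n_strands, seed=0):
    """Count how many of n_strands stop at each position.

    Strands that reach read_len without a terminator carry no dye
    and are not counted.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_strands):
        for position in range(1, read_len + 1):
            if rng.random() < p_ddntp:
                counts[position] += 1
                break
    return counts


counts = simulate_terminations(read_len=500, p_ddntp=0.01, n_strands=10_000)
for position in (1, 100, 500):
    expected = stop_probability(position, 0.01) * 10_000
    print(position, counts[position], round(expected, 1))
```

Under this model, raising p_ddntp concentrates the labeled fragments at short lengths, while lowering it leaves many strands without any terminator, which is one way to see why the ratio has to be balanced. The fragments counted here are the dye-labeled strands that travel through the capillary.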
The capillary is not much thicker than a human hair and is 1 to 3 feet long, sufficient to separate strands that differ in length by only one base. Because of the small dimensions involved, preparation of the capillary and loading of the sample are computer controlled. Shorter DNA strands migrate through the gel material more quickly and emerge from the bottom of the capillary first, while longer strands become tangled in the gel material and take longer to emerge. As the strands emerge from the bottom of the capillary they pass through a laser beam that excites the fluorescent dye attached to the dideoxynucleotide at the end of each strand. This causes the dye to fluoresce, or glow, at a specific wavelength, or color. This color is then detected by a photocell, which feeds the information to the computer.

COMPUTER ANALYSIS

The computer displays the information received from the photocell as an electropherogram, which is a tracing of the signal received by the photodetector at each of the four wavelengths. Although the real colors seen by the photodetector are close to green, yellow, orange and red, the computer assigns false colors to each of the four tracings to make it easier to tell them apart. It also prints the letter of the appropriate base below each of the signal peaks. Because successive peaks correspond to DNA segments differing in length by one nucleotide, the sequence of peaks reveals the sequence of bases in the original DNA sample.

PROCEDURE SUMMARY

Today, dideoxy sequencing is the method of choice to sequence very long strands of DNA. DNA is purified from the cells of the organism of interest and placed into cloning vectors, which allow the DNA to be multiplied by bacterial hosts. Each clone is then individually sequenced. This method can be done manually or can be fully automated, depending on how much DNA needs to be sequenced. The DNA is denatured and a small DNA oligonucleotide is annealed at one end of the sequence of interest on the template strand. The DNA polymerase extends the oligonucleotide, using the template strand to guide incorporation of nucleotides. Once in a while, a dideoxynucleotide will be incorporated into the growing DNA strand. Because it is missing the 3'-hydroxyl group, the dideoxynucleotide will prevent the DNA chain from being extended further. In addition, each dideoxynucleotide has a different color label. Consequently, each terminated DNA chain is colored according to the nucleotide at its end.
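The idea that ordering the colored chains by length spells out the sequence can be shown with a toy base caller. This is a sketch only: the dye-to-base assignment below is a made-up example (the text does not say which dye labels which base), fragment lengths are counted from the end of the primer, and real base-calling software works on noisy fluorescence traces, not clean (length, dye) pairs.

```python
import random

# Hypothetical dye assignment, chosen only for illustration.
BASE_TO_DYE = {"A": "green", "C": "yellow", "G": "orange", "T": "red"}
DYE_TO_BASE = {dye: base for base, dye in BASE_TO_DYE.items()}


def label_fragments(new_strand):
    """Every possible terminated chain, as (length, dye of its last base)."""
    return [
        (length, BASE_TO_DYE[new_strand[length - 1]])
        for length in range(1, len(new_strand) + 1)
    ]


def call_bases(fragments):
    """Read the sequence from fragments arriving in any order.

    Fragments of equal length end in the same base and form one peak,
    so they are collapsed before the peaks are read shortest first.
    """
    peaks = {}
    for length, dye in fragments:
        peaks[length] = dye
    return "".join(DYE_TO_BASE[peaks[n]] for n in sorted(peaks))


fragments = label_fragments("GATTACA") * 3  # several copies of each chain
random.shuffle(fragments)                   # the reaction tube is unordered
print(call_bases(fragments))
```

Note that the sequence read this way is that of the newly synthesized strand, which is complementary to the template.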
When the chains are separated by length using capillary electrophoresis, individual chains of increasing length can be identified by their color. A laser at the bottom of the capillary excites the fluorescent labels as they come out of the capillary. The fluorescence color then tells the computer which base is represented, and the computer records each base, one by one, on a graph called an electropherogram.

CONCLUSION

Overlapping sequence data from many clones are analyzed by powerful computers, which regenerate the full-length sequence by piecing the short sequences together like a puzzle. Entire genomes can be sequenced in this manner. Genome sequencing projects, representing many different organisms, hold the promise of unprecedented advances in industry and medicine. Microbial genomes may encode enzymes that could help make industrial processes more efficient. Human genome sequences are helping us to better understand human metabolism and disease and may make it easier to treat genetic diseases or design better drugs in the future.
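Piecing overlapping reads together "like a puzzle" can be sketched with a greedy merge: repeatedly join the two reads that share the longest suffix-to-prefix overlap. This is a deliberately simple stand-in, not the method used by any particular genome project. It assumes error-free reads that all come from the same strand, and it can be fooled by repeated sequence. The function names and the min_len parameter are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b,
    or 0 if no such overlap of at least min_len bases exists."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads pairwise by best overlap until nothing more overlaps."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates
    # Drop reads that are wholly contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining reads do not overlap; leave them separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


reads = ["ATGGCCATT", "CCATTGTAATG", "TAATGGGCCGC"]
print(greedy_assemble(reads))
```

Reads that never overlap are returned as separate pieces, which mirrors the gaps left in a real assembly when coverage is incomplete.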
