
National training on Allele Mining

12-25 September 2011



LABORATORY MANUAL







INDIAN INSTITUTE OF SPICES RESEARCH
(INDIAN COUNCIL OF AGRICULTURAL RESEARCH)
CALICUT 673 012, KERALA

Published by

Dr. M. Anandaraj
Director







Organized by

Dr. Johnson K. George (Course Director)
Dr. Santosh J. Eapen
Dr. Prasath (Course Coordinator)








Compiled & Edited by

Dr. Johnson K. George
P. R. Rahul
A. Chandrasekar
















The manual is an in-house publication intended for training purposes only and is not for public
circulation.



Copyright 2011 IISR. All rights reserved. Reproduction and redistribution prohibited without approval.

CONTENTS
Sl.No Title Page No
1 RNA/DNA isolation 1
2 Reverse Transcription-PCR (RT-PCR) 4
3 Gel Elution Techniques 7
4 Cloning of PCR Amplified DNA (T/A cloning) & Bacterial Transformation 11
5 Plasmid isolation and restriction digestion 14
6 Sequence analysis 17
7 Agarose Gel Electrophoresis 19
8 Denaturing Polyacrylamide Gel Electrophoresis (PAGE) for nucleic acids 21
9 Silver staining of DNA Polyacrylamide gels 24
10 NBS Profiling 28
11 EcoTILLING 32
12 Promoter Mining 35
13 Tools for Genetic Diversity Analysis 38
14 RAPD and ISSR Analysis 48
15 Microsatellite (simple sequence Repeats) profiling 51
16 Multilocus Sequence Typing of bacteria 56
17 Rolling circle amplification-RACE (RCA-RACE) 64
18 Protocols in development and analysis of mutants for functional genomics 67
19 Quantitative RT-PCR 73
20 Loop mediated isothermal amplification (LAMP) 75
21 Two Dimensional Gel Electrophoresis 78
22 Bioinformatics - data mining tools, identification of microsatellite sites, EST analysis and annotation 83
23 Sequence - Based Marker Designing 87
Annexures
I General Conversion Tables and Formulae 88
II Gene tagging steps 89
III Bioanalyzer and Off Gel Fractionator 101


National training on Allele Mining 12th - 25th Sept, 2011, IISR, Calicut
Laboratory Manual

DNA/ RNA Isolation


Introduction
All molecular biology work starts from the genetic material of an organism, either DNA or RNA,
so isolating DNA/RNA of good quality is essential to the success of any experiment. The main role
of DNA is the long-term storage of information, in the form of triplet codons, carrying the
instructions needed to construct other components of cells such as proteins and RNA molecules.
The DNA segments that carry this genetic information are called genes; other DNA sequences have
structural purposes or are involved in regulating the use of the genetic information. Ribonucleic
acids (RNA) are crucial molecules in the central dogma of life and perform both structural and
functional roles. RNA molecules form the bridge between the stable genetic information contained
in DNA and the enzymes and proteins that carry out much of the metabolism within the cell. The
ribosomes, the sites of protein synthesis within the cell, are largely composed of ribonucleic
acids (rRNA), and tRNA molecules deliver the amino acid building blocks to the ribosomes. Of all
the RNA species, the nucleic acid intermediate, messenger RNA (mRNA), is the most desirable source
of material for biologists, since it reflects much of what is ultimately translated into enzymes
and proteins. High quality RNA is the starting material for studying qualitative and quantitative
changes in mRNA expression, in vitro translation, RNase protection assays, reverse
transcription-polymerase chain reaction (RT-PCR) and cloning. Gene specific primers can be
designed from sequence information available in the NCBI database and used to isolate genes by
RT-PCR.
1.1. DNA Isolation by modified CTAB method (Ausubel et al., 1995)
The protocol used for extraction of DNA from Piper leaf tissues is as follows:
1. Grind 5 g of young leaves in liquid nitrogen with a mortar and pestle and add 25 ml of
preheated (65 °C) CTAB buffer. Add 0.2% β-mercaptoethanol to the buffer just before use.
2. Incubate at 60 °C for 30 minutes.
3. Extract with an equal volume of chloroform:isoamyl alcohol (24:1) and centrifuge at 10,000 rpm
for 10 minutes at room temperature.
4. Take the aqueous phase and add 2/3 volume of ice-cold isopropanol.
5. Incubate at -20 °C for 2 hours and centrifuge (10,000 rpm, 15 minutes at 4 °C).
6. Discard the supernatant and invert the tube on a paper towel for a few minutes.
7. Dissolve the pellet in 1.5 ml of TE buffer at room temperature overnight.
8. Add RNase A to 10 µg/ml and incubate at 37 °C for 30 minutes.
9. Add an equal volume of Tris-saturated phenol, mix well and centrifuge at 10,000 rpm for 10
minutes.
10. To the aqueous phase add an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1),
shake and centrifuge at 10,000 rpm for 10 minutes.
11. Take the aqueous phase and add an equal volume of chloroform:isoamyl alcohol (24:1), shake
and centrifuge at 10,000 rpm for 10 minutes.
12. To the aqueous phase add one-tenth volume of 3 M sodium acetate (pH 5.2) and 2.5 volumes
of ethanol and incubate at -20 °C for one hour or at -70 °C for 30 minutes.
13. Centrifuge at 10,000 rpm for 10 minutes and wash the pellet in 70% ethanol (10,000 rpm for
5 minutes).
14. Air dry the pellet, dissolve it in 200 µl TE and estimate the yield.
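The precipitation steps in this protocol are given as fractions of the recovered aqueous phase, so the actual volumes depend on how much aqueous phase is collected. The following Python sketch shows that arithmetic. The function names are hypothetical, and it assumes that both the sodium acetate and the ethanol volumes in step 12 are taken relative to the aqueous phase.

```python
def isopropanol_volume_ml(aqueous_ml):
    """Step 4: add 2/3 volume of ice-cold isopropanol to the aqueous phase."""
    return aqueous_ml * 2 / 3


def ethanol_precipitation_ml(aqueous_ml):
    """Step 12: 1/10 volume of 3 M sodium acetate and 2.5 volumes of ethanol.

    Assumption: both volumes are calculated relative to the aqueous phase.
    """
    sodium_acetate_ml = aqueous_ml / 10
    ethanol_ml = aqueous_ml * 2.5
    return sodium_acetate_ml, ethanol_ml


if __name__ == "__main__":
    # Hypothetical recoveries: 15 ml after step 3 and 1.2 ml after step 11.
    print(f"Isopropanol: {isopropanol_volume_ml(15.0):.1f} ml")
    naoac, etoh = ethanol_precipitation_ml(1.2)
    print(f"Sodium acetate: {naoac:.2f} ml, ethanol: {etoh:.1f} ml")
```

Measure the recovered aqueous phase with the pipette at each transfer and use that value as the input, since the recovery varies between samples.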
1.1.2. Quantification of DNA
The amount of DNA in the sample is estimated with a UV spectrophotometer, biophotometer,
NanoDrop or similar instrument, all of which measure the optical density (OD) at 260 nm. DNA
shows a clear absorbance peak at 260 nm, and an OD260 of 1.0 is taken as equivalent to 50 µg/ml
of double-stranded DNA. A DNA solution is considered pure if the OD260:OD280 ratio is about 1.8.
Check the quality of the DNA on a 0.8% agarose gel. Store the DNA at -20 °C until further use.
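The quantification rule in this section (1.0 OD at 260 nm taken as 50 µg/ml, purity judged from the OD260:OD280 ratio) can be written as a short calculation. The sketch below is illustrative only. The function names and the example readings are hypothetical, and the dilution factor is an added parameter for samples that were diluted before reading.

```python
UG_PER_ML_PER_OD260 = 50.0  # 1.0 OD260 is taken as 50 µg/ml of DNA


def dna_concentration_ug_per_ml(a260, dilution_factor=1.0):
    """Concentration of the undiluted DNA sample in µg/ml."""
    return a260 * UG_PER_ML_PER_OD260 * dilution_factor


def purity_ratio(a260, a280):
    """OD260:OD280 ratio; about 1.8 is expected for pure DNA."""
    return a260 / a280


def total_yield_ug(conc_ug_per_ml, volume_ul):
    """Total DNA (µg) in the given volume of solution (µl)."""
    return conc_ug_per_ml * volume_ul / 1000.0


if __name__ == "__main__":
    # Hypothetical reading of a 1:20 dilution of DNA dissolved in 200 µl TE.
    conc = dna_concentration_ug_per_ml(a260=0.25, dilution_factor=20)
    print(f"Concentration: {conc:.0f} µg/ml")
    print(f"Yield: {total_yield_ug(conc, 200):.0f} µg")
    print(f"OD260:OD280 = {purity_ratio(0.25, 0.139):.2f}")
```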
1.2. RNA isolation using TRI-Reagent
TRI Reagent is a mixture of guanidine thiocyanate and phenol in a mono-phase solution,
which effectively dissolves RNA, DNA and protein on homogenization or lysis of tissue sample.
After adding chloroform and centrifuging, the mixture separates into 3 phases: an aqueous phase
containing the RNA, the interphase containing DNA and an organic phase containing proteins. Each
component can be isolated after separating the phases. One ml of TRI Reagent is sufficient to isolate
RNA, DNA and protein from 50-100 mg of leaf tissue.
This is one of the most effective methods for isolating total RNA and can be completed in
one hour starting with fresh tissue. The procedure is effective for isolating RNA molecules of all
types from 0.1 to 15 kb in length. The resulting RNA is intact, with little or no contaminating DNA
or protein. This RNA can be used for northern blots, mRNA isolation, in vitro translation, RNase
protection assays, cloning and reverse transcription-polymerase chain reaction (RT-PCR).
Materials required
Sterile powder free nitrile gloves, refrigerated centrifuge, vortex, autoclavable polythene
covers, DEPC treated and autoclaved microfuge tubes, microtips, pestle and mortar.
Reagents required
TRI Reagent (Sigma), chloroform, iso-propanol, 75% ethanol prepared using DEPC treated
and autoclaved water, DEPC treated and autoclaved water or RNA re-suspension solution (Ambion)
to dissolve the RNA pellet, RNaseZAP.
Steps in RNA isolation
1. Grind 100 mg of leaf sample to a fine powder in liquid nitrogen, transfer it to a 1.5 ml DEPC
treated, sterile microfuge tube and add 1 ml of TRI Reagent.
2. Shake vigorously to mix the TRI Reagent homogeneously with the sample and keep the sample at
4 °C until all the samples are homogenized.
3. Incubate the samples at room temperature for 5 min to ensure complete dissociation of
nucleoprotein complexes and release of RNA, mediated by the guanidine thiocyanate and phenol
present in the TRI Reagent.
4. Centrifuge the samples at 12,000 rpm for 10 min at 4 °C. In this step, insoluble materials
such as cellular debris, extracellular membranes, high molecular weight DNA (>20 kb) and most
polysaccharides sediment at the bottom of the microfuge tube. The RNA, low molecular weight DNA
and protein remain in the supernatant.
5. Carefully transfer the supernatant to a fresh microfuge tube and add 200 µl of
chloroform for every 1 ml of TRI Reagent used in the sample preparation.
6. Shake vigorously for 15 s and incubate at room temperature for 5-10 min.
7. Centrifuge at 12,000 rpm for 15 min at 4 °C. The centrifugation separates the mixture into 3
phases: a red organic phenol phase containing protein, an interphase containing DNA and a
colorless upper aqueous phase containing RNA.
8. Transfer the upper aqueous phase containing RNA to a fresh microfuge tube and precipitate the
RNA by adding 500 µl of isopropanol. Incubate the samples at room temperature for 5 min.
9. Centrifuge at 12,000 rpm for 15 min at 4 °C to pellet the RNA.
10. Decant the supernatant and wash the pellet with 75% ethanol prepared with DEPC treated sterile
water.
11. Centrifuge at 12,000 rpm for 10 min at 4 °C to pellet the RNA.
12. Air dry the pellet for 10 min and dissolve the RNA in 50 µl of DEPC treated water.
13. Check the quality of the RNA on a 1% agarose gel.
14. Quantify the RNA in a spectrophotometer (260/280 nm). The 260/280 ratio should be 1.9 to 2.2,
which indicates good quality RNA.
Quantify the RNA using the following formula:
RNA (µg/µl) = (40 x dilution factor x absorbance at 260 nm)/1000
* The A260/A280 ratio should be close to 2, indicating little or no contamination with protein and
polysaccharides; however, because of variation in starting materials and individual practices, the
expected ratio ranges from 1.7 to 2.2.
* If the A260/A280 ratio is lower than 1.7, the RNA should be purified again. In most cases this is
due to protein contamination, which occurs when some organic phase is carried over while the
aqueous phase is collected.
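The RNA formula and the A260/A280 acceptance limit given here can be combined into a small check. This is a minimal sketch with hypothetical function names and example readings; the 1.7 cut-off is the lower limit quoted in the notes of this section.

```python
def rna_concentration_ug_per_ul(a260, dilution_factor=1.0):
    """RNA (µg/µl) = (40 x dilution factor x A260) / 1000."""
    return 40.0 * dilution_factor * a260 / 1000.0


def needs_repurification(a260, a280, lower_limit=1.7):
    """True when the A260/A280 ratio falls below the accepted lower limit."""
    return (a260 / a280) < lower_limit


if __name__ == "__main__":
    # Hypothetical readings of a 1:50 dilution.
    print(f"RNA: {rna_concentration_ug_per_ul(0.5, 50):.2f} µg/µl")
    print("Purify again:", needs_repurification(a260=0.5, a280=0.33))
```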
Quality of RNA
The quality and integrity of RNA is judged by the intactness of the 25S and 18S ribosomal
RNA bands in an agarose gel (1.5%).
Notes and Precautions
1. Treat the plasticware (micro-tips, micro-centrifuge tubes, pestle, mortar and other necessary
items) with 0.1% diethyl pyrocarbonate (DEPC) overnight and autoclave for 2 hours.
2. Use separate pipettes for RNA work.
3. Plastic gloves or powder free nitrile gloves should be worn at all times during isolation and
handling of RNA to avoid contamination of samples with RNases.
4. Perform RNA isolation in a dust free environment.
5. The use of RNaseZAP to wipe the surfaces as well as pipettes is recommended to inactivate
RNases.
6. Keep all the kit components tightly sealed when not in use. Tubes with RNA
should be tightly closed during enzymatic reactions.
7. Use certified reagents, including high quality water (DEPC-treated water etc.).
8. Guanidine isothiocyanate (in Trizol) is a strong protein denaturant capable of dissolving most
cell constituents; it dissociates nucleoproteins and releases RNA.
9. Polysaccharides form a whitish gel-like pellet if the tissue used contains a high level of
polysaccharides.
10. The RNA pellet should be white. The presence of an off-white, gel-like pellet indicates
contamination by polysaccharides.
11. Naturally occurring polysaccharides and polyphenols released during cell disruption form
complexes with nucleic acids during tissue extraction and co-precipitate during the subsequent
ethanol/isopropanol precipitation steps. Depending on the nature and quantity of these
contaminants, the resulting alcohol precipitate can be gelatinous and difficult to dissolve.
12. Organic solvents such as phenol dissociate RNA from protein; the difference in hydrophobicity
between RNA and protein is then exploited to separate them into two phases.
The isolated RNA can be used for RT- PCR amplification using gene specific primers either
for isolating ESTs or for diagnostics.


1.2. Reverse Transcription-PCR (RT-PCR)


Polymerase Chain Reaction
The polymerase chain reaction (PCR) is a primer-mediated enzymatic amplification of
specifically cloned or genomic DNA sequences. It was invented two decades ago and has
revolutionized molecular biology research worldwide. The template DNA contains the target
sequence, which may be tens of thousands of nucleotides in length. A thermostable DNA polymerase
such as Taq DNA polymerase catalyzes the buffered reaction in which an excess of an
oligonucleotide primer pair and four deoxynucleoside triphosphates (dNTPs) are used to make
millions of copies of the target sequence.
PCR Process
Three fundamental steps define one PCR cycle:
1. Denaturation of the double-stranded DNA template.
2. Annealing of two oligonucleotide primers to the single-stranded template.
3. Enzymatic extension of the primers, producing copies that can serve as templates in subsequent
cycles.
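Because the copies made in one cycle serve as templates in the next, amplification is exponential, which is how a few template molecules become millions of copies. The sketch below illustrates this under an idealized model. The per-cycle `efficiency` parameter (1.0 means perfect doubling) and the function names are assumptions introduced for illustration, and real reactions plateau in later cycles.

```python
import math


def theoretical_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of PCR cycles.

    efficiency=1.0 means every template is copied in every cycle.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_to_reach(target_copies, initial_copies=1, efficiency=1.0):
    """Smallest whole number of cycles giving at least target_copies."""
    ratio = target_copies / initial_copies
    return math.ceil(math.log(ratio) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    print(f"Copies after 20 cycles: {theoretical_copies(1, 20):.0f}")
    print("Cycles for one million copies:", cycles_to_reach(1e6))
```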
RT-PCR (Reverse Transcription-Polymerase Chain Reaction)
In biochemistry, a reverse transcriptase, also known as RNA-dependent DNA polymerase, is
a DNA polymerase enzyme that transcribes single-stranded RNA into single-stranded DNA. Normal
transcription involves the synthesis of RNA from DNA, hence reverse transcription is the reverse of
this. It was discovered by Howard Temin at the University of Wisconsin-Madison, and independently
by David Baltimore in 1970. The two shared the 1975 Nobel Prize in Physiology or Medicine with
Renato Dulbecco for their discovery.
Commonly used reverse transcriptase enzymes include:
1. M-MLV reverse transcriptase from the Moloney murine leukemia virus
2. AMV reverse transcriptase from the avian myeloblastosis virus
Reverse transcription Process
Reverse transcription (RT reaction) is a process in which single-stranded RNA is reverse
transcribed into complementary DNA (cDNA) using total cellular RNA or mRNA, a reverse
transcriptase enzyme, a primer, dNTPs and an RNase inhibitor. The resulting cDNA can be used as
template in PCR. The RT reaction is also called first strand cDNA synthesis. Traditionally, RT-PCR
involves two steps: the RT reaction and PCR amplification. RT-PCR can also be carried out as
one-step RT-PCR, in which all reaction components are mixed in one tube before starting the
reactions. Although one-step RT-PCR is simple and convenient and minimizes the possibility of
contamination, the resulting cDNA cannot be used repeatedly as in two-step RT-PCR. Three types of
primers can be used for the RT reaction: oligo(dT) primers complementary to the poly(A) tail of
mRNA, random (hexamer) primers and gene specific primers, each having its pros and cons.
Protocol:
Reverse Transcription (20 µl reaction)
1. In a 0.2 ml thin walled PCR tube, prepare the following reaction mix on ice:
Total RNA (10 ng-5 µg) : variable
Oligo(dT)18 primer (0.5 µg/µl) : 1 µl
DEPC treated water : up to 12 µl
Mix gently and spin briefly in a microfuge.
2. Incubate the mixture at 70 °C for 5 min to denature RNA secondary structure and
immediately chill on ice to keep the RNA denatured. Spin briefly in a microfuge.
3. Place the tube on ice and add the following components:
5X RT reaction buffer : 4 µl
Ribonuclease inhibitor (40 U/µl) : 0.5 µl
10 mM dNTP mix : 2 µl
M-MuLV RT enzyme (200 U/µl) : 0.25 µl
RNase free water : up to 20 µl
Mix gently and spin briefly in a microfuge.
4. Incubate the mixture at 42 °C for 60-90 min to anneal and extend the primers.
5. Incubate at 70 °C for 10 min to inactivate the enzyme and chill on ice.
6. Store the cDNA synthesis reactions at -20 °C.
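In this protocol the RNA volume is variable and water makes up the difference, first to 12 µl and then to 20 µl. The sketch below works out those volumes from the RNA concentration. It is only an illustration: the function name and the example concentration are hypothetical, and the component volumes are copied from the reaction tables of this protocol.

```python
STEP1_TOTAL_UL = 12.0   # RNA + primer + water before denaturation
FINAL_TOTAL_UL = 20.0   # complete RT reaction
PRIMER_UL = 1.0         # oligo(dT)18 primer
STEP3_COMPONENTS_UL = {
    "5X RT buffer": 4.0,
    "RNase inhibitor": 0.5,
    "10 mM dNTP mix": 2.0,
    "RT enzyme": 0.25,
}


def rt_setup(rna_conc_ug_per_ul, rna_mass_ug):
    """Return RNA and water volumes (µl) for a 20 µl RT reaction."""
    rna_ul = rna_mass_ug / rna_conc_ug_per_ul
    water_step1_ul = STEP1_TOTAL_UL - PRIMER_UL - rna_ul
    if water_step1_ul < 0:
        raise ValueError("RNA is too dilute to fit in the 12 µl first mix")
    water_step3_ul = (FINAL_TOTAL_UL - STEP1_TOTAL_UL
                      - sum(STEP3_COMPONENTS_UL.values()))
    return {"rna_ul": rna_ul,
            "water_step1_ul": water_step1_ul,
            "water_step3_ul": water_step3_ul}


if __name__ == "__main__":
    # Hypothetical sample: 1 µg of RNA from a 0.5 µg/µl preparation.
    print(rt_setup(rna_conc_ug_per_ul=0.5, rna_mass_ug=1.0))
```

The error branch flags preparations that are too dilute to deliver the chosen RNA mass within the 12 µl first mix; in that case a smaller input mass or a more concentrated sample is needed.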
PCR amplification with Taq DNA polymerase (25 µl reaction)
1. Use only 10% of the cDNA synthesis reaction (2 µl) for PCR and carry out the reaction using
Taq polymerase enzyme.
2. Add the following components to the PCR tube:
Sterile water : 16.67 µl
10X Taq PCR buffer : 2.5 µl
10 mM dNTP mix : 2.5 µl
10 µM primer (forward) : 0.5 µl
10 µM primer (reverse) : 0.5 µl
Taq polymerase enzyme (3 U/µl) : 0.33 µl
Template DNA (here cDNA) : 2 µl
3. Mix gently, centrifuge briefly and perform 40 cycles of PCR with conditions optimized for
the sample.
4. Carry out the reaction in a thermal cycler with the following specifications:
Step 1 : Initial denaturation (95 °C) : 10 min
Step 2 : Denaturation (94 °C) : 1 min
Primer annealing (XX °C) : 1 min
Primer extension (72 °C) : 1 min
Repeat step 2 for 40 cycles
Step 3 : Final primer extension (72 °C) : 15 min
At the end, set the thermal cycler to hold at 4 °C.
Run the PCR products on a 1.5% agarose gel stained with ethidium bromide and visualize the
samples under UV light. The specific product is then eluted and cloned for downstream
applications.
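The reaction table and cycling program of this protocol can be checked with two short calculations: the final concentration of each stock in the 25 µl reaction (C1V1 = C2V2) and the programmed hold time of the run. The sketch below uses hypothetical names and ignores the temperature ramping time of the thermal cycler, so the real run will be somewhat longer.

```python
PCR_MIX_UL = {
    "sterile water": 16.67,
    "10X Taq PCR buffer": 2.5,
    "10 mM dNTP mix": 2.5,
    "10 uM forward primer": 0.5,
    "10 uM reverse primer": 0.5,
    "Taq polymerase (3 U/ul)": 0.33,
    "cDNA template": 2.0,
}


def final_concentration(stock_conc, volume_ul, total_ul):
    """C2 = C1 * V1 / V2, in the same unit as the stock concentration."""
    return stock_conc * volume_ul / total_ul


def program_minutes(initial=10, per_cycle=(1, 1, 1), cycles=40, final=15):
    """Programmed hold time in minutes, excluding ramping between steps."""
    return initial + cycles * sum(per_cycle) + final


if __name__ == "__main__":
    total = sum(PCR_MIX_UL.values())
    print(f"Total volume: {total:.2f} µl")
    print(f"Each primer: {final_concentration(10, 0.5, total):.2f} µM")
    print(f"dNTP mix: {final_concentration(10, 2.5, total):.2f} mM")
    print(f"Taq per reaction: {3 * 0.33:.2f} U")
    print(f"Programmed time: {program_minutes()} min")
```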
Notes and Precautions
RNase contamination is always a concern when working with RNA. Both the laboratory environment
and all solutions have to be free of RNase.
General recommendations to avoid RNase contamination are as follows:
1. Follow the recommendations for preventing RNase contamination given under Notes and
Precautions in the RNA isolation section.
2. Use an RNase inhibitor to stabilize RNA.
3. Always assess the integrity of RNA prior to cDNA synthesis. If sharp bands of both the plant
18S rRNA and 25S rRNA are seen on denaturing agarose gel electrophoresis of total RNA, the
mRNA in the sample is considered to be intact.
Troubleshooting
No product
1. Make sure all the components have been thawed completely/mixed and added to reaction mix.
2. Check the integrity of RNA template.
3. Check the annealing and incubation temperatures in the RT step.
4. Check the quality of the oligo(dT)18 primer against other lots or sources.
5. Increase the number of cycles (by increments of 5).
6. Increase the amount of template RNA.
7. Repeat the experiment with freshly isolated RNA and new consumables, which have not been
opened and used before.

Non-specific product/High background
1. Reduce the number of cycles.
2. Reduce the volume of the RT reaction mix added to the PCR reaction in the two step protocol.
3. Increase the incubation time of the RT step.
4. Increase the time of the elongation cycle of the PCR step, but do not increase the time of
extension per cycle in the long range PCR program.
Low yield
1. Increase the amount of RT enzyme.
2. Increase the amount of template RNA.
3. Increase the volume of cDNA added to the PCR (max. 4 µl; two step protocol).
4. Increase the number of cycles (max. 40).
5. Increase the final dNTP concentration in the one step RT-PCR reaction mix up to a maximum of
500 µM.


1.3. Gel Elution Techniques


DNA electrophoresed through agarose or polyacrylamide gels has various downstream
applications. It is used as a primary or re-amplification template for PCR, and for hybridization,
sequencing, ligation and other molecular techniques. Commercially available agarose is not
completely pure. It contains impurities that affect the migration of DNA and the ability of
DNA recovered from the gel to serve as a substrate in enzymatic reactions. Hence it is essential to
use special molecular biology grade agarose, free of nucleases and inhibitors, for gel elution.
Different methods in gel extraction
Different methods are available for extracting DNA from an agarose gel. The method employed
depends on the consumables available in the lab and on the yield and purity of DNA required after
extraction.
S.No. Technique, followed by Method and Remarks

1. Electroelution
Method: a) The gel fragments are placed inside a dialysis bag along with electrophoresis buffer
and electroeluted; the trapped DNA can be recovered by precipitation. b) A small trough is cut
ahead of the migrating DNA band and the DNA is electrophoretically eluted onto
diethylaminoethyl (DEAE)-cellulose paper, dialysis tubing or an affinity membrane, or into a
space in the gel containing 0.3 M sodium acetate pH 6.0, 10% sucrose.
Remarks: Gel must be visualized with UV and constantly monitored to ensure collection of the
sample.

2. Freeze and squeeze method
Method: Freeze the gel piece in liquid nitrogen within a micropipette tip or centrifuge tube and
spin out the liquid by centrifugation.
Remarks: DNA quality is not assured.

3. Crush and soak method
Method: Add buffer to the agarose slice and squash with a glass rod. The slurry is placed at 37 °C
and centrifuged through siliconized glass wool or non-toxic polyallomer fibers.
Remarks: Co-elution of impurities.

4. Resin binding
Method: Bind the DNA to silica particles by using commercially available binding resins,
diatomaceous earth or glass fibers.
Remarks: Low yield.

5. Spin techniques
Method: Place the gel slice within a microfuge tube containing a membrane with a small pore size
and spin.
Remarks: Co-elution of contaminants.

6. Enzyme method
Method: Heat the gel slice to 65 °C, lower the temperature to 40 °C, and add GELase. Agarose is
degraded into multimeric subunits.
Remarks: DNA fragments purified in this way could not be enzymatically labeled by nick
translation.

7. Syringe squeeze
Method: Place the gel piece in a syringe containing glass wool at its mouth and squeeze with the
piston.
Remarks: DNA may not be sufficiently purified from contaminating agarose or buffer components
for further manipulations.

8. Resin binding coupled with spin techniques
Method: Bind the DNA to silica particles by using resins and spin in a microfuge tube containing
a membrane with a small pore size.
Remarks: Commercially exploited and widely used.

Typically, methods involving organic solvents, electroelution, or binding of the DNA to silica
particles or ion-exchange resins give quite pure DNA, but yields are relatively low. On the other
hand, high-yield techniques tend to be problematic in enzyme reactions.
Purification of PCR products using spin column
Agarose gel electrophoresis is ideal for separating PCR fragments of 200 bp to 10 kb; for
fragments smaller than 200 bp, a polyacrylamide gel is preferred. Resin binding coupled with spin
columns is ideal for downstream cloning applications, as it saves time and gives comparatively
higher recovery (80%). Reuse of agarose may affect the quality of the eluted DNA, hence a fresh
agarose gel is recommended for DNA extraction. Agarose melts at a temperature greater than the
melting temperature of DNA. Low melting point (LMP) agarose is now available, in which the
introduction of hydroxyethyl groups into the polysaccharide chain causes the agarose to gel at
approximately 30 °C and to melt at approximately 65 °C, which is well below the melting
temperature of dsDNA. LMP agarose prevents double strand denaturation and gives a higher yield
than normal agarose.
DNA fragments of interest are extracted from slices of an agarose gel by solubilizing the gel.
The gel solubilization solution contains a chaotropic agent such as guanidine thiocyanate, which
lowers the melting point of the gel so that it dissolves without the sample reaching the melting
temperature of the DNA. The molten gel is added to a silica column. Adsorption of DNA to the
membrane is efficient only at pH ≤ 7.5. Other impurities flow through the spin column. The resin
is washed with 70% ethanol to remove unwanted materials bound to the column. The DNA is then
eluted from the silica column at neutral pH by adding water or TE.
Protocol
1. Excise band
- Excise the band of interest with a sterile scalpel blade.
Note: If possible, set the trans-illuminator to long-wavelength UV (or low power) and minimize the
exposure time, because UV mutagenises DNA at a measurable rate. Trim off as much empty
agarose as possible.
- Place the excised band in a fresh microcentrifuge tube.
2. Weigh gel
- Weigh the gel slice, using an empty tube to tare the balance.
- Add 3 volumes of binding buffer (solubilization buffer). The binding buffer contains the
chaotropic salt.
3. Solubilize gel
- Incubate at 50 °C for 10 min (or until the gel slice has completely dissolved). To help dissolve
the gel, vortex the tube every 2-3 min during the incubation.
Note: Solubilize the agarose completely. For >2% gels, increase the incubation time and the volume
of solubilization buffer.
4. Add isopropanol
- After the gel slice has dissolved completely, add one gel volume of isopropanol to the sample
and mix by inverting the tube several times.
5. Prepare column
- Place the column in a 2 ml collection tube.
- Add 500 µl of column buffer to the spin column and centrifuge for 1 min.
Note: The column preparation solution maximizes the binding of DNA to the membrane, resulting in
more consistent yields.
- Discard the flow-through and place the column back in the same collection tube.
6. Bind DNA
- Load the solubilized gel solution onto the column and centrifuge at 13,000 rpm for 1 min.
- Discard the flow-through and place the column back in the same collection tube.
- For maximum recovery, transfer all traces of sample to the column. The maximum volume of
the column reservoir is 700 µl. If the sample volume is more than 700 µl, repeat step 6 using
700 µl fractions.
7. Wash column
- Wash the column with 600 µl of wash buffer and centrifuge at 13,000 rpm for 1 min.
- Discard the flow-through, place the column back in the collection tube and centrifuge the column
at 13,000 rpm for 1 min to remove any trace of wash buffer.
Note: Residual ethanol from the buffer will not be completely removed unless the flow-through is
discarded before this additional centrifugation.
8. Elute DNA
- Place the column into a fresh 1.5 ml microcentrifuge tube.
- Elute the DNA with 20 µl of elution buffer (10 mM Tris-Cl, pH 8.5) or sterile MilliQ water.
Add it to the center of the membrane, let the column stand for 1 min, and then centrifuge at
13,000 rpm for 1 min.
Note: Ensure that the elution buffer is dispensed directly onto the center of the membrane for
complete elution of bound DNA. The average eluate volume is 9 µl from 10 µl of elution buffer.
Elution efficiency is dependent on pH. The maximum elution efficiency is achieved
between pH 7.0 and 8.5. When using water, make sure that the pH value is within this range,
and store the DNA at -20 °C, as DNA may degrade in the absence of a buffering agent.
9. Quality checking
- Check both the quality and quantity of the eluted samples by electrophoresing them in an
agarose gel (0.8-2%, depending on the samples). A single sharp band of the required size
without any streaks indicates good quality DNA. The sample can be quantified by comparing
its intensity with that of the Mass ruler bands.
- Alternatively, the quality and quantity of the sample can be checked using a UV
spectrophotometer. Read the absorbance at 260 nm and 280 nm.
Purity of the DNA = A260/A280
A higher ratio indicates a purer DNA sample; an A260/A280 ratio between 1.8 and 2.0 is
acceptable.
Concentration of DNA (µg/ml) = Absorbance at 260 nm x 50 x dilution factor
10. Downstream application
- The eluted fragments are now ready for various downstream applications like cloning,
radioactive/non-radioactive labeling, hybridization and sequencing.
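The buffer, isopropanol and column-loading volumes in this protocol all follow from the weight of the gel slice. The sketch below shows one way to plan them. It assumes that 1 mg of gel corresponds to about 1 µl, which is a common working approximation and not stated in this manual, and it uses the 6:1 buffer ratio for gels above 2% agarose mentioned in the troubleshooting table. The function name is hypothetical.

```python
import math

COLUMN_CAPACITY_UL = 700.0  # maximum volume of the column reservoir


def gel_extraction_plan(gel_mg, agarose_percent=1.0):
    """Volumes (µl) and number of column loads for one gel slice.

    Assumption: 1 mg of gel is treated as 1 µl of gel volume.
    """
    gel_ul = float(gel_mg)
    buffer_ratio = 6 if agarose_percent > 2 else 3
    buffer_ul = buffer_ratio * gel_ul
    isopropanol_ul = gel_ul  # one gel volume
    total_ul = gel_ul + buffer_ul + isopropanol_ul
    loads = math.ceil(total_ul / COLUMN_CAPACITY_UL)
    return {"buffer_ul": buffer_ul,
            "isopropanol_ul": isopropanol_ul,
            "total_ul": total_ul,
            "column_loads": loads}


if __name__ == "__main__":
    # Hypothetical 200 mg slice from a 1% gel.
    print(gel_extraction_plan(200, agarose_percent=1.0))
```

A slice that needs more than one load simply repeats the binding step on the same column, as described in step 6.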
Troubleshooting:
Problem: Poor or low recovery
- Reason: Ratio of gel solubilization solution to gel is incorrect.
  Solution: Use a ratio of 3:1; for agarose gels >2%, use 6:1.
- Reason: Agarose gel is incompletely solubilised.
  Solution: Check that the incubation temperature is 50-60 °C.
- Reason: The pH of the electrophoresis buffer is too high, resulting in inefficient binding.
  Solution: Use fresh electrophoresis buffer. Check the colour of the gel solubilization solution.
  If it is purple to brown, add 10 µl of 3 M sodium acetate buffer.
- Reason: Wash solution did not contain ethanol.
  Solution: Check that ethanol was added to the wash solution and the container sealed tightly.
- Reason: The wrong volume of elution solution was used.
  Solution: Use 30-50 µl of elution buffer or water. Check that it completely covers the membrane.
- Reason: A gelatinous precipitate formed after the addition of isopropanol.
  Solution: Agarose was not dissolved prior to adding isopropanol. Incubate until the precipitate
  is completely dissolved.

Problem: Poor performance in downstream applications
- Reason: The eluate contains too much salt.
  Solution: Incubate the column for 5 min after adding wash solution, then spin.
- Reason: Residual ethanol eluted with the DNA.
  Solution: Re-centrifuge the column for 2 min after the wash step.
- Reason: Eluate is contaminated with agarose gel.
  Solution: The gel slice was incompletely solubilized. Add 500 µl of gel solubilization solution
  to the binding column, incubate for 1 min and centrifuge. Continue washing and elution.



1.4. Cloning of PCR Amplified DNA (T/A cloning)


Successful cloning of a polymerase chain reaction (PCR)-derived DNA fragment is a key step
for further analysis of the amplified DNA. It most often relies on ligation of the prepared insert
with a plasmid, followed by transformation of competent E. coli cells. Traditionally, DNA ligation
reaction conditions for cloning inserts into plasmid vectors vary according to the type of DNA
termini being ligated. The vast majority of vector-insert cloning reactions involve ligation of one
of the following types of DNA termini:
- 2-4 bp sticky end overhangs
- Blunt termini
- T/A single base overhang
In our experiment the third strategy will be followed to clone the PCR amplified product.
Principle
This cloning strategy is based on the fact that Taq, Tth, Tfl and other DNA polymerases that
lack 3'-5' exonuclease activity add a single adenine (dA) to the 3' termini of PCR products. T/A
cloning exploits this property. The plasmid vector used for cloning has been pre-cleaved with an
appropriate restriction enzyme and treated with terminal deoxynucleotidyl transferase to create
3'-ddT overhangs at both ends. The 3'-ddT overhangs prevent recircularization of the vector
during ligation, resulting in high cloning yields. The PCR fragment with 3'-dA overhangs is thus
ligated into the vector (with 3'-ddT overhangs), creating a circular molecule with two nicks. The
circular product can be used directly to transform E. coli cells with high efficiency. The DNA
insert can be readily excised from the versatile polylinker of pTZ57R/T and subcloned into other
vectors, or sequenced using standard M13/pUC primers.

Protocol
Major steps involved are:
I. Ligation
II. Transformation
III. Analysis of recombinant clones
InsTAclone PCR Cloning Kit (Fermentas, USA) will be used for direct one-step cloning of
our PCR-amplified DNA fragments. We will be using only one fourth of the recommended reaction
volume mentioned in the kit.
I. Ligation (Day 1)
Reagents provided with the InsTAclone cloning kit:
TransformAid™ T-Solution A and B
TransformAid™ C-Medium
Vector pTZ57R/T
5X Ligation buffer
T4 DNA Ligase (5 U/µl)
Nuclease free water

1. Calculate the volume of PCR fragment required for a ligation reaction with 0.0375 µg
(0.045 pmol ends) of vector, using the following formula:

Volume of PCR fragment, X (µl) = (Size of the PCR fragment (bp) x 0.000045) / Concentration of PCR product (µg/µl)

X is the volume containing 0.135 pmol ends of PCR fragment, i.e. a 1:3 vector:insert ratio.
(Here 0.000045 is the amount of DNA, in µg, required for 0.135 pmol ends from a fragment 1 bp in
size.)
2. Set up the ligation reaction in a 0.2 ml PCR tube on ice as follows:
Vector pTZ57R/T (0.0375 µg, 0.045 pmol ends) : 0.75 µl
5X Ligation Buffer : 1.5 µl
T4 DNA ligase (5 U/µl) : 0.30 µl
PCR fragment (0.135 pmol ends) : X µl
Nuclease free water : up to 7.5 µl
3. Incubate at 22 °C for 2 hr (for maximum yield of useful recombinants, the reaction time can
be extended to overnight). Store at 4 °C after incubation.
Note: For the transformation work on Day 2, streak an LB agar plate (without ampicillin) with a loopful of
glycerol stock culture of E. coli strain DH5α.*
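The calculation in step 1 and the water make-up in step 2 can be sketched in a few lines of Python. This is only an illustrative helper, not part of the kit protocol: the function names and the example fragment (1000 bp at 0.05 µg/µl) are hypothetical, while the factor 0.000045 and the reagent volumes are taken from the steps above.

```python
def insert_volume_ul(fragment_size_bp, pcr_conc_ug_per_ul, ug_per_bp=0.000045):
    """Volume (µl) of PCR product that supplies 0.135 pmol ends of insert.

    ug_per_bp is the factor from step 1 (µg of DNA giving 0.135 pmol ends
    for a 1 bp fragment).
    """
    if fragment_size_bp <= 0 or pcr_conc_ug_per_ul <= 0:
        raise ValueError("fragment size and concentration must be positive")
    return fragment_size_bp * ug_per_bp / pcr_conc_ug_per_ul


def water_volume_ul(insert_ul, total_ul=7.5, vector_ul=0.75,
                    buffer_ul=1.5, ligase_ul=0.30):
    """Nuclease free water needed to bring the quarter-scale reaction to total_ul."""
    water = total_ul - (vector_ul + buffer_ul + ligase_ul + insert_ul)
    if water < 0:
        raise ValueError("insert volume exceeds the reaction volume; "
                         "a more concentrated PCR product is needed")
    return water


# Hypothetical example: a 1000 bp fragment at 0.05 µg/µl
x = insert_volume_ul(1000, 0.05)
print(f"PCR fragment: {x:.2f} µl, water: {water_volume_ul(x):.2f} µl")
```

If the computed insert volume does not fit into the 7.5 µl reaction, the helper raises an error instead of returning a negative water volume.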
II. Transformation (Day 2)
Pre-Preparations:
Luria-Bertani (LB) agar medium
- Weigh 4 g of ready-made LB-Agar powder and dissolve in 100 ml of d.H2O in a 250 ml conical
flask, plug with cotton and autoclave for 20 min.
- After autoclaving, allow the solution to cool down to ~55 °C, then add 100 µl of ampicillin stock
solution (50 mg/ml) for a final concentration of 50 µg/ml.
- Mix without producing air bubbles and pour 20-25 ml of the medium into each plate.
- Let the medium solidify completely (it will take ~30 min).
- Spread 40 µl each of IPTG and X-gal from stock solutions evenly on the surface of the medium.
- Warm the plates at 37 °C for at least 20 min before use.
LB broth
- Dissolve 2.5 g of ready-made LB-Broth powder in 100 ml of d.H2O, transfer 3 ml aliquots into
25 ml screw cap culture tubes and autoclave for 20 min.
- Before use, add 3 µl of ampicillin stock solution to each tube.*
Stock solutions
Ampicillin (50 mg/ml):
- Dissolve 50 mg of ampicillin in 1 ml of sterile MilliQ H2O and store at -20 °C
after use.
X-Gal (20 mg/ml):
- Dissolve 20 mg of X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside, Fermentas) in 1
ml of N,N-dimethylformamide. Wrap the tube containing the solution in
aluminium foil to prevent damage by light and store at -20 °C.
IPTG (24 mg/ml):
- Dissolve 24 mg of IPTG (isopropyl-β-D-thiogalactopyranoside) in 1 ml of sterile MilliQ H2O and
store at -20 °C after use.
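The ampicillin additions above (100 µl of stock into 100 ml of LB agar, 3 µl into each 3 ml broth tube) follow from the dilution relation C1 x V1 = C2 x V2. A minimal sketch of that arithmetic is given below; the function name is hypothetical and the only inputs are the concentrations stated in this section.

```python
def stock_volume_ul(final_conc_ug_per_ml, final_volume_ml, stock_conc_mg_per_ml):
    """Volume of stock (µl) needed, from C1 * V1 = C2 * V2.

    A stock of 1 mg/ml is the same as 1 µg/µl, so the stock value can be
    used directly as µg per µl.
    """
    stock_ug_per_ul = stock_conc_mg_per_ml
    return final_conc_ug_per_ml * final_volume_ml / stock_ug_per_ul


# 50 µg/ml ampicillin from a 50 mg/ml stock
print(stock_volume_ul(50, 100, 50))  # LB agar, 100 ml
print(stock_volume_ul(50, 3, 50))    # LB broth, 3 ml tube
```

The same function can be reused for any other antibiotic or medium volume by changing the three arguments.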
Competent cell preparation and Transformation
1. Aliquot 1.5 ml of TransformAid C-medium into sterile 2 ml culture tubes (one tube is sufficient
for 3 transformations) and pre-warm at 37 °C.*
2. Transfer three to four well-grown, individual colonies (~4 x 4 mm size) from the overnight LB
plate into the pre-warmed C-medium using an inoculating loop.*
3. Incubate the tubes at 37 °C for 2 hr with vigorous shaking (~180 rpm).
4. Prepare TransformAid T-solution by mixing equal volumes of T-solution A and T-solution B
(420 µl of T-solution is needed for every 3 transformations). Mix well and keep on ice.
5. Transfer 1.5 ml of the 2 hr culture into a 1.5 ml microcentrifuge tube and keep on ice for 5 min.
6. Spin at 12,000 rpm for 1 min at 4 °C to pellet the cells.
7. Discard the medium and resuspend the cells in 300 µl of T-solution.
8. Incubate the tubes on ice for 5 min.
9. Spin at 12,000 rpm for 1 min at 4 °C and resuspend the cells in 120 µl of T-solution. Keep the
tubes on ice for 5 min.
10. Meanwhile, transfer 2.5 µl of the ligation mixture to a fresh, labeled PCR tube and keep on ice
for 5 min.
11. Add 40 µl of the resuspended cells (from step 9) to each tube containing ligation mixture, mix
gently and incubate on ice for 5 min.
12. Plate the cells on pre-warmed LB-Ampicillin-X-Gal-IPTG agar plates using a sterile spreader.*
13. Incubate the plates at 37 °C overnight.
III. Analysis of recombinant clones
Back streaking (Day 3)
1. From each plate, select 4-5 well-isolated white colonies and back streak them on a fresh, pre-
warmed LB-Ampicillin-X-Gal-IPTG agar plate.*
2. Incubate at 37 °C overnight.
Note: Colonies that contain active β-galactosidase (i.e., non-recombinants) are pale blue in the center
and dense blue at their periphery. White colonies (carrying a recombinant plasmid) occasionally show a
faint blue spot in the center, but they are colorless at the periphery.
Colony PCR confirmation (Day 4)
1. Observe the back streaked colonies, omit colonies looking bluish and mark 1 or 2 white
colonies for colony PCR.
2. Dispense 10 µl of sterile MilliQ water and 2.5 µl of 10X PCR buffer into fresh, labeled 0.2 ml
PCR tubes.
3. Transfer a small portion of the selected colony to the PCR tube (of step 2) using a sterile
micropipette tip and mix thoroughly by pipetting up and down several times. Keep the tubes
on ice.*
4. Prepare the PCR mix as follows (on ice):
dNTP mix (2.5 mM) : 2.0 µl
Forward Primer : 0.5 µl
Reverse Primer : 0.5 µl
MilliQ H2O : 9.2 µl
Taq DNA polymerase (3 U/µl) : 0.3 µl
5. Add 12.5 µl of PCR mix to each tube (of step 3), mix gently, spin briefly in a microfuge and
transfer the tubes to the PCR machine.
6. After PCR, run the amplified products along with a molecular weight marker in a 1.8 % agarose gel
and verify their size.
7. Select colonies which give a PCR product of the desired size (same size as that of the insert).
Note: Inoculate 3 ml of pre-warmed LB broth-ampicillin medium (in a 25 ml screw cap culture tube)
with a small portion of a selected colony and incubate at 37 °C with shaking (180 rpm) overnight for
plasmid preparation.
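When several colonies are screened, the PCR mix of step 4 is usually prepared as one master mix and dispensed 12.5 µl per tube. The sketch below scales the per-reaction volumes listed above; the 10% pipetting overage and the function name are assumptions added for illustration and are not part of the protocol.

```python
# Per-reaction volumes (µl) from step 4 of the colony PCR protocol
PER_REACTION_UL = {
    "dNTP mix (2.5 mM)": 2.0,
    "Forward primer": 0.5,
    "Reverse primer": 0.5,
    "MilliQ H2O": 9.2,
    "Taq DNA polymerase (3 U/µl)": 0.3,
}


def master_mix(n_reactions, overage=0.10):
    """Scale each component for n_reactions, plus a fractional overage
    (assumed 10% here) to cover pipetting losses."""
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# Example: 8 colonies plus one no-template control
for name, vol in master_mix(9).items():
    print(f"{name:30s} {vol:6.2f} µl")
```

Setting overage=0 reproduces the exact volumes of the protocol multiplied by the number of tubes.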


1.5. Plasmid isolation and restriction digestion


1.5.1. Plasmid mini preparation
This protocol is designed for plasmid mini preparation using the GenElute Plasmid Miniprep Kit
(Sigma). The bacterial cells are harvested by centrifugation and subjected to a modified
alkaline-SDS lysis procedure, and the plasmid DNA is adsorbed onto silica in the presence of high salt.
Contaminants are then removed by a simple wash step. Bound DNA is eluted in water or Tris-EDTA buffer.
Reagents provided with GenElute Plasmid Miniprep Kit
Spin column assembly
Resuspension solution
Lysis solution
Neutralization solution
Column preparation solution
Wash solution concentrate
RNase A
Pre-Preparation
1. Resuspension solution: Add 78 µl of RNase A solution to the resuspension solution provided
prior to initial use, and store at 4 °C.
2. Wash solution: Dilute the wash solution concentrate with 100 ml of 100% ethanol prior to
initial use.
Plasmid Purification
1. Transfer the overnight grown bacterial culture into a 1.5 ml microcentrifuge tube and pellet the cells by
centrifugation at 13,000 rpm for 1 min at RT.
2. Decant the medium, add the remaining culture and spin again at 13,000 rpm for 1 min at RT.
3. Decant the medium completely by inverting the tubes on tissue paper.
4. Resuspend the bacterial pellet in 200 µl of resuspension solution by vortexing.
5. Add 200 µl of lysis solution, invert gently to mix and allow to clear for 5 min.
6. Add 350 µl of neutralization solution and mix by inverting 4-6 times.
7. Spin at 13,000 rpm for 10 min at RT to pellet the debris.
Column Preparation (to be completed before step 8)
a. Add 500 µl of column preparation solution to the binding column in a collection tube.
b. Spin at 13,000 rpm for 1 min at RT and discard the flow-through.
c. The column is now ready for DNA binding (cleared lysate from step 7 of plasmid purification).
8. Transfer the clear lysate to the binding column.
9. Spin at 13,000 rpm for 1 min at RT and discard the flow-through.
10. Add 750 µl of wash solution, spin at 12,000 rpm for 1 min and discard the flow-through.
11. Spin again at 13,000 rpm for 1 min at RT to dry the column, then transfer the column to a
fresh collection tube.
12. Add 30 µl of sterile MilliQ water to the centre of the column and allow it to stand for 1 min at
RT.
13. Spin at 13,000 rpm for 1 min at RT and collect the eluted plasmid DNA.
14. Check the concentration of plasmid DNA by running a 2 µl aliquot in a 1.5 % agarose gel.
1.5.2. Restriction Digestion of plasmid DNA
Restriction enzymes are selected based on the restriction map (Fig. 1) of the vector pTZ57R/T
and on the presumption that the selected restriction sites do not occur within our DNA insert.

Fig. 1. Restriction map of vector pTZ57R/T.

1. Set up the restriction reaction as follows:
10X Restriction enzyme buffer (Tango) : 2.5 µl
Sterile MilliQ H2O (up to 25 µl) : 11.5 µl
Hind III (10 U/µl) : 0.5 µl
Xba I (10 U/µl) : 0.5 µl
Plasmid DNA (~200 ng/µl) : 5.0 µl
2. Mix gently and incubate at 37 °C for 2 h, then heat inactivate the enzymes at 65 °C for 10 min.
3. Run the digest in a 1.8 % agarose gel and check for the release of an insert of the desired size.
After confirmation by restriction digestion, the remaining plasmid DNA can be used for further
downstream applications.

Troubleshooting:
1. Problem: Few or no transformants
(i) Possible cause: Poor quality of competent cells.
Remedy: Perform a test transformation using control plasmid DNA.
(ii) Possible cause: Unsuccessful ligation.
Remedy: Perform a test ligation using the control PCR DNA fragment.
(iii) Possible cause: Use of a DNA polymerase with proof-reading activity.
Remedy: Perform PCR with Taq DNA polymerase.
2. Problem: High background of non-recombinants
(i) Possible cause: Low antibiotic concentration.
Remedy: Use freshly prepared antibiotic and store it at -20 °C after each use.
(ii) Possible cause: Ratio of vector to PCR fragment is too high.
Remedy: Adjust the ratio to the optimum.
3. Problem: Low quality plasmid DNA
(i) Possible cause: The bacterial pellet may not have been fully resuspended.
Remedy: Proper resuspension of the bacterial pellet is critical for the removal of cellular
contaminants. Vortex the bacterial pellet for at least 30 seconds.
4. Problem: Low plasmid DNA yield
(i) Possible cause: The plasmid did not propagate.
Remedy: Make sure that the appropriate antibiotic was included during all stages of growth.
(ii) Possible cause: The cell resuspension was incomplete.
Remedy: Vortex the bacterial pellet for at least 30 seconds. Check for a homogeneous solution with no
apparent cell clumps.
(iii) Possible cause: The lysate was not incubated long enough.
Remedy: Make sure that the lysate is incubated for the full 5 min (but not longer than 5 min).
Note: Steps marked with * should be carried out in the laminar air flow chamber only.

1.6. Sequence analysis


The plasmids isolated from positive clones will be sequenced at a commercial
sequencing facility (1st BASE, Selangor Darul Ehsan, Malaysia) by single pass sequencing on an
ABI PRISM 377 DNA sequencer, using the BigDye Terminator Cycle Sequencing Kit v3.0 / v3.1
(Perkin Elmer). The primer M13F (-20) will be used for the single pass reaction, while M13R will
additionally be used for bidirectional sequencing where the target sequence is >600 bp.
Sequence editing
The sequences received from the sequencing firm will be in two formats, viz., chromatogram and
plain text (Notepad). Each sequence will be searched for the primer binding sites, either with the find
option of the chromatogram viewer or, after importing the sequence into a word/text document, with the
Ctrl+F option. Neither method will locate a primer binding position unless the sequence matches the
primer sequence perfectly. In that situation, the raw sequence will be aligned with the forward primer
used in the PCR amplification using the ClustalW multiple sequence alignment tool. If the primer
binding site is still not found by any of these three methods, the sequence is converted to its
reverse complement and the primer binding site is searched for again using any of the above methods.
The reverse primer binding site will be identified in the same way. After the primer binding sites
have been identified, the sequences flanking the forward and reverse primer binding sites will be
trimmed. The complete sequence of the target insert is obtained in this manner, and its deduced amino
acid sequence is obtained by protein translation. An online tool such as VecScreen can be used where
it is difficult to locate the exact position of the primer binding sites in a sequence.
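The search-and-trim logic described above can be sketched in Python as follows. This is only an illustration of the exact-match case (the equivalent of the find / Ctrl+F search, with the reverse complement tried as a fallback); it does not replace the ClustalW alignment needed when a read contains mismatches. The function names and the toy sequences are hypothetical, and keeping the primer sequences in the trimmed product is an assumption.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def trim_to_insert(read, fwd_primer, rev_primer):
    """Return the read trimmed to the region bounded by the two primer
    binding sites (primers included), or None if no exact match is found.

    The read is tried as received and then as its reverse complement.
    """
    fwd = fwd_primer.upper()
    # On the forward strand the reverse primer appears as its reverse complement.
    rev_site = reverse_complement(rev_primer.upper())
    read = read.upper()
    for candidate in (read, reverse_complement(read)):
        start = candidate.find(fwd)
        if start == -1:
            continue
        end = candidate.find(rev_site, start + len(fwd))
        if end == -1:
            continue
        return candidate[start:end + len(rev_site)]
    return None


# Toy example: vector flanks GGGG / TTTT around a short insert
raw_read = "GGGGATGGCCAAATTTCCCGCCTAATTTT"
print(trim_to_insert(raw_read, "ATGGCC", "TTAGGC"))
```

A return value of None signals that the read should be aligned against the primers instead, as described in the text.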


NCBI Blast search for sequence similarity
After editing, an NCBI nucleotide BLAST search will be performed with the edited
sequences. The BLAST results are displayed in three forms, viz., graphical view, hit table
and pairwise alignments. Selected hit sequences will be saved in FASTA format in a text (Notepad) file,
and the query sequences will be copied into the same file for further comparative studies.
List of URLs for the database searches and analysis
Database : URL
Nucleotide sequences
GenBank : www.ncbi.nlm.nih.gov/Genbank
EMBL : www.ebi.ac.uk/embl
DDBJ : www.ddbj.nig.ac.jp
Genome sequences
Entrez genomes : www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Genome
GeneCensus : bioinfo.mbb.yale.edu/genome
COGs : www.ncbi.nlm.nih.gov/COG
Integrated databases
InterPro : www.ebi.ac.uk/interpro
Sequence retrieval system (SRS) : www.expasy.ch/srs5
Entrez : www.ncbi.nlm.nih.gov/Entrez
Protein sequence (primary)
SWISS-PROT : www.expasy.ch/sprot/sprot-top.html
PIR-International : www.mips.biochem.mpg.de/proj/protseqdb
Protein sequence (composite)
OWL : www.bioinf.man.ac.uk/dbbrowser/OWL
NRDB : www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Protein
Protein sequence (secondary)
PROSITE : www.expasy.ch/prosite
PRINTS : www.bioinf.man.ac.uk/dbbrowser/PRINTS/PRINTS.html
Pfam : www.sanger.ac.uk/Pfam/
Macromolecular structures
Protein Data Bank (PDB) : www.rcsb.org/pdb
Nucleic Acids Database (NDB) : ndbserver.rutgers.edu/
HIV Protease Database : www.ncifcrf.gov/CRYS/HIVdb/NEW_DATABASE
ReLiBase : www2.ebi.ac.uk:8081/home.html
PDBsum : www.biochem.ucl.ac.uk/bsm/pdbsum
CATH : www.biochem.ucl.ac.uk/bsm/cath
SCOP : scop.mrc-lmb.cam.ac.uk/scop
FSSP : www2.embl-ebi.ac.uk/dali/fssp


Agarose Gel Electrophoresis



This is one of the routinely used techniques in molecular biology. This is used to separate DNA
fragments and to assess the quality and quantity of DNA.

Principle
The gel is made from agarose, a highly purified form of the polysaccharide used to make the agar
plates on which bacteria are grown. The gel is immersed in buffer, the DNA fragments are loaded
into wells at one end of the gel and they are made to move through the gel by applying an electric
current. DNA is negatively charged and so moves towards the positive electrode (anode). The
polysaccharide matrix of the gel retards the DNA by a sieving effect, so that small fragments move
through faster and the fragments separate according to size.

The DNA is visualised by adding ethidium bromide (EtBr), a fluorescent molecule which intercalates
between the DNA bases, extending the length of linear and nicked circular DNA molecules and making
them more rigid. UV radiation at 254 nm is absorbed by the DNA and
transmitted to the bound dye, and the energy is re-emitted at 590 nm in the red-orange region of the
spectrum. Ethidium bromide is a powerful mutagen and hence the gel should be handled carefully,
wearing gloves. The DNA bands can be visualised under UV and recorded with a gel documentation
system.

Characteristic features of gel electrophoresis are:
1. The molecular weight of the DNA: The migration rate of linear DNA is inversely proportional to the
log10 of its molecular weight.
2. Agarose concentration: The migration rate decreases as the agarose concentration increases.
3. Conformation of the DNA: Under typical conditions the supercoiled form travels fastest, the nicked
(open circular) form slowest and the linear form in between.
4. Applied voltage: Typical value is 5 volts per cm. The heat generated during electrophoresis is
dissipated by the buffer.
5. DNA, being polyanionic at neutral pH, migrates towards the anode.
6. The loading dye for DNA contains glycerol, which gives density to help the sample sink to the
bottom of the well, and the marker dyes xylene cyanol and bromophenol blue. Bromophenol blue moves
on par with 300-400 bp DNA and xylene cyanol with 2-3 kb DNA.
7. The DNA is visualised by adding EtBr, a fluorescent molecule that intercalates between the DNA
bases. To a 0.8% agarose gel, add EtBr to give a concentration of 0.5 µg/ml. UV radiation at 254 nm is
absorbed by the DNA and transmitted to the bound dye. The energy is re-emitted at 590 nm in the
red-orange region of the spectrum.
8. EtBr is a powerful mutagen. The dye is usually incorporated into the gel; alternatively, the gel is
stained after running by soaking it in a solution of EtBr.
9. The usual sensitivity of detection is 0.1 µg of DNA.
10. The gel is run along with a molecular weight marker, a wide range of which is commercially
available.
Protocol
1. Prepare a 1% agarose gel in Tris-acetate EDTA buffer (1X TAE) containing EtBr.
2. To 1 g of agarose, add 100 ml of 1X TAE. Heat until dissolved. Cool the gel to 50 °C and add
EtBr (0.5 µg/ml) before pouring into the gel apparatus.
3. Wash the gel casting tray and comb with water to remove dirt.
4. Place the apparatus on a level surface, check with a spirit level and adjust the level.
5. Choose an appropriate comb (commonly 12 slots) and fix it into position.
6. Pour the gel into the casting tray and allow it to cool and set.
7. After the gel has set firmly, pour a small amount of buffer over it and remove the comb gently. Take
care not to drag the comb and break the gel.
8. Immerse the gel slowly into the gel tank. Add a sufficient amount of 1X TAE buffer. Connect the
electrodes and check the current.
9. Note: Always check the electrical connections before loading the sample.
10. Load the samples into the wells carefully.
11. Always load an aliquot of a standard molecular weight marker along with the samples. It helps
in assessing the size of a DNA fragment by comparison of electrophoretic mobility.
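The quantities in step 2 of the protocol scale directly with gel volume and agarose percentage. The sketch below shows the arithmetic; the function names are hypothetical, the working EtBr concentration is assumed to be 0.5 µg/ml, and the 10 mg/ml EtBr stock is an assumption because the manual does not state a stock concentration.

```python
def agarose_grams(percent_w_v, buffer_volume_ml):
    """Grams of agarose for a gel of the given % (w/v) in the given buffer volume."""
    return percent_w_v / 100.0 * buffer_volume_ml


def etbr_stock_ul(gel_volume_ml, final_ug_per_ml=0.5, stock_mg_per_ml=10.0):
    """µl of EtBr stock for the gel (stock concentration is an assumed value;
    1 mg/ml equals 1 µg/µl)."""
    return gel_volume_ml * final_ug_per_ml / stock_mg_per_ml


# 1% gel in 100 ml of 1X TAE, as in step 2
print(agarose_grams(1.0, 100), "g agarose")
print(etbr_stock_ul(100), "µl EtBr stock")

# 1.8% gel in 50 ml, as used for colony PCR products
print(agarose_grams(1.8, 50), "g agarose")
```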

Denaturing Polyacrylamide Gel Electrophoresis (PAGE) for nucleic acids


Introduction
Polyacrylamide gels are chemically cross-linked gels formed by the polymerization of
acrylamide with a cross-linking agent, usually N,N'-methylenebisacrylamide (Bis).
Polymerization proceeds by free radical formation, usually with ammonium persulfate as
the initiator and N,N,N',N'-tetramethylethylenediamine (TEMED) as the catalyst. The length of the
chains is determined by the concentration of acrylamide in the polymerization reaction. One molecule
of cross-linker is included for every 29 monomers of acrylamide. Denaturing gels are polymerized in the
presence of an agent (urea or, less frequently, formamide) that suppresses base pairing in nucleic acids.
Denatured DNA migrates through these gels at a rate that is almost completely independent of its
base composition and sequence. The gels are capable of resolving short single-stranded fragments of
DNA or RNA that differ in length by as little as one nucleotide. Such gels are uniquely suited for
nucleic acid sequence analysis and for the resolution required, for instance, in fingerprinting protocols.
In the early days, the gels used for DNA sequencing were thick by today's standards. The method
described here was used to cast gels in the Bio-Rad unit (Sequi-Gen GT Sequencing Cell) and may be used
with appropriate modifications for other systems.
Materials Required:
Buffers and solutions
Acrylamide solution (45% w/v)
Acrylamide : 434 g
N,N'-methylenebisacrylamide : 16 g
H2O : to 600 ml
Heat the solution to 37 °C to dissolve the chemicals. Adjust the volume to 1 liter with distilled H2O.
Filter the solution through a nitrocellulose filter and store in dark bottles at room temperature.
Ammonium persulfate (1.6% w/v) in H2O
KOH/Methanol solution: 5 g of KOH pellets in 100 ml methanol. Store the solution at room
temperature in a tightly capped bottle.
Repel Silane: contains dichlorodimethylsilane, e.g., Sigmacote (from Sigma) or Repelcote (from BDH)
(500 µl dimethyldichlorosilane mixed in 10 ml chloroform)
Bind Silane: 1 µl of γ-methacryloxypropyltrimethoxysilane mixed in 497.5 ml ethanol and 2.5 ml of
0.5 % acetic acid
10X TBE electrophoresis buffer (1000 ml)
108 g of Tris base
55 g of boric acid
40 ml of 0.5 M EDTA (pH 8.0)
EDTA (0.5 M, pH 8.0): Add 186.1 g of disodium EDTA.2H2O to 800 ml of H2O. Stir vigorously
on a magnetic stirrer. Adjust the pH to 8.0 with NaOH (~20 g of NaOH pellets), dispense into aliquots and
sterilize by autoclaving (Note: The EDTA will not dissolve until the pH of the solution is adjusted to 8.0 by
adding NaOH).
TEMED (N,N,N',N'-tetramethylethylenediamine): commercially available; must be stored in
tightly sealed bottles at 4 °C.
Gel loading dye: 95 % formamide, 10 mM EDTA (pH 8), 0.09 % xylene cyanol FF and 0.09 %
bromophenol blue.
Urea (solid) and water
Other requirements: Gel casting assembly (including glass plates), gloves (talc free), syringes,
water bath (at 55 °C), petroleum jelly
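The weights in the stock recipes above follow from mass = molarity x volume x molar mass. A minimal sketch is given below; the function names are hypothetical, and the molar mass of disodium EDTA dihydrate (372.24 g/mol) is a standard value that is not stated in the manual. The second helper covers the 10X to 1X dilution of TBE used for the running buffer.

```python
EDTA_NA2_2H2O_G_PER_MOL = 372.24  # disodium EDTA dihydrate (standard value)


def grams_needed(molarity_mol_per_l, volume_l, molar_mass_g_per_mol):
    """Mass of solute for a solution of the given molarity and volume."""
    return molarity_mol_per_l * volume_l * molar_mass_g_per_mol


def concentrate_volume_ml(final_volume_ml, stock_x=10.0, final_x=1.0):
    """Volume of an 'X' concentrate (e.g. 10X TBE) to dilute to final_volume_ml."""
    return final_volume_ml * final_x / stock_x


# 1 litre of 0.5 M EDTA: about 186.1 g, as in the recipe
print(round(grams_needed(0.5, 1.0, EDTA_NA2_2H2O_G_PER_MOL), 1), "g EDTA")

# 1 litre of 1X TBE from the 10X stock
print(concentrate_volume_ml(1000), "ml of 10X TBE")
```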
Procedure:
Gel casting (Sequi-Gen GT Sequencing Cell):
1. Wash the plates and spacers in warm dilute dishwashing liquid and rinse thoroughly in tap water,
followed by distilled water. Rinse the plates with absolute ethanol and allow to dry. Plates
must be cleaned meticulously.
2. Treat the inner core plate (the smaller, notched plate) with silanizing solution (repel silane),
preferably in a fume hood: wipe the solution evenly over the entire surface of the glass plate
using Kimwipes and allow it to air dry for 1-2 min.
Rinse the plate with deionized water, then with ethanol, and allow the plate to dry.
3. In the same way, coat the outer glass plate with bind silane, which helps to fix the gel
firmly to the plate during the staining process.
4. Place the spacers in position over the inner core plate and place the outer glass plate over them.
5. Hold the plates in a vertical position and slide the lever-operated clamps over both sides of
the sandwich.
6. Insert the sandwich assembly into the cam-operated precision caster base. Lay the assembly flat on
the lab bench. Now prepare the gel solution.
Preparation of Acrylamide solution:
1. In a 250 ml conical flask, prepare the acrylamide solution as per the table.
2. Combine all the reagents and then heat the solution in a 55 °C water bath for 3 min
to help dissolve the urea.
3. Filter the solution and make it up to 100 ml with water.
4. Remove the solution from the water bath and allow it to cool to room temperature for 15 min;
swirl the mixture from time to time.
Table: Acrylamide solutions for denaturing gels
                                4% Gel    6% Gel    8% Gel    10% Gel
Acrylamide:bis solution (45%)   8.9 ml    13.3 ml   17.8 ml   22.2 ml
10X TBE buffer                  10 ml     10 ml     10 ml     10 ml
H2O                             45.8 ml   41.4 ml   36.9 ml   32.5 ml
Urea                            42 g      42 g      42 g      42 g
5. Transfer the solution to a 250 ml glass beaker, add 3.3 ml of freshly prepared 1.6%
ammonium persulfate and swirl the gel solution gently to mix the reagents.
6. Add 50 µl of TEMED to the gel solution and swirl the solution gently to mix.
7. Work quickly from this point: carefully draw the solution into a syringe.
8. Pump the solution slowly from the syringe into the plate assembly through the injection
port provided at the bottom of the casting assembly.
9. Place the flat side of the sharktooth comb in position ~0.5 cm into the gel solution.
10. Allow the gel to polymerise; it is ready for running after 1 hour.
11. Remove the sharktooth comb and reinsert it with the teeth side just entering the gel to
form the wells between the teeth.
12. Shift the plate assembly to the gel running compartment and fill the upper and lower
tank units with 1X TBE buffer.
13. To maintain the DNA in a denatured condition during the gel run, attach a temperature probe
to the plate and connect it to the power pack unit (PowerPac 3000, Bio-Rad,
USA).
14. Heat the gel to 50 °C by pre-electrophoresis for half an hour, programmed at a constant
temperature of 50 °C with the variable parameter limits set at 1000 V / 300 mA.
15. Prepare the samples by mixing 3.5 µl of sample with 2 µl of loading dye, heat
at 95 °C for 5 min and place immediately on ice.
16. Flush the wells using a syringe and load the samples, starting from one end, into the wells formed
between the shark teeth of the comb.
17. Carry out electrophoresis at a constant temperature of 50 °C with maximum limits of
1500 V and 300 mA in the power pack settings.
18. Terminate the electrophoresis after the xylene cyanol dye has reached 2/3 of the gel length.
Separate the glass plates and transfer the gel-bound glass plate to the staining
tray.
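The acrylamide:bis volumes in the table of gel solutions follow from C1 x V1 = C2 x V2 with the 45% stock and a final volume of 100 ml. The sketch below reproduces the table rows; the function names are hypothetical, and the 35.3 ml allowance for the volume taken up by 42 g of urea is not stated in the manual but is inferred here from the water volumes in the table.

```python
def stock_volume_ml(gel_percent, final_volume_ml=100.0, stock_percent=45.0):
    """ml of 45% acrylamide:bis stock for a gel of the given percentage."""
    return gel_percent * final_volume_ml / stock_percent


def water_volume_ml(gel_percent, final_volume_ml=100.0, tbe_10x_ml=10.0,
                    urea_displacement_ml=35.3):
    """ml of water to add; urea_displacement_ml is an inferred value
    (volume occupied by 42 g of dissolved urea), not a figure from the manual."""
    return (final_volume_ml - stock_volume_ml(gel_percent, final_volume_ml)
            - tbe_10x_ml - urea_displacement_ml)


for pct in (4, 6, 8, 10):
    print(f"{pct:>2}% gel: {stock_volume_ml(pct):5.1f} ml stock, "
          f"{water_volume_ml(pct):5.1f} ml water")
```

For gel percentages not listed in the table (for example 5%), the same functions give the corresponding volumes under the same assumptions.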


Silver staining of DNA Polyacrylamide gels


Introduction
As a method, silver staining was originally developed to detect proteins separated by PAGE.
It was further optimized and applied to visualize other biological molecules, for example, nucleic
acids, lipopolysaccharides, glycoproteins and polysaccharides. These earlier protocols were, however,
comparatively tedious and offered limited sensitivity. The development of DNA amplification
fingerprinting (DAF) by Caetano-Anolles et al. (1991) required a superior protocol to adequately
resolve and visualize complex DNA profiles. These requirements led directly to the codevelopment of
a successful combination of polyester-backed PAGE gels and DNA silver staining. The silver stain
protocol developed for DAF was described separately by Bassam et al. (1991) and has since gained
wide acceptance including commercialization (e.g., in the GenePrint STR systems and SILVER
SEQUENCE products from Promega Corporation, USA). Silver staining of DNA (and other
biological samples) has several advantages:
1. Image development and visualization is done under normal ambient light. Thus, the
procedure can be performed entirely at the laboratory bench without the need for darkroom or
UV illumination facilities.
2. The image is resolved with the best possible sensitivity and detail, because silver is deposited
directly on the molecules within the transparent gel matrix. Thus visualization is from the
primary source and does not suffer any degradation or blurring that can accompany secondary
imaging devices which involve fluorescence, autoradiography, focusing lenses, film
development or digital image processing.
3. Silver staining offers similar sensitivity to autoradiography, but avoids radioactive handling,
delays from development times and waste disposal issues.
4. As a preferred option, gels can even be dried onto a semi-rigid plastic backing film such as
GelBond PAG film, creating a permanent record of the original material. Air-dried gels are
resilient, preserving a concentrated and contrast-intensified image. They can also be stored
indefinitely without distortion, obviating the need and added expense of photography and
printing. In addition, the preserved gel is a molecular archive, as stained DNA bands are
real DNA that can be extracted, amplified, cloned and DNA-sequenced.
The protocol described here was developed by Bassam and Gresshoff (2007).
Chemicals required:
Fixer solution: Dilute glacial CH3COOH to 7.5% (vol/vol) with deionized water. Store at room
temperature (18-25 °C). Fixer solution is stable and can be made up in bulk. CAUTION: Solution is
slightly corrosive (household vinegar is commonly 5% CH3COOH). Avoid inhaling the vapor.
Formaldehyde solution (HCHO): Add 15 ml formaldehyde to 85 ml deionized water.
CRITICAL: This solution must be made up fresh as required. Ensure formaldehyde is stored at room
temperature, since cold storage causes inactivation. Before preparing, estimate the volume needed for
the number and size of gels that are to be stained.
Silver solution: Dissolve 0.1 g AgNO3 in 100 ml deionized water. CRITICAL: This solution must
be made up fresh as required.
Sodium thiosulphate stock solution (Na2S2O3): Dissolve 0.2 g sodium thiosulphate in 50 ml
water to make a stock solution. CRITICAL: The stock must be prepared fresh weekly, hence keep the
volume as small as possible to avoid wastage.
Developer solution: Dissolve 3 g Na2CO3 in 100 ml deionized water to make the developer
solution. To speed dissolving and avoid clumping, swirl the water vigorously and add the Na2CO3
gradually. CRITICAL: This solution must be made up fresh as required and used at ~8 °C. This is
most conveniently done by swirling the solution on an ice bath and monitoring the temperature just
before use. To raise the temperature, should it get too cold, swirl the flask under a hot water tap.
Developer stop solution: Dilute glacial CH3COOH to 7.5% (vol/vol) with deionized water. Store
refrigerated at 4 °C. Solution is stable and can be made up in bulk. CAUTION: Solution is slightly
corrosive. Avoid inhaling the vapor.

Note: Before preparing stock solutions, estimate the volume needed for the number and size of gels
that are to be stained.
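One way to estimate these volumes is from the tray dimensions used in the procedure (a tray ~2 cm larger than the gel on all sides, filled to a depth of ~5 mm). The sketch below is only an approximation under those two figures: it ignores the volume displaced by the gel and glass plate, and the function names and the 10 x 8 cm example gel are hypothetical.

```python
def tray_solution_ml(gel_length_cm, gel_width_cm, margin_cm=2.0, depth_mm=5.0):
    """Approximate solution volume (ml) to cover the gel in its staining tray.

    Tray footprint = gel size plus a margin on all sides; 1 cm^3 equals 1 ml.
    """
    tray_length = gel_length_cm + 2 * margin_cm
    tray_width = gel_width_cm + 2 * margin_cm
    return tray_length * tray_width * (depth_mm / 10.0)


def silver_nitrate_g(volume_ml, grams_per_100_ml=0.1):
    """AgNO3 needed for the silver solution (0.1 g per 100 ml)."""
    return grams_per_100_ml * volume_ml / 100.0


# Hypothetical 10 cm x 8 cm mini-gel
vol = tray_solution_ml(10, 8)
print(f"{vol:.0f} ml per solution, {silver_nitrate_g(vol):.3f} g AgNO3")
```

Multiplying the per-gel volume by the number of gels gives the amount of each fresh solution to prepare.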

Equipment Setup: Platform rocker: A simple and gentle rocking motion (once every 2-3 s) gives
the best results. Orbital motion is not recommended as it does not distribute reagent evenly across the
gel surface, seemingly because reagent swirls around the perimeter of the gel leaving the center
region relatively stagnant.

Procedure
Nucleic acid fixation
1. Choose a clean plastic staining tray that is larger than the gel by ~2 cm on all sides. Pour
sufficient fixer solution into the tray to cover the gel to a depth of ~5 mm.
2. Disassemble the PAGE rig carefully, and place the gel into the staining tray. If you are using
a polyester-backed gel, place it such that the gel side faces up in the tray.
3. Rock the staining tray continuously on a platform rocker. For typical mini-gels of ~1 mm
thickness, a minimum of 5 min fixation is required, but 10 min provides optimum contrast.
Longer times may be needed if thicker gels are used. This step may continue for up to ~30
min. CRITICAL STEP: Fixation is important for stain sensitivity. Its main function is to
immobilize the DNA molecules in the acrylamide gel matrix to avoid diffusion and
subsequent image blurring. It also removes and neutralizes unwanted chemicals such as urea
and buffer, which can interfere with staining.
Prepare fresh solutions
4. While the gel is fixing, prepare sufficient developer solution (as described in REAGENT
SETUP) to cover the gel in the staining tray to a depth of ~5 mm. CRITICAL STEP: This and
the solutions made up in the following steps are best prepared at this point in the protocol to
ensure freshness and for optimal time management. (The sodium thiosulphate stock and
developer stop solutions should already be pre-prepared and ready to use at this point.)
5. Add sodium thiosulphate stock solution (prepared as described in REAGENT SETUP) at the
rate of 50 µl per 100 ml to the developer solution.
6. Cool the developer solution by putting it into a 4 °C refrigerator.
7. Prepare sufficient formaldehyde solution (as described in REAGENT SETUP) to cover the
gel in the staining tray to a depth of ~5 mm.
8. Prepare sufficient silver solution (as described in REAGENT SETUP) to cover the gel in the
staining tray to a depth of ~5 mm.
Gel washing
9. Following fixation, carefully decant the solution, taking care not to damage the gel or touch
the gel surface.
10. To wash the gel, pour sufficient deionized water into the staining tray to cover the gel to a
depth of ~5 mm.
11. Rock the staining tray continuously on a platform rocker for 2 min. Longer times may be
needed if gels thicker than ~1 mm are used. If the gel is washed for too long (over ~20 min),
then staining may be compromised, and fainter bands will result.
12. At the end of the wash, carefully decant the wash solution as described in Step 9.
13. Repeat the wash steps two times more for a total of three washes in deionized water.
CRITICAL STEP: Washing the gel is important. It removes acid and other trace substances
that interfere with staining, and provides a clear, blemish-free background to the final stain.
Formaldehyde pre-treatment
14. Add sufficient formaldehyde solution to cover the gel in the staining tray to a depth of ~5
mm. Gently rock the staining tray continuously on a platform rocker. For typical mini-gels of
~1 mm thickness, a minimum of 5 min formaldehyde pre-treatment is required while ~10 min
provides optimum contrast. Longer times may be needed if thicker gels are used. This step
may continue for up to ~30 min.
CRITICAL STEP: Formaldehyde pre-treatment is important for stain sensitivity and
maximum image contrast.
15. Following the formaldehyde pre-treatment, carefully decant the solution, taking care not to
damage the gel or touch the gel surface.
Silver impregnation
16. Add sufficient silver solution to cover the gel in the staining tray to a depth of ~5 mm.
17. Gently rock the staining tray continuously on a platform rocker. For typical mini-gels of ~1
mm thickness, a 20 min impregnation time is usually optimal. CRITICAL STEP: The
recommended silver concentration cannot be reduced without affecting sensitivity and
contrast. A careful examination of silver impregnation times showed that optimal staining
was achieved after ~20 min. However, as little as 10 min is sufficient for high-quality
staining without significant loss of sensitivity. Impregnation times can be increased up to ~60
min, but greater than ~90 min can cause severe image loss.
18. Following silver impregnation, carefully decant the solution, taking care not to damage the
gel or touch the gel surface. CAUTION: The silver solution is toxic and should be disposed
of with care. Avoid spilling the solution, as it will permanently stain most surfaces.
19. Briefly rinse residual silver solution from the surface of the gel with ~100 ml of
deionized water for 5-10 s. Do not rinse the gel longer than ~15 s, as this step removes silver
from the gel.
Image development
20. Check whether the developer is cold (it should be between 4 and 10 °C). Add sufficient
developer solution to cover the gel in the staining tray to a depth of ~5 mm. Agitate the
staining tray throughout image development so the developer solution is not stagnant. Image
development begins as soon as the developer solution is added. The developer solution is kept
cold to control the rate of image development, since development is usually too fast to control
if done at temperatures above 10 °C. Image development typically takes about 3 min
depending on gel thickness, the reagents used and the temperature of the reagents.
CRITICAL STEP: Decreasing Na₂CO₃ concentration below the recommended levels causes
higher background staining and poor image contrast. Poor staining can also result from the
use of low quality or old (stale) reagents.
Stopping the reaction
21. Decant the developer solution carefully, avoiding damage to the gel or touching the gel
surface.
22. Check whether the developer stop solution is cold (it should be stored refrigerated at 4 °C).
Add sufficient developer stop solution to cover the gel in the staining tray to a depth of ~5
mm. As an alternative, developer stop solution kept at room temperature can be used for thin
gels (<1 mm in thickness). However, this alternative requires some practice as the image will
continue to develop for several seconds after the developer stop solution is added.
23. Allow the gel to sit in developer stop solution for 5-10 min. CRITICAL STEP: The
developer stop solution contains 7.5% CH₃COOH. Higher CH₃COOH concentrations can
cause image fading, and should be avoided. Since development occurs quickly, it is best to
stop the reaction as abruptly as possible to avoid accidental overdevelopment. For this reason,
the developer stop solution is chilled to 4 °C so that it acts as quickly as possible: the low
temperature slows the kinetics of development and allows time for the acid to take effect.
24. Decant the developer stop solution and rinse the gel with deionized water.
25. If desired, photograph the gel. Dried gels are robust and can be safely handled. If properly
stained, the image will not fade or darken and provides a permanent record of the experiment.
References:
Bassam, B.J. & Bentley, S. (1995) Electrophoresis of polyester-backed polyacrylamide gels.
Biotechniques 19, 568-573.
Bassam, B.J. & Gresshoff, P.M. (2007) Silver staining DNA in polyacrylamide gels. Nature
Protocols 2(11), 2649-2654.
Bassam, B.J., Caetano-Anolles, G. & Gresshoff, P.M. (1991) Fast and sensitive silver staining of
DNA in polyacrylamide gels. Anal. Biochem. 196, 80-83.
Sambrook, J. & Russell, D.W. (2001) Molecular Cloning: A Laboratory Manual, 3rd Edition. Cold Spring
Harbor, NY: Cold Spring Harbor Laboratory Press.


NBS Profiling
Ben Vosman
Plant Research International, Wageningen, Netherlands
Introduction
NBS profiling is a technique for DNA fingerprinting and expression profiling of R-genes based on
conserved motifs in the nucleotide binding domain of resistance genes in plants.
The technique involves three steps:
1. Restriction enzyme digest of (c)DNA and the ligation of adapters
2. Selective amplification of fragments using a (degenerate) primer for the conserved
domains.
3. Gel analysis of the amplified fragments
Depending on the motif and primer, 30-150 fragments can be amplified in a single PCR reaction, of
which up to 95% contain the targeted motif. Polymorphisms arise from variation in the region of the
conserved domain (including presence or absence of genes), mutations in the restriction sites used and
indels in the sequence between the motif-specific primer annealing site and the restriction site.
Any changes made to this protocol (including use of polymerase, minute changes in the primers,
RL mixes, PCR cycling conditions, use of PCR machines) may affect the NBS profile produced
and should be done with extreme caution and the appropriate controls.
Starting Material
1.1 Quality check and estimation of DNA Yield.
If one starts with similar amounts of plant material and uses the same procedure, the yield will also be
similar. Dissolve DNA to an expected concentration of 200 ng/µl in TE. Load approximately 50 ng
on an agarose gel. Using a dilution series of known quantity (add RNase to the loading buffer), estimate
the DNA concentration and dilute the DNA to a final concentration of 50 ng/µl. The quality of the
DNA is one of the most important determinants of the quality of the NBS profile. The highest grade
of DNA quality should be pursued.
2.1 Restriction Digestion and Adaptor Ligation
In this step the DNA is digested with a restriction enzyme with a four-base recognition site and
blocked adapters are ligated to the ends. The blocked adapters consist of a long oligo with a sequence
similar to the adapter primer and a short oligo that is blocked by an amino group at the 3' end. The
amino group blocks elongation by Taq polymerase. At the start of the PCR the adapter primer cannot
anneal; only when a domain-specific primer anneals and is elongated is the annealing site for the
adapter primer generated. This prevents the amplification of adapter-adapter fragments.
Adapter sequences:
5'-ACTCGATTCTCAACCCGAAAGTATAGATCCCA-3' (long arm)
5'-P-TGGGATCTATACTT-3'-NH2 (short arm)
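
As a quick sanity check on the two adapter oligos listed above, the short arm should be the reverse complement of the 3' end of the long arm, so that the two anneal into a duplex with a single-stranded 5' tail on the long arm. The snippet below is a minimal sketch of that check; the variable and function names are illustrative and not part of the protocol.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

# Sequences as listed above, written 5' to 3' without spaces
LONG_ARM = "ACTCGATTCTCAACCCGAAAGTATAGATCCCA"
SHORT_ARM = "TGGGATCTATACTT"

# The short arm should pair with the 3' end of the long arm
short_pairs_with_long = LONG_ARM.endswith(reverse_complement(SHORT_ARM))
# Bases of the long arm left single-stranded after annealing
overhang_nt = len(LONG_ARM) - len(SHORT_ARM)

print("short arm pairs with 3' end of long arm:", short_pairs_with_long)
print("single-stranded 5' tail on long arm:", overhang_nt, "nt")
```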
Prepare a mix of reagents shown in blue (always prepare approximately 10% more than needed). It is
best to pipet reagents in the order listed and to mix the solution before the enzymes are added and
after all components are added.
Components                                                      µl per reaction
5x RL+ (AFLP buffer)                                            12
Adapter (adapted to restriction enzyme)                         3
H₂O                                                             29
ATP 10 mM                                                       6
Restriction enzyme (10 Units/µl)                                1
Ligase (1 Unit/µl) (for blunt enzymes: use high
concentrate ligase)                                             1
DNA                                                             4
Incubate for 3 hours at 37 °C (preferably in a PCR block), inactivate the enzymes for 15 minutes at 65 °C
and store at 4 °C or at -20 °C.
Add 60 µl H₂O to the restriction-ligation mixture!
Note: experiments have shown that varying the amount of DNA between 50 and 500 ng does not
result in different patterns; we recommend using 200 ng of DNA. Other experiments have shown that
dilution of the restriction-ligation mixture is critical.
3. FIRST AMPLIFICATION ROUND
3.1. PCR methods
In this step the domain specific primer is annealed and elongated by the Taq polymerase resulting in
an annealing site for the adaptor primer not present previously. This in combination with the hot start
Taq ensures high specificity. Although degenerate primers have a reputation of giving variable
results, experiments show high reproducibility of this PCR (comparable to AFLP).
Prepare a mix of reagents shown in blue (always prepare approximately 10% more than needed). It is
best to pipet reagents in the order listed and to mix the solution before the enzymes are added and
after all components are added.
Adapter primer sequence:
5'-GTTTACTCGATTCTCAACCCGAAAG-3'
Components                              µl per reaction
PCR buffer (with 15 mM MgCl₂)           2.5
dNTP mix (5 mM)                         1
HotStarTaq polymerase (Qiagen)          0.08
NBS-specific primer (10 pmol/µl)        2
Adapter primer (10 pmol/µl)             2
H₂O                                     12.42
Template                                5
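
The instruction to prepare about 10% more mix than needed can be turned into a small calculation. The sketch below scales the per-reaction volumes from the table above; the reaction count of 24, the dictionary keys and the function name are illustrative assumptions. The template is left out of the mix because it is added to each tube separately.

```python
# Per-reaction volumes (ul) from the first-round PCR table, template excluded
FIRST_PCR_UL = {
    "PCR buffer (with 15 mM MgCl2)": 2.5,
    "dNTP mix (5 mM)": 1.0,
    "HotStarTaq polymerase": 0.08,
    "NBS-specific primer (10 pmol/ul)": 2.0,
    "Adapter primer (10 pmol/ul)": 2.0,
    "H2O": 12.42,
}
TEMPLATE_UL = 5.0

def master_mix(per_reaction_ul, n_reactions, excess=0.10):
    """Scale per-reaction volumes to n_reactions plus a pipetting excess."""
    factor = n_reactions * (1.0 + excess)
    return {name: round(vol * factor, 2) for name, vol in per_reaction_ul.items()}

aliquot_ul = sum(FIRST_PCR_UL.values())          # mix dispensed per tube
mix = master_mix(FIRST_PCR_UL, n_reactions=24)   # 24 is an arbitrary example

for name, vol in mix.items():
    print(f"{name:36s}{vol:8.2f} ul")
print(f"Dispense {aliquot_ul:.2f} ul mix + {TEMPLATE_UL:.1f} ul template per tube")
```

The same function can be reused for the other reagent tables in this chapter by swapping in their per-reaction volumes.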
PCR conditions:
15 min 95 °C
30 sec 95 °C, 1 min 40 sec at 55-60 °C (depending on motif-specific primer, see below),
2 min 72 °C; 30-35 cycles
20 min 72 °C
Hold at 4 °C
Note: HotStarTaq polymerase becomes active only after an incubation step of 15 minutes at 95 °C,
which prevents non-specific amplification during pipetting.
Annealing temperatures of the common primers:
NBS2, NBS3: 60 °C
NBS1, NBS5, NBS9: 55 °C
3.2 VERIFICATION OF PCR AMPLIFICATION
Load 15 µl of the PCR product on a 1% agarose gel, which should result in a smear with several distinct
fragments in the size range of 100-1000 bp. Patterns vary with the primer used and the DNA source.
3.3 DILUTION OF THE AMPLIFIED FRAGMENTS
Add 90 µl of H₂O to the remainder of the sample.
4. AMPLIFICATION WITH LABELLED PRIMER
4.1 PRIMER LABELLING
Prepare a mix of reagents shown in blue (always prepare approximately 10% more than needed). It is
best to pipet reagents in the order listed and to mix the solution before the enzymes are added and
after all components are added.
Components                               µl per reaction
T4-forward buffer 5x                     0.1
Distilled water                          0.19
Domain-specific primer (10 pmol/µl)      0.1
T4-polynucleotide kinase (10 U/µl)       0.01
[γ-33P]ATP                               0.1
Incubate the mixture in a 37 °C water bath for 1 to 16 hours.
Optional: Inactivate the kinase by heating the reaction mixture to 70 °C for 10 minutes.
4.2 PCR METHODS
Prepare a mix of reagents shown in blue (always prepare approximately 10% more than needed). It is
best to pipet reagents in the order listed and to mix the solution before the enzymes are added and
after all components are added.
Components                           µl per reaction
PCR buffer                           2
dNTP mix (5 mM)                      0.8
Taq polymerase                       0.08
Labelled motif-specific primer       0.5
Adaptor primer (10 pmol/µl)          0.2
H₂O                                  11.42
Diluted mixture first PCR            5
PCR conditions (Perkin Elmer GeneAmp PCR System 9600):
30 sec 95 °C, 1 min 40 sec 55-60 °C (depending on motif-specific primer), 2 min 72 °C; 30-35 cycles
20 min 72 °C
Hold at 4 °C
5. POLYACRYLAMIDE GEL ELECTROPHORESIS
5.1. SAMPLE PREPARATION
1. Add an equal volume of loading buffer (98% formamide, 10 mM EDTA pH 8.0, bromophenol
blue and xylene cyanol), incubate samples for 3 min at 95 °C and cool samples on ice before
loading on the PAA gel.
2. DNA samples are analyzed on a 6% polyacrylamide gel (SequaGel-6, Ready-To-Use 6%
Sequencing Gel Solution, National Diagnostics).
3. Electrophoresis unit: Bio-Rad Sequi-Gen II (38 x 50 cm).
4. Load the gel in a fixed order and use a tracking system for the samples in order to
prevent loading errors. Empty wells in the microtiter plate scheme provide a check for
correct loading.
5. Run the samples generally for at least 3 hours (depends on size of DNA fragments) at 110 W.
6. Fix gel on Whatman 3MM paper and dry.
7. Cover dry gel with film (Kodak X-OMAT AR, 35 x 43 cm) and store gel and film in a light-tight
cassette. Length of exposure time depends on the amount of radioactivity of the image.
8. Develop film.
STOCKS AND SOLUTIONS:
5x RL+ buffer: 50 mM Tris-HAc pH 7.5
               50 mM MgAc
               250 mM KAc
               25 mM DTT
               250 ng/µl BSA
TE buffer:     1 ml 1 M Tris-HCl (pH 8.0)
               20 µl 0.5 M EDTA (pH 8.0)
               Add MilliQ H₂O up to 100 ml.
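
The final concentrations implied by the TE recipe follow from C1 x V1 = C2 x V2. The sketch below works them out, assuming the EDTA volume in the recipe is 20 µl (the micro sign is missing in the printed text); the function name is illustrative.

```python
def final_conc_mM(stock_molar, stock_vol_ml, final_vol_ml):
    """Final concentration in mM after diluting a stock (C1*V1 = C2*V2)."""
    return stock_molar * stock_vol_ml / final_vol_ml * 1000.0

tris_mM = final_conc_mM(1.0, 1.0, 100.0)     # 1 ml of 1 M Tris-HCl in 100 ml
edta_mM = final_conc_mM(0.5, 0.020, 100.0)   # 20 ul (0.020 ml) of 0.5 M EDTA in 100 ml

print(f"Tris-HCl: {tris_mM:.1f} mM, EDTA: {edta_mM:.2f} mM")
```

Under that assumption the recipe gives a low-EDTA TE; if 1 mM EDTA is wanted instead, the same formula shows that 200 µl of the 0.5 M stock would be required.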
Blunt adapter:
5'-ACTCGATTCTCAACCCGAAAGTATAGATCCCA-3'
              3'-NH2-TTCATATCTAGGGT-P-5'
Adapter synthesis:
Add 1.25 nmol each of top and bottom strand and adjust volume to 75 µl.
Incubate mixture for 3 min at 90 °C and cool down slowly to room temperature.

[Figure: from Brugmans et al., 2008]

References:
Van der Linden CG, D.C.A.E. Wouters, Virag Mihalka, Elena Z. Kochieva, M.J.M. Smulders & B.
Vosman (2004) Efficient targeting of plant disease resistance loci using NBS profiling.
Theor. Appl. Genet. 109:384-393.
Van der Linden CG, Smulders MJM & Vosman B. (2005) Motif-directed profiling: a glance at
molecular evolution. In: Plant species-level systematics: new perspectives on pattern &
process. F.T. Bakker, L.W. Chatrou, B. Gravendeel & P.B. Pelser (eds.) Regnum Vegetabile
volume 143. Koeltz, Koenigstein/Germany. Pages 291-303.
Calenge F, van der Linden CG, van de Weg E, Schouten HJ, van Arkel G, Denancé C, Durel CE
(2005) Resistance gene analogues identified through the NBS-profiling method map close to
major genes and QTL for disease resistance in apple. Theor. Appl. Genet. 110:660-668.
Brugmans B, Wouters D, van Os H, Hutten R, van der Linden G, Visser R, van Eck H, van der
Vossen E (2008) Genetic mapping and transcription analyses of resistance gene loci in potato
using NBS profiling. Theor. Appl. Genet. 117:1379-1388.
Wang, M., RG Van den Berg, GC Van der Linden & B. Vosman (2008) The utility of NBS profiling
for plant systematics: a first study in tuber-bearing Solanum species. Plant Systematics and
Evolution 276:137-148.
Jacobs, M.M.J., B. Vosman, V.G.A.A. Vleeshouwers, R.G.F. Visser, B. Henken & R.G. Van den
Berg (2010) A novel approach to locate Phytophthora infestans resistance genes on the potato
genetic map. Theor. Appl. Genet. 120:785-796.

EcoTILLING
K. Johnson George and I.P. Vijesh Kumar
Indian Institute of Spices Research, Marikunnu P.O., Calicut- 673012.

TILLING and EcoTILLING applications were originally designed to be used on the LICOR DNA
Analyzer, but have moved to numerous other platforms which do not require the use of dye labeled
primers. The benefits are an inexpensive platform for reverse genetics and rapid SNP discovery,
which still allows pooling of samples to increase throughput and reduce discovery bias.

EcoTILLING studies are primarily concerned with identifying informative SNPs for population
genetics, forensics, conservation and resource management work. Inclusion of too few individuals in
the discovery panel can introduce ascertainment bias. EcoTILLING allows hundreds or thousands of
individuals to be included in the discovery panel by pooling, which reduces ascertainment bias, and
allows for the discovery of the most informative SNPs for the study at hand.

During the training, we will be using Transgenomic SURVEYOR Mutation Detection Kits for
TILLING. The kit includes a mismatch-specific DNA endonuclease to scan for known and unknown
mutations and polymorphisms in heteroduplex DNA. SURVEYOR Nuclease, the key component of
the kits, is an endonuclease that cleaves DNA with high specificity at sites of base-substitution
mismatch and other distortions. The SURVEYOR Mutation Detection Kit for Standard Gel
Electrophoresis has been designed to cleave unlabeled DNA fragments at mismatched sites for
subsequent analysis by agarose gel electrophoresis or polyacrylamide gel electrophoresis (PAGE).
DNA 200 to 4,000 bp long can be analyzed using manual agarose gel electrophoresis while smaller
fragments (<1,000 bp) can be analyzed using manual polyacrylamide gel electrophoresis (PAGE).

Kit Components
1) SURVEYOR Nuclease S
2) SURVEYOR Enhancer S
3) 0.15 M MgCl2 Solution, 0.25 mL
4) Stop Solution, 0.25 mL
(Store all components at -20 °C)
DNA samples from black pepper (Piper nigrum) accessions will be used in the experiments.

Step 1. PCR amplify your target fragment.
This step is critical to the success of the SURVEYOR Nuclease digestion.
Ensure the following:
Your PCR yield is sufficiently high (>25 ng/µL).
Your PCR product has low background (preferably a single species of the correct size).
Your PCR product is essentially free of primer-dimer artifacts.
It is imperative that a single PCR product be produced for efficient TILLING.

Pooling several individuals in each PCR
Pooling of samples is advantageous for several reasons:
(1) More potential heteroduplexes may be seen
(2) the number of individuals that can be surveyed at a time is increased
(3) pooling can give an indication of the frequency of the SNP site in various populations prior to
investing time and money in high-throughput genotyping.
How many samples can be pooled?
Up to 5 individual samples (~50 ng/µl), 1 µl each, can be pooled into a PCR reaction.

For a single 25 µL PCR reaction, add:
DNA                50 ng
10X PCR Buffer     2.5 µL
50 mM MgCl2        1.5 µL
10 mM dNTPs        1.3 µL
SharkaTAQ          1.0 µL
H2O                x µL
Total:             25.0 µL
Cycling:
95 °C for 3 minutes
33 cycles of:
95 °C for 30 seconds
57 °C for 30 seconds *
72 °C for 2 minutes **
15 °C soak
* Depends on the primer; ** depends on the expected PCR product.
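
The water volume "x" in the reaction table is simply what is left after the fixed reagents and the template. A minimal sketch of that arithmetic follows. Note that the table lists no primer volumes, so the `primers_ul` parameter is a hypothetical placeholder (default 0) to be set to whatever primer volume is actually used; the other names are illustrative too.

```python
# Fixed reagent volumes (ul) from the 25 ul reaction table above
FIXED_UL = {
    "10X PCR buffer": 2.5,
    "50 mM MgCl2": 1.5,
    "10 mM dNTPs": 1.3,
    "Taq polymerase": 1.0,
}

def water_volume_ul(template_ul, primers_ul=0.0, total_ul=25.0):
    """Water needed to bring the reaction to total_ul."""
    water = total_ul - sum(FIXED_UL.values()) - template_ul - primers_ul
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return round(water, 2)

# Five pooled samples at 1 ul each, versus a single 1 ul template
print("pooled (5 ul template):", water_volume_ul(5.0), "ul water")
print("single (1 ul template):", water_volume_ul(1.0), "ul water")
```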

Step 2. Create heteroduplexes of the PCR products:
1. Mix equal amounts of PCR products in a 0.2-mL tube (if interested in creating pooled PCR
product, individual products also may be used). For efficient annealing, the final volume should
be at least 10 µL. The concentration of samples should be in the range of 25 to 80 ng/µL and
ideally 50 ng/µL. About 200-400 ng of hybridized DNA is recommended for treatment with
SURVEYOR Nuclease S, so each tube should contain at least 200 ng total DNA.
2. Place the tube in a thermocycler and run the following program:
95 °C 10 min
95 °C to 85 °C (-2.0 °C/s)
85 °C 1 min
85 °C to 75 °C (-0.3 °C/s)
75 °C 1 min
75 °C to 65 °C (-0.3 °C/s)
65 °C 1 min
65 °C to 55 °C (-0.3 °C/s)
55 °C 1 min
55 °C to 45 °C (-0.3 °C/s)
45 °C 1 min
45 °C to 35 °C (-0.3 °C/s)
35 °C 1 min
35 °C to 25 °C (-0.3 °C/s)
25 °C 1 min
4 °C Hold.
The product is now ready to be treated with SURVEYOR Nuclease for heteroduplex analysis.
Continue with Step 3, Treatment with SURVEYOR Nuclease.
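
To plan bench time, the heteroduplex program above can be written out as a list of holds and ramps and its nominal duration added up. This is only a sketch: it assumes the thermocycler achieves the programmed ramp rates exactly, ignores the final cooling to 4 °C, and the function and variable names are illustrative.

```python
def heteroduplex_program():
    """Return (kind, start_C, end_C, seconds) steps for the program above."""
    steps = [
        ("hold", 95, 95, 600.0),       # 95 C for 10 min
        ("ramp", 95, 85, 10 / 2.0),    # 10 C drop at 2.0 C/s
        ("hold", 85, 85, 60.0),
    ]
    for start in range(85, 25, -10):   # 85, 75, ..., 35
        end = start - 10
        steps.append(("ramp", start, end, 10 / 0.3))  # 10 C drop at 0.3 C/s
        steps.append(("hold", end, end, 60.0))
    return steps

steps = heteroduplex_program()
total_s = sum(seconds for _, _, _, seconds in steps)
print(f"{len(steps)} steps, about {total_s / 60:.1f} min before the 4 C hold")
```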

Step 3. Cleave heteroduplexes:
1. For each digestion, add the following components in the order shown to a nuclease-free 0.2-mL
tube (kept on ice):
200 to 400 ng (V = 8 to 40 µL) hybridized DNA
V/10 µL 0.15 M MgCl2 Solution
1 µL SURVEYOR Enhancer S
1 µL SURVEYOR Nuclease S
2. Mix by vortexing gently, by agitation or by aspiration/expulsion in a pipette tip using a
micro-pipetter.
3. Incubate at 42 °C for 60 min.
4. Add 1/10th volume of Stop Solution and mix. Store the digestion products at -20 °C if not
analyzed immediately.
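
The volumes in Step 3 all depend on the volume V of hybridized DNA, so they can be derived from the DNA concentration. The sketch below does this under two stated assumptions: the 400 ng target is just one choice within the 200 to 400 ng range, and "1/10th volume of Stop Solution" is read as one tenth of the assembled reaction volume. The function name and dictionary keys are illustrative.

```python
def surveyor_digest(conc_ng_per_ul, target_ng=400.0):
    """Component volumes (ul) for one SURVEYOR digestion."""
    dna_ul = target_ng / conc_ng_per_ul          # V, hybridized DNA
    mgcl2_ul = dna_ul / 10.0                     # 1/10th V of 0.15 M MgCl2
    enhancer_ul = 1.0
    nuclease_ul = 1.0
    reaction_ul = dna_ul + mgcl2_ul + enhancer_ul + nuclease_ul
    stop_ul = reaction_ul / 10.0                 # assumption: 1/10th of reaction volume
    return {
        "DNA": dna_ul,
        "MgCl2": mgcl2_ul,
        "Enhancer S": enhancer_ul,
        "Nuclease S": nuclease_ul,
        "reaction": reaction_ul,
        "Stop Solution": stop_ul,
    }

setup = surveyor_digest(25.0)   # hybridized product at 25 ng/ul
for name, vol in setup.items():
    print(f"{name:14s}{vol:6.2f} ul")
```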

Step 4. Separate cleavage products:
Samples can be separated on various platforms. Amplified DNA fragments in the size range of 200 to
4,000 bp are most effectively resolved from potential digestion products on agarose gels (2%).

Promoter Mining
D. Prasath and P.R. Rahul
Indian Institute of Spices Research, Marikunnu P.O., Calicut- 673012.

A promoter is the DNA region, usually upstream of the coding sequence of a gene or operon, which binds
and directs RNA polymerase to the correct transcriptional start site and thus permits the initiation of
transcription. Identification of transcriptional regulatory elements within promoter regions is of
striking interest for biologists since these elements govern the regulation of gene expression.
Transcription factors, which are proteins, play a major role in gene regulation of eukaryotic
organisms. The factors can bind to specific sites, termed transcription factor binding sites or
regulatory sites, in the promoter region of particular genes and interact with RNA polymerase and
other factors to regulate the transcription of a gene. Transcription factors are said to cooperatively
regulate the transcription of genes.

PCR Based Genome Walking
This protocol was adapted from Siebert et al. (1995) and www.clontech.com. The protocol involves a
nested PCR with a touch down program. This method is very prone to artefacts, so you need to follow
rigorously the requirements for the design of primers and the touch-down PCR program.

General outline
[Figure: general outline of the genome walking procedure]

Digestion of genomic DNA
1. Cut with a 6 bp blunt-end cutter. You can either use one enzyme or set up 4 different enzyme reactions
to maximize gene coverage. Set up the reaction:
2.5 µg of DNA
5 µl of restriction enzyme
10 µl 10X restriction buffer
10 µl 1 mg/ml BSA (if not already in 10X buffer)
dH₂O to a final reaction volume of 100 µl.
2. Incubate for 5 h at the appropriate temperature.
3. Add 100 µl 25:24:1 phenol:chloroform:IAA. Vortex and spin for 5 min in a microfuge. Remove the aqueous
layer to a fresh tube.
4. Add 2.5 volumes ethanol and 0.1 volume 3 M NaOAc pH 4.5. Precipitate at -20 °C overnight or at
-80 °C for 1 hr.
5. Spin in a microfuge for 10 min at full speed. Wash the pellet with 70% ethanol, spin again and drain off
the supernatant.
6. Resuspend the pellet in 20 µl dH₂O.
7. Run 1 µl on a 1% gel, 30 V for 5 hours. A good smear should be seen.

Adapter Ligation
1. To make the adaptor, mix the long and short primers (see below) in the right concentration to get a
25 mM final concentration [e.g. 20 µl long primer (50 mM) + 20 µl short (50 mM)]. Place at 100 °C for
2 min and then let cool to room temp.
2. Use a concentrated T4 DNA ligase (5x) (Biolabs, M0202T) and set up the following reaction:
5 µl of digested DNA
1 µl 10X ligase buffer
1 µl T4 ligase
2.4 µl adaptor (25 mM)
0.6 µl dH₂O
3. Incubate overnight at 16 °C.
4. Stop the ligase activity by incubating the reactions at 70 °C for 5 min.
5. Add 90 µl TE (10 mM Tris pH 7.5, 1 mM EDTA). This gives you 100 µl of a library that can be used
for more than 200 PCR reactions.

Touch down PCR
Two nested reactions need to be carried out using the AP1 and AP2 primers illustrated on the
previous page and two gene specific primers. The gene specific primers need to be designed
(www.clontech.com)

1. Set up the following reaction using an Expand or other enhanced polymerase:
1 µl adapter-ligated library
5 µl 10X PCR buffer
4 µl 25 mM MgCl2
1 µl 10 mM dNTPs
1 µl AP1 primer (10 µM)
1 µl gene-specific primer 1 (10 µM)
0.5 µl DNA polymerase
35.5 µl dH₂O
50 µl TOTAL
2. Cycle as follows:
94 °C (25 s), 72 °C (3 min) x 7
94 °C (25 s), 67 °C (3 min) x 32
67 °C (7 min) x 1
cool to 4 °C.

3. Analyze 8 µl of the reaction on a 1.5% agarose gel. You should observe banding patterns; however,
there may be some smearing.
4. Dilute 1 µl of each primary PCR into 49 µl dH₂O.
5. Set up the nested reaction mix:
1 µl diluted primary PCR reaction
5 µl 10X PCR buffer
4 µl 25 mM MgCl2
1 µl 10 mM dNTPs
1 µl AP2 primer (10 µM)
1 µl gene-specific primer 2 (10 µM)
0.5 µl DNA polymerase
35.5 µl dH₂O
50 µl TOTAL
6. Cycle as follows:
94 °C (25 s), 72 °C (3 min) x 5
94 °C (25 s), 67 °C (3 min) x 20
67 °C (7 min) x 1; cool to 4 °C.

7. Analyze 5 µl of the reaction on a 1.5% agarose gel. You should observe distinct banding patterns.
The remainder of the PCR reaction can then be used to clone and sequence the band of interest.
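
When setting up the two reactions above it is worth checking that the listed volumes reach the stated total. In the sketch below water is treated as the make-up volume; the components other than water add up to 13.5 µl, so the make-up volume for a 50 µl reaction comes out as 36.5 µl rather than the 35.5 µl printed in the list, and it is worth confirming which figure is intended before pipetting. The dictionary keys and function names are illustrative.

```python
# Non-water components (ul) of the primary genome-walking PCR listed above
PRIMARY_UL = {
    "adapter-ligated library": 1.0,
    "10X PCR buffer": 5.0,
    "25 mM MgCl2": 4.0,
    "10 mM dNTPs": 1.0,
    "AP1 primer (10 uM)": 1.0,
    "gene-specific primer 1 (10 uM)": 1.0,
    "DNA polymerase": 0.5,
}

def water_to_volume(components_ul, total_ul=50.0):
    """Water needed so that all components sum to total_ul."""
    return total_ul - sum(components_ul.values())

def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution when sample_ul is mixed into diluent_ul."""
    return (sample_ul + diluent_ul) / sample_ul

water_ul = water_to_volume(PRIMARY_UL)
nested_dilution = dilution_factor(1.0, 49.0)   # step 4: 1 ul into 49 ul

print(f"water to 50 ul: {water_ul:.1f} ul")
print(f"primary PCR is diluted {nested_dilution:.0f}-fold before the nested PCR")
```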

Selected references:
Siebert et al. 1995. An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids
Research, 23(6):1087-1088.
www.clontech.com






TOOLS FOR GENETIC DIVERSITY ANALYSIS
Rajesh M.K.*¹ and Jayasekhar S.²
¹Division of Crop Improvement, ²Division of Social Sciences
Central Plantation Crops Research Institute, Kasaragod 671124, Kerala
(*E-mail: mkraju_cpcri@yahoo.com)
Introduction
Plant genetic resources constitute the chief component of agro-biodiversity and comprise
land races, modern cultivars and obsolete varieties, breeding lines and genetic stocks and wild
species. They provide the basic materials to the plant breeders to utilize genetic variability for the
development of high yielding cultivars with a broad genetic base. The utilization of these genetic
resources, however, depends upon their efficient and adequate characterization and evaluation, which
in turn entails efficient characterization standards and appropriate strategies.
Analysis of trait data generated from characterization and evaluation of the genetic resources
is used to understand and use diversity. Currently, a large number of distance measures are available
for analyzing similarity/dissimilarity among accessions based on different traits representing different
types of variables. The selection of the most appropriate distance measure for each trait is the
prerequisite for diversity analysis studies. One of the approaches is to form clusters where accessions
between clusters would be more diverse than the accessions within a cluster. The clustering
algorithms require a distance/similarity matrix between the accessions which can be calculated
depending upon the nature or type of traits such as morphological and agronomic traits and/or
molecular markers.
The availability of cost-efficient, large scale genotyping techniques has greatly facilitated the
assessment of genetic diversity within populations. Various computational tools have also been
developed concurrently to analyze the genetic data derived from the genotyping experiments. In this
review, the basics of population genetics, important parameters in genetic diversity analysis and the
most widely used computer programmes in population genetic studies have been described.

Basics of population genetics
Variation in alleles allows organisms to adapt to ever-changing environments. Alleles are
different forms of the same gene that are expressed as different phenotypes. All of the alleles shared
by all of the individuals in a population make up the population's gene pool. In diploid organisms,
every gene is represented by two alleles, one inherited from each parent. The pair of alleles may differ
from one another, in which case it is said that the individual is "heterozygous" for that gene. If the
two alleles are identical, it is said that the individual is "homozygous" for that gene.
Population genetics is the study of allele frequency distribution and change under the
influence of the four main evolutionary processes: natural selection, genetic drift, mutation and gene
flow. It also takes into account the factors of population subdivision and population structure and
attempts to explain such phenomena as adaptation and speciation.
Based on Mendelian genetics, it is possible to predict the probability of the appearance of a
particular allele in an offspring when the alleles of each parent are known. Similar predictions can be
made about the frequencies of alleles in the next generation of an entire population. By comparing the
predicted or "expected" frequencies with the actual or "observed" frequencies in a real population,
one can infer a number of possible external factors that may be influencing the genetic structure of the
population (such as inbreeding or selection).
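
The comparison of expected and observed frequencies described above is commonly made against Hardy-Weinberg proportions for a locus with two alleles. The sketch below is one minimal way to do it; the genotype counts are made-up illustrative numbers, the function name is an assumption, and the 3.84 threshold is the usual 5% critical value of chi-square with one degree of freedom.

```python
def hardy_weinberg(n_AA, n_Aa, n_aa):
    """Allele frequencies, expected genotype counts and chi-square statistic
    for a biallelic locus under Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele a
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, q, expected, chi2

# Hypothetical sample of 100 individuals
p, q, expected, chi2 = hardy_weinberg(50, 30, 20)
print(f"p = {p:.2f}, q = {q:.2f}")
print("expected counts:", [round(e, 2) for e in expected])
print(f"chi-square = {chi2:.2f} (compare with 3.84 for 1 d.f. at the 5% level)")
```

In this made-up sample the heterozygotes are fewer than expected, which is the kind of departure that inbreeding or population subdivision can produce.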
A population is defined as a group of interbreeding individuals that exist together at the same
time. A population may either be considered as a single unit or it can be subdivided into smaller units.
Subdivisions of a population may be the result of ecological factors or behavioural factors. If a
population is subdivided, the genetic links among its parts may differ, depending on the real degree of
gene flow taking place.

A population is considered structured if:
(i) Genetic drift is occurring in some of its subpopulations,
(ii) Migration does not happen uniformly throughout the population, or
(iii) Mating is not random throughout the population.
A population's structure affects the extent of genetic variation and its patterns of distribution.

Genetic drift
Genetic drift refers to fluctuations in allele frequencies that occur by chance (particularly in
small populations) as a result of random sampling among gametes, i.e. random changes in gene
frequency which are not due to selection, gene mutation or migration. Genetic drift decreases
diversity within a population because it tends to cause the loss of rare alleles, reducing the overall
number of alleles. Because of genetic drift, small, isolated populations often have unusual frequencies
of a few alleles.

Gene flow
Gene flow is the passage and establishment of genes typical of one population in the gene
pool of another by natural or artificial hybridization and backcrossing. Non-random mating occurs
when individuals that are more closely related (inbreeding) or less closely related (outbreeding) mate more often than
would be expected by chance for the population. Self-pollination or inbreeding is similar to mating
between relatives. It increases the homozygosity of a population and its effect is generalized for all
alleles. Inbreeding per se does not change the allelic frequencies but, over time, it leads to
homozygosity by slowly increasing the two homozygous classes.
Mutations could lead to occurrence of new alleles, which may be favourable or deleterious to
the individual's ability to survive. If changes are advantageous, then the new alleles will tend to
prevail by being selected in the population. The effect of selection on diversity may be:
(i) Directional, where it decreases diversity;
(ii) Balancing, where it increases diversity. Heterozygotes have the highest fitness, so selection
favours the maintenance of multiple alleles; and
(iii) Frequency dependent, where it increases diversity. Fitness is a function of allele or
genotype frequency and changes over time.

Migration
Migration implies not only the movement of individuals into new populations but that this
movement introduces new alleles into the population (gene flow). Changes in gene frequencies will
occur through migration either because more copies of an allele already present will be brought in or
because a new allele arrives. Various factors which affect migration in crop species include breeding
system, sympatry with wild and/or weedy relatives, pollinators, and seed dispersal. The immediate
effect of migration is to increase a population's genetic variability and, as such, it helps increase the
ability of that population to withstand environmental changes. Migration also helps blend
populations and prevent their divergence.

Hardy-Weinberg Principle
The foundation for population genetics was laid in 1908, when Godfrey Hardy and Wilhelm
Weinberg independently published what is now known as the Hardy-Weinberg Equilibrium or
Hardy-Weinberg Principle, which states: "In a large, randomly breeding (diploid) population, allelic
frequencies will remain the same from generation to generation, assuming no unbalanced mutation,
gene migration, selection or genetic drift." When a population meets all of the Hardy-Weinberg
conditions, it is said to be in Hardy-Weinberg equilibrium. The "equilibrium" is a simple prediction of
genotype frequencies in any given generation, and the observation that the genotype frequencies are
expected to remain constant from generation to generation as long as several simple assumptions are
met. This description of stasis provides a counterpoint to studies of how populations change over
time.

Testing for Hardy-Weinberg Equilibrium
The deviation of a population from Hardy-Weinberg equilibrium is an indication of the intensity of
external factors and can be determined by a statistical formula called a chi-square, which is used to
compare observed versus expected outcomes. The statistical test follows this formula:
$\mathrm{HWT} = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$

Where HWT = Statistical test for Hardy-Weinberg Equilibrium; $O_i$ = Observed frequencies and $E_i$ =
Expected frequencies.
If $\chi^2_{cal} \le \chi^2_{tab}$, then the $H_0$ hypothesis is accepted and it follows that allele frequencies for loci in a given
population are in Hardy-Weinberg equilibrium. If $\chi^2_{cal} > \chi^2_{tab}$, then the $H_0$ hypothesis is rejected.
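The test above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for the packages listed later: it assumes a single locus with two alleles, works on genotype counts rather than frequencies, and uses the tabulated chi-square value 3.841 (1 degree of freedom, 5% level). The function name and the genotype counts are hypothetical.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic for one biallelic locus under Hardy-Weinberg.

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)          # frequency of allele A
    q = 1 - p                                # frequency of allele a
    observed = [n_AA, n_Aa, n_aa]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Tabulated chi-square for 1 d.f. at alpha = 0.05
# (3 genotype classes - 1 - 1 estimated allele frequency = 1 d.f.).
CHI2_TAB = 3.841

stat = hwe_chi_square(n_AA=50, n_Aa=30, n_aa=20)   # hypothetical counts
decision = "reject H0" if stat > CHI2_TAB else "accept H0"
print(round(stat, 2), decision)
```

With these counts the heterozygote class is well below its expectation, so the calculated value exceeds the tabulated one and the null hypothesis is rejected.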
Important parameters in genetic diversity analysis
(A) Polymorphism or rate of polymorphism: A polymorphic gene is usually defined as one for
which the most common allele has a frequency of less than 0.95.
$P_j = q \le 0.95$
Where, $P_j$ = rate of polymorphism and q = allele frequency (of the most common allele)

For a correct estimation of genetic distance, the genetic loci used in genetic distance analysis
should be informative, i.e., they should display sufficient polymorphism. The limit of allele
frequency, which is set at 0.95, is arbitrary, its objective being to help identify those genes in which
allelic variation is common. Rare alleles are defined as those with frequencies of less than 0.005.
This index is best applied with codominant markers. It can also be used with dominant
markers, but restrictively, as the estimate based on dominant markers would be biased below the
real number.
(B) Average number of alleles per locus: It is the sum of all the detected alleles in all loci,
divided by the total number of loci. This parameter, which provides complementary information
to that of polymorphism, is given by:
$N = \frac{1}{k} \sum_{i=1}^{k} n_i$
Where: k = Number of loci and $n_i$ = Number of alleles detected at locus i
This parameter is best applied in the case of codominant markers as dominant markers do not permit
the detection of all alleles.
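As an illustration of parameters (A) and (B), the sketch below computes the proportion of polymorphic loci and the average number of alleles per locus. The allele frequencies, locus names and function names are hypothetical, and the 0.95 limit is the arbitrary threshold described above.

```python
# Hypothetical allele frequencies at four codominant loci (each list sums to 1).
loci = {
    "locus1": [0.60, 0.30, 0.10],
    "locus2": [0.97, 0.03],
    "locus3": [0.50, 0.50],
    "locus4": [1.00],
}

def is_polymorphic(freqs, limit=0.95):
    """A locus counts as polymorphic if its most common allele is <= limit."""
    return max(freqs) <= limit

def proportion_polymorphic(loci, limit=0.95):
    """Fraction of loci that meet the polymorphism criterion."""
    return sum(is_polymorphic(f, limit) for f in loci.values()) / len(loci)

def mean_alleles_per_locus(loci):
    """N = (1/k) * sum of n_i, with k loci and n_i alleles at locus i."""
    return sum(len(f) for f in loci.values()) / len(loci)

print(proportion_polymorphic(loci), mean_alleles_per_locus(loci))
```

Here locus2 has two alleles but is not counted as polymorphic at the 0.95 limit, which shows why the two parameters are complementary.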
(C) Effective number of alleles: This measure, which describes the number of alleles that
would be expected in a locus in each population, is given by:
$A_e = \frac{1}{1 - h} = \frac{1}{\sum_i p_i^2}$
Where, $p_i$ = frequency of the i-th allele in a locus and $h = 1 - \sum_i p_i^2$ = heterozygosity in a locus, which
lies between 0 and 1. $A_e$ can be calculated for both dominant and co-dominant markers. By taking
allele frequencies into account, this descriptor of allelic richness is less sensitive to rare alleles.
This parameter plays a fundamental role in verification of sampling strategies. However, its
calculation is affected by the sample size.
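A short sketch of the effective number of alleles follows; the function name and the example frequencies are hypothetical. It shows that a locus with two equally frequent alleles has two effective alleles, whereas a rare second allele contributes very little.

```python
def effective_alleles(freqs):
    """A_e = 1 / sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 / sum(p * p for p in freqs)

even = effective_alleles([0.5, 0.5])      # two equally frequent alleles
skewed = effective_alleles([0.9, 0.1])    # one common and one rarer allele
print(round(even, 3), round(skewed, 3))
```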

(D) Observed Heterozygosity: A population's heterozygosity is measured by first
determining the proportion of genes that are heterozygous and the number of individuals that are
heterozygous for each particular gene. For a single gene locus with two alleles, the Observed
Heterozygosity ($H_o$) is calculated as follows:
$H_o = \frac{\text{Number of heterozygotes at a locus}}{\text{Total number of individuals surveyed}}$
Derivations of the above formula are used to calculate $H_o$ when there are more than two
alleles for a particular locus, which is particularly common when microsatellite or simple sequence
repeat (SSR) markers are applied for analysis of populations.
(E) Expected Heterozygosity: The Expected Heterozygosity ($H_e$) is defined as the estimated
fraction of all individuals that would be heterozygous for any randomly chosen locus. It is the
probability that, at a single locus, any two alleles, chosen at random from the population, are different
from each other. For a locus j with I alleles, it is calculated as:

$h_j = 1 - \sum_{i=1}^{I} p_i^2$

Where, $h_j$ = heterozygosity per locus and $p_i$ = frequency of the i-th allele
$H_e$ differs from $H_o$ because it is a prediction based on the known allele frequency from a sample of
individuals. Deviation of the observed from the expected can be used as an indicator of important
population dynamics.
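The two heterozygosity measures can be compared on a small data set. The sketch below assumes diploid genotypes at a single codominant locus, scored as pairs of allele names; the genotypes and function names are hypothetical.

```python
from collections import Counter

# Hypothetical diploid genotypes of ten individuals at one codominant locus.
genotypes = [("A", "A"), ("A", "B"), ("A", "B"), ("B", "B"), ("A", "C"),
             ("A", "A"), ("B", "C"), ("A", "B"), ("A", "A"), ("B", "B")]

def observed_heterozygosity(genotypes):
    """H_o = heterozygous individuals / individuals surveyed."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(genotypes):
    """h_j = 1 - sum(p_i^2), with p_i estimated from the sampled alleles."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return 1 - sum((c / total) ** 2 for c in counts.values())

h_o = observed_heterozygosity(genotypes)
h_e = expected_heterozygosity(genotypes)
print(round(h_o, 3), round(h_e, 3))
```

In this sample the allele frequencies are 0.5, 0.4 and 0.1, so the expected value is somewhat higher than the observed one, i.e. there are fewer heterozygotes than predicted.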
(F) Effective Population Size: One of the many variables of population dynamics that can
influence the rate and size of fluctuation in allele frequencies is population size. Genetic drift, the
random increase or decrease of an allele's frequency, affects small populations more severely than
large ones, since alleles are drawn from a smaller parental gene pool. The rate of change in allele
frequencies in a population is determined by the population's effective population size. The effective
population size is the number of individuals that evenly contribute to the gene pool.
The actual number of individuals in a population is rarely the effective population size. This
is because some individuals reproduce at a higher rate than others (have a higher fitness), the
distribution of males and females may result in some individuals being unable to secure a mate, or
inbreeding reduces the unique contribution of an individual. The effective population size is a
theoretical measure that compares a population's genetic behavior to the behavior of an "ideal"
population. As the effective population size becomes smaller, the chance that allele frequencies will
shift due to chance (drift) alone becomes greater.
(G) Shannon index: Shannon's Information Index is estimated as a measure of gene diversity. It is
based on information theory and is a measure of the average degree of "uncertainty" in predicting to
what species an individual chosen at random from a collection of S species and N individuals will
belong. This average uncertainty increases as the number of species increases and as the distribution
of individuals among the species becomes more even. The proportion of individuals belonging to species i
relative to the total number of individuals ($p_i$) is calculated, and then multiplied by the natural logarithm
of this proportion ($\ln p_i$); the negative sum of these products over all species gives Shannon's Index (H).
$H = -\sum_{i=1}^{S} p_i \ln p_i$
It can be shown that for any given number of species, there is a maximum possible H, $H_{max} = \ln S$,
which occurs when all species are present in equal numbers. When Shannon index is near 1, it can be
concluded that the population is highly heterozygous.
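The behaviour of the index can be checked numerically. The sketch below is a minimal illustration with hypothetical counts; in a marker study the categories would be alleles or bands rather than species.

```python
import math

def shannon_index(counts):
    """H = -sum(p_i * ln p_i) over categories with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

h_even = shannon_index([10, 10, 10, 10])   # four equally common categories
h_skewed = shannon_index([37, 1, 1, 1])    # one dominant category
h_max = math.log(4)                        # maximum possible H for S = 4
print(round(h_even, 3), round(h_skewed, 3), round(h_max, 3))
```

The even distribution reaches the maximum ln S, while the skewed distribution with the same number of categories gives a much lower value.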
(H) Inbreeding and Relatedness: Small effective population size can result in a high occurrence
of inbreeding, or mating between close relatives. One of the effects of inbreeding is a decrease in the
heterozygosity (increase in homozygosity) of the population as a whole, which means a decrease in
the number of heterozygous genes in the individuals. This effect places individuals and the population
at a greater risk from homozygous recessive diseases that result from inheriting a copy of the same
recessive allele from both parents. The impact of accumulating deleterious homozygous traits is
called inbreeding depression - the loss in population vigor due to loss in genetic variability.
Wright (1951) developed a set of parameters called F-statistics. The inbreeding coefficient
($F_{IS}$) is defined as the probability that two homologous (same) alleles present in the same individual are
identical by descent. $F_{IS}$ is calculated by comparing the expected heterozygosity ($H_e$) with observed
heterozygosity ($H_o$), and ranges from -1 (no inbreeding) to +1 (complete identity). If the values for
both observed and expected heterozygosity are the same, $F_{IS}$ will be zero. A positive value indicates
that there is an increased number of homozygotes, and population may be inbred - the larger the
number, the greater the extent of inbreeding. A negative value indicates that there are more
heterozygous individuals than would be expected; this might happen for the first few generations after
two previously isolated populations become one.

The relationships among the F statistics can be deduced through the following:
$(1 - F_{IT}) = (1 - F_{IS})(1 - F_{ST})$
$F_{IT} = 1 - (H_I / H_T)$
$F_{IS} = 1 - (H_I / H_S)$
$F_{ST} = 1 - (H_S / H_T)$
Where, $H_T$ = total gene diversity or expected heterozygosity in the total population as estimated from
the pooled allele frequencies, $H_I$ = intrapopulation gene diversity or average observed heterozygosity
in a group of populations, and $H_S$ = average expected heterozygosity estimated from each
subpopulation.

These statistical indices measure:
$F_{IS}$ = the deficiency or excess of average heterozygotes in each population
$F_{ST}$ = the degree of gene differentiation among populations in terms of allele frequencies
$F_{IT}$ = the deficiency or excess of average heterozygotes in a group of populations
The chi-square test can be used to statistically analyze whether the difference between the
observed and expected heterozygosity is unlikely to be due to chance. If the observed number of
heterozygotes is significantly higher than expected, inbreeding can be ruled out as a possible
population dynamic that is influencing the genotype frequencies.
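The three indices and the relationship between them can be verified with a small calculation. The heterozygosity values below are hypothetical and the function name is illustrative only.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_ST, F_IT) from the three heterozygosity levels.

    h_i: average observed heterozygosity within subpopulations
    h_s: average expected heterozygosity within subpopulations
    h_t: expected heterozygosity of the pooled (total) population
    """
    f_is = 1 - h_i / h_s
    f_st = 1 - h_s / h_t
    f_it = 1 - h_i / h_t
    return f_is, f_st, f_it

f_is, f_st, f_it = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
print(round(f_is, 3), round(f_st, 3), round(f_it, 3))

# The identity (1 - F_IT) = (1 - F_IS)(1 - F_ST) holds for any input.
identity_gap = (1 - f_it) - (1 - f_is) * (1 - f_st)
```

All three values are positive in this example, indicating a heterozygote deficit within subpopulations as well as differentiation among them.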
Corrections for Sampling Error:
There are two sources of allele frequency difference among subpopulations in a sample:
(i) Real differences in the allele frequencies among our sampled subpopulations (ii) Differences that
arise because allele frequencies in our samples differ from those in the subpopulations from which
they were taken.
Nei and Chesser (1983) described the $G_{ST}$ approach to account for the sampling error. $G_{ST}$ is
an interpopulation differentiation measure when multiple loci are used for analysis. It measures the
proportion of gene diversity that is distributed among populations, when a large number of loci are
sampled.

$G_{ST} = D_{ST} / H_T$
where, $D_{ST}$ = interpopulation diversity,
$H_T$ = total diversity ($H_S + D_{ST}$),
$H_S$ = intrapopulation genic diversity, and
$D_{ST} = H_T - H_S$.
Because of the complexity of its components, calculation of $G_{ST}$ requires specialized
computer software. It can be used with codominant markers and restrictedly with dominant markers,
since it is a measure of heterozygosity. Weir and Cockerham (1984) described another statistic, $\theta$,
which incorporates an important source of sampling error ignored by $G_{ST}$.
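The decomposition of total diversity can be illustrated for a single locus. The sketch below is a simplified version that assumes equally weighted subpopulations and known allele frequencies; it deliberately omits the sample-size corrections of Nei and Chesser (1983), which dedicated software applies. The function names and frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """1 - sum(p_i^2) for one set of allele frequencies."""
    return 1 - sum(p * p for p in freqs)

def gst(subpop_freqs):
    """Uncorrected G_ST = D_ST / H_T for one locus, subpopulations weighted equally.

    subpop_freqs: one allele-frequency list per subpopulation, same allele order.
    """
    k = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    h_s = sum(gene_diversity(f) for f in subpop_freqs) / k      # within subpopulations
    pooled = [sum(f[a] for f in subpop_freqs) / k for a in range(n_alleles)]
    h_t = gene_diversity(pooled)                                # total diversity
    d_st = h_t - h_s                                            # among subpopulations
    return d_st / h_t

# Two hypothetical subpopulations with opposite allele frequencies.
value = gst([[0.8, 0.2], [0.2, 0.8]])
print(round(value, 3))
```

Identical subpopulations give a value of zero, because all diversity is then found within subpopulations.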
Measurement of genetic distance
Various genetic distance measures have been proposed for analysis of molecular marker data,
depending on whether the markers are dominant or co-dominant. For dominant markers, the total
number of bands is conventionally set as the number of analyzed loci. For co-dominant markers,
the number of alleles per locus determined for the total collection is in general higher than two,
as opposed to the 1 and 0 (band present or absent) alleles of dominant markers. Generally,
genetic distances for codominant markers are based on allele frequencies.
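For two individuals, let a be the number of bands present in both, b and c the numbers of bands
present in only the first or only the second individual, and d the number of bands absent in both.
If we assume that a = 3, b = 1, c = 3 and d = 2 then:
(i) Dice and Nei and Li: a/[a + (b + c)/2] = 0.6
(ii) Jaccard: a/(a + b + c) = 0.43
(iii) Sokal and Sneath: a/[a + 2(b + c)] = 0.273
(iv) Rogers and Tanimoto: (a + d)/[a + d + 2(b + c)] = 0.385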
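The four coefficients can be evaluated directly from binary band profiles. In the sketch below the two profiles are hypothetical and are constructed to give the same counts as the example (a = 3, b = 1, c = 3, d = 2). It assumes the usual convention that a counts bands shared by both individuals, b and c count bands present in only one of them, and d counts double absences.

```python
def band_counts(x, y):
    """Count shared presences (a), single presences (b, c) and shared absences (d)."""
    a = sum(i == 1 and j == 1 for i, j in zip(x, y))
    b = sum(i == 1 and j == 0 for i, j in zip(x, y))
    c = sum(i == 0 and j == 1 for i, j in zip(x, y))
    d = sum(i == 0 and j == 0 for i, j in zip(x, y))
    return a, b, c, d

def similarities(a, b, c, d):
    """Similarity coefficients for one pair of individuals."""
    return {
        "dice_nei_li": a / (a + (b + c) / 2),
        "jaccard": a / (a + b + c),
        "sokal_sneath": a / (a + 2 * (b + c)),
        "rogers_tanimoto": (a + d) / (a + d + 2 * (b + c)),
    }

# Hypothetical 1/0 band profiles of two individuals over nine marker bands.
ind1 = [1, 1, 1, 1, 0, 0, 0, 0, 0]
ind2 = [1, 1, 1, 0, 1, 1, 1, 0, 0]

counts = band_counts(ind1, ind2)
sims = similarities(*counts)
for name, value in sims.items():
    print(name, round(value, 3))
```

Only the Rogers and Tanimoto coefficient uses d, so it is the only one of the four that changes when markers absent in both individuals are added to the data set.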

The Jaccard coefficient only counts bands present in either individual and treats double
absences as missing data. If false-positive or false-negative data occur, the index estimate tends to be
biased. It can be applied with co-dominant marker data. The Nei and Li coefficient counts the percentage
of shared bands between two individuals and gives more weight to those bands that are present in both.
It considers that absence has less biological significance, and so this coefficient has complete
meaning in terms of DNA similarity. It can be applied with codominant marker data (RFLP, SSR).
Multivariate analysis
One of the main concerns of plant breeders is to quantify the degree of dissimilarity in
genetic resources, since knowledge concerning genetic distances is necessary for optimum
organization of gene banks and for identifying parental combinations that produce progenies with
maximum genetic variability, thereby increasing the chances of obtaining superior individuals
(Mohammadi and Prasanna, 2003). Use of multivariate statistical algorithms is considered an
important strategy to quantify genetic similarity. Multivariate analysis is based on the statistical
principle of multivariate statistics, which involves observation and analysis of more than one
statistical variable at a time. Multivariate techniques permit standardization of multiple types of
information of a set of characteristics. The most widely used algorithms are principal component and
canonical variable analysis, as well as clustering methods.
The principle of clustering methods is to join genotypes into groups, so that there is
uniformity within and heterogeneity among groups. These methods depend on previous estimates of
dissimilarity measures derived from discrete and continuous (or categorical) variables. These
categorical variables can be defined as binary, nominal or ordinal. Among grouping methods,
hierarchical clustering has been used most frequently, particularly the single linkage (SL) and
unweighted pair group method using arithmetic averages (UPGMA) methods. The reliability of
clustering methods depends on the magnitude of the cophenetic correlation, which is the association
between the genetic distance matrix and the matrix based on genotype grouping. SL treats band
absence as corresponding to homozygous loci, so it can be used with dominant markers (RAPD, AFLP),
where absence could correspond to homozygous recessives. UPGMA is the most commonly used method
for cluster analysis, but it is appropriate only when the evolutionary rate is nearly the same for all
groups included in the study. When studying the genetic diversity of a germplasm collection, the SL
method should be preferred over the UPGMA clustering method, because genetic differences among
accessions are predominantly determined by selection and breeding rather than by evolutionary forces.
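To make the averaging step of UPGMA concrete, a minimal pure-Python sketch is given below. It is an illustration only, not the implementation used by any of the packages mentioned in this chapter; the function name, genotype labels and distances are hypothetical. Merged clusters are represented as nested tuples and the value reported for each merge is the average distance between the two clusters joined.

```python
def upgma(labels, dist):
    """Return the merge order as (cluster_x, cluster_y, distance) tuples.

    dist maps a pair of labels, e.g. ("A", "B"), to their genetic distance.
    """
    sizes = {lab: 1 for lab in labels}                  # cluster -> number of members
    d = {frozenset(pair): v for pair, v in dist.items()}
    merges = []
    while len(sizes) > 1:
        pair = min(d, key=d.get)                        # closest pair of clusters
        x, y = sorted(pair, key=str)                    # fixed order for reproducible output
        dist_xy = d.pop(pair)
        nx, ny = sizes.pop(x), sizes.pop(y)
        new = (x, y)                                    # merged cluster as a nested tuple
        for z in sizes:
            dxz = d.pop(frozenset((x, z)))
            dyz = d.pop(frozenset((y, z)))
            # Size-weighted mean = arithmetic average over all original members.
            d[frozenset((new, z))] = (nx * dxz + ny * dyz) / (nx + ny)
        sizes[new] = nx + ny
        merges.append((x, y, dist_xy))
    return merges

# Hypothetical pairwise distances among four genotypes.
labels = ["A", "B", "C", "D"]
dist = {("A", "B"): 0.2, ("C", "D"): 0.3, ("A", "C"): 0.6,
        ("A", "D"): 0.7, ("B", "C"): 0.8, ("B", "D"): 0.9}

merges = upgma(labels, dist)
for x, y, h in merges:
    print(x, y, round(h, 3))
```

Single linkage would differ only in the update step, where the minimum of the two distances is kept instead of their weighted mean.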
Resampling is a term used in statistics for bootstrapping and permutation. These procedures
can be used in genetic diversity studies to assign confidence to the presence of clusters in a
dendrogram. Bootstrapping is a statistical method for estimating the sampling distribution of an
estimator by sampling with replacement from the original sample. The major purpose of bootstrapping is
deriving robust estimates of standard errors and confidence intervals of population parameters. A
permutation test is a type of statistical significance test in which a reference distribution is obtained by
calculating all possible values of the test statistic under rearrangements of the labels on the observed
data points.
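The bootstrap idea can be sketched on a simple statistic. The example below resamples loci with replacement to obtain a percentile confidence interval for mean gene diversity; support values for dendrogram clusters follow the same resampling principle but require rebuilding the tree for each replicate, which is not shown. The per-locus values, the function name and the settings (2000 replicates, fixed seed) are hypothetical choices.

```python
import random
from statistics import mean

def bootstrap_ci(values, stat, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)                 # fixed seed for a repeatable sketch
    n = len(values)
    reps = sorted(
        stat([rng.choice(values) for _ in range(n)])   # resample with replacement
        for _ in range(n_boot)
    )
    lower = reps[int((alpha / 2) * n_boot)]
    upper = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical gene diversity estimates at eight loci.
per_locus_h = [0.42, 0.58, 0.31, 0.66, 0.50, 0.47, 0.39, 0.61]

lower, upper = bootstrap_ci(per_locus_h, mean)
print(round(mean(per_locus_h), 4), round(lower, 3), round(upper, 3))
```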
Steps involved in analysis of molecular marker data
Three main steps are involved in the statistical analysis of molecular data in diversity studies:
A. Data collection: The data on molecular markers is recorded in the following two forms:
(a) Binary data: presence or absence of molecular marker bands
(b) Allelic data (based on allele size)
B. Data analysis using univariate and multivariate statistical approaches
C. Interpretation of the data.
Each step in the process should follow a standardized format if the output of one diversity
study is to be compared to other studies and inferences drawn in this manner.
Software programs for analyzing genetic diversity
Many software programs for molecular population genetics studies have been developed; the
important ones are given below:
(i) CONVERT (http://www.agriculture.purdue.edu/fnr/html/faculty/rhodes/students%20and%20staff/glaubitz/software.htm)
CONVERT is a user-friendly, 32-bit Windows program that aids conversion of diploid
genotypic data files into formats that can be directly read by a number of commonly used population
genetic computer programs: GDA, GENEPOP, ARLEQUIN, POPGENE, MICROSAT, PHYLIP and
STRUCTURE (Glaubitz, 2004). In addition, CONVERT can be used to produce a table of allele
frequencies in a convenient format, allowing the visual comparison of allele frequencies across
populations. The input file for CONVERT follows a 'standard' format that can be easily obtained via an
EXCEL file containing the genotypic data. CONVERT can also read in input data files in GENEPOP
format. CONVERT works on Windows 95/98/NT/2000/XP platforms.
(ii) ARLEQUIN (http://cmpg.unibe.ch/software/arlequin/)
Released first in 1997, Arlequin is a freely available integrated population genetics software
environment (Schneider et al., 1997). It is able to handle both large samples of molecular data
(RFLPs, DNA sequences, microsatellites) and also conventional genetic data (standard multi-locus
data or allele frequency data). Molecular data can be entered as DNA sequences, RFLP haplotypes,
microsatellite profiles, or multilocus haplotypes. The graphical interface is designed to allow users to
rapidly select the different analyses they want to perform on their data.
The data format is specified in an input file. The user can create a data file from scratch,
using a text editor and appropriate keywords, or use the Project Outline Wizard. Data can be
imported from files created for other programs, including MEGA, BIOSYS, GENEPOP, and
PHYLIP. Missing or ambiguous data can be included. A very detailed user manual is available, which
includes a large amount of theoretical information, formulae, and references. A large number of data
can be analysed, and a Batch Files option is also available.
(iii) POWERMARKER (http://statgen.ncsu.edu/powermarker/)
PowerMarker was designed specifically for the use of SSR/SNP data in population genetics
analyses (Liu, 2003). Data can be imported from Excel or other formats, making data set-up very
easy. Data can also be exported to NEXUS and Arlequin formats. It includes a 2D viewer for
linkage disequilibrium visualization. The user can edit graphics within PowerMarker or export them
for publication. The program has been tested extensively for accuracy and efficiency. Full
documentation is included. Several new modules for association study are included in the package.
Several demonstration datasets are available to get started. The program is free, but requires having
PHYLIP, TreeView and the Microsoft.net framework system (all freely available) and Excel 2000
(not free). Another disadvantage is that it is available only for Windows 98 and above (not for
Macintosh or other systems).
(iv) PAUP (http://paup.csit.fsu.edu/)
PAUP is widely used for inferring and interpreting evolutionary trees (Swofford, 2002). It
originally meant Phylogenetic Analysis Using Parsimony, but now has many other options. Although
not free, it is relatively inexpensive and available from Sinauer Associates, Sunderland, MA. A new
version, 4.0 beta, has been released as a provisional version. Macintosh, PowerMac, Windows and
Unix/OpenVMS versions are available; the Mac version has some extra features. The Windows
version runs as a GUI application, however, unlike the Macintosh version, most options are
command-line-driven. The advantage to running PAUP under Windows is that a scrollback display
buffer is built into the program, an editor is provided, and commands are remembered between
sessions (they can be recalled, edited, etc.). It is closely compatible with MacClade (another program
available from Sinauer), since they use a common data format (NEXUS).
(v) MEGA (http://www.megasoftware.net/)
MEGA (Molecular Evolutionary Genetics Analysis) software has been widely used since its
creation in 1993. It uses DNA sequence, protein sequence, evolutionary distance or phylogenetic tree
data. It is an integrated tool for conducting automatic and manual sequence alignment, inferring
phylogenetic trees, mining web-based databases, estimating rates of molecular evolution, and testing
evolutionary hypotheses (Kumar et al., 2008). Although it was designed for the Windows platform, it
runs well on Macintosh with a Windows emulator, Sun workstation (with SoftWindows95) or Linux
(with Windows by VMWare). Online, a thorough manual is available, together with a bulletin board
to interact with other users.
(vi) GENEPOP (http://genepop.curtin.edu.au/)
Genepop is a population genetics software package, which has options for the following
analysis: Hardy-Weinberg equilibrium, linkage disequilibrium, population differentiation, effective
number of migrants, $F_{ST}$ or other correlations (Raymond and Rousset, 1995). Genepop can be used
either as a DOS-version or a Web-version. The web-version is easy to use: after choosing an option
for the analysis, the data is typed or pasted into the text window provided and the results are obtained
either by email or by viewing the output via the Web.
(vii) POPGENE (http://www.ualberta.ca/~fyeh/popgene_download.html)
POPGENE is a user-friendly window-based computer package for the analysis of genetic
variation among and within natural populations using co-dominant and dominant markers and
quantitative traits (Yeh and Boyle, 1997). This package provides the Windows graphical user
interface that makes population genetics analysis more accessible for the casual computer user and
more convenient for the experienced computer user. The current version is designed specifically for
the analysis of co-dominant and dominant markers using haploid and diploid data. It performs most
types of data analysis encountered in population genetics and related fields. It can be used to compute
summary statistics (e.g., allele frequency, gene diversity, genetic distance, F-statistics, multilocus
structure, etc.) for (a) single-locus, single populations; (b) single-locus, multiple populations; (c)
multilocus, single populations and (d) multilocus, multiple populations. The latest version also
includes the module for quantitative traits.
(viii) GDA (http://hydrodictyon.eeb.uconn.edu/people/plewis/software.php)
GDA (Genetic Data Analysis) is a programme written by Lewis and Zaykin (1999). It
computes linkage and Hardy-Weinberg disequilibrium, some genetic distances, and provides method-
of-moments estimators for hierarchical F-statistics.
(ix) GenAlEx (http://www.anu.edu.au/BoZo/GenAlEx/)
GenAlEx ('Genetic Analysis in Excel') is a user-friendly cross-platform package for
population genetic analysis that runs within Microsoft Excel (Peakall and Smouse, 2006). GenAlEx
enables population genetic data analysis of codominant, haploid and binary genetic data providing
analysis tools applicable to plants, animals and microorganisms. It has tools for importing, editing
and manipulating raw genotype and sequence data from automated sequencing or genotyping
software. New 2D spatial autocorrelation procedures have been incorporated in addition to the
existing wide range of spatial analysis options. Pairwise relatedness among individuals can be
estimated. There are tools for genetic tagging applications, including location of matching genotypes
and calculation of probabilities of identity. Data export options to a host of other population genetic
software packages are also available.
(x) TFPGA (http://www.marksgeneticsoftware.net/tfpga.htm)
TFPGA (Tools for Population Genetic Analyses) is a Windows program for the analysis of allozyme
and molecular population genetic data (Miller, 1997). This program calculates descriptive statistics,
genetic distances, and F-statistics. It also performs tests for Hardy-Weinberg equilibrium, exact tests
for genetic differentiation, Mantel tests, and UPGMA cluster analyses. Additional features include the
ability to analyze hierarchical data sets as well as data from either codominant markers such as
allozymes or dominant markers such as AFLPs or RAPDs.
(xi) STRUCTURE (http://pritch.bsd.uchicago.edu/structure.html)
The program STRUCTURE is a free software package for using multi-locus genotype data to
investigate population structure. Its uses include inferring the presence of distinct populations,
assigning individuals to populations, studying hybrid zones, identifying migrants and admixed
individuals, and estimating population allele frequencies in situations where many individuals are
migrants or admixed. It can be applied to most of the commonly-used genetic markers, including
SNPs, microsatellites, RFLPs and AFLPs. The basic algorithm was described by Pritchard et al.
(2000).
Useful internet resources
The following is a list of Internet resources containing links to useful information pertaining
to genetic diversity analysis, population genetics and other software available:
(i) An alphabetical list of genetic analysis software from the North Shore LIJ Research
Institute (http://linkage.rockefeller.edu/soft/list1.html) contains a list of 520 programmes.
Computer software on genetic linkage analysis for human pedigree data, QTL analysis for
animal/plant breeding data, genetic marker ordering, genetic association analysis, haplotype
construction, pedigree drawing, and population genetics are included here.
(ii) Phylogeny Programs (http://evolution.genetics.washington.edu/phylip/software.html)
contains links to 365 phylogeny packages and 51 free web servers. Updates to these pages are
made monthly. Many of the programs in these pages are available on the web, and some of
the older ones are also available from ftp server machines. The programs listed below include
both free and non-free ones. The packages are sorted in various ways (e.g. by methods,
system used, analyzing particular kind of data, most recent etc.).
(iii) Maize Genetics site (http://www.maizegenetics.net/bioinformatics) from Cornell's Institute of
Genomic Diversity contains freely available software programmes to evaluate linkage
disequilibrium, nucleotide diversity, and trait associations.
(iv) The European Molecular Biology Laboratory-European Bioinformatics Institute (EBI) site
(http://www.ebi.ac.uk/) contains links to many useful programs and other sites.
(v) Mathematical Genetics and Bioinformatics Site, University of Chicago
(http://mathgen.stats.ox.ac.uk/software.html)
(vi) Statistical genetics and Bioinformatics Site, North Carolina State University
(http://statgen.ncsu.edu/brcwebsite/software_BRC.php) contains software for genetic data
analysis developed and made available by researchers at or affiliated with the Bioinformatics
Research Centers.
Conclusion
The analysis of genetic diversity within a species is imperative for gaining an insight into the
process of evolution of the species at the population level. Many statistical packages and computer
programmes are currently available for analyzing molecular data for assessment of genetic diversity.
Most programs perform similar tasks and many of them are freely downloadable from the internet.
The programmes, however, differ from each other in the type of marker they can handle, the manner
in which the raw data is formatted and also in how the users select the details of the computations to
be performed. Many of these programmes use a specific data-file format, but several of these
programmes offer the possibility to read or write data from, or to, other file formats. Many of these
programmes possess user-friendly and sophisticated graphical interfaces which help the users to
easily select the type of analyses to be performed and to set up computational parameters. Currently,
researchers are directing their efforts on development of newer programmes using more specialized
methodologies.

References
Glaubitz, J.C. (2004) convert: A user-friendly program to reformat diploid genotypic data for
commonly used population genetic software packages. Molecular Ecology Notes, 4: 309-310.
Liu, J. (2003) PowerMarker: New Genetic Data Analysis Software, Version 3.0. Free program
distributed by author over Internet.
Kumar, S., J. Dudley, M. Nei and K. Tamura (2008) MEGA: A biologist-centric software for
evolutionary analysis of DNA and protein sequences. Briefings in Bioinformatics, 9: 299-306.
Miller, M.P. (1997) Tools for population genetic analysis (TFPGA) 1.3: A Windows program for the
analysis of allozyme and molecular population genetic data. Distributed by the author.
Mohammadi, S.A. and B. M. Prasanna (2003) Analysis of Genetic Diversity in Crop Plants- Salient
Statistical Tools and Considerations. Crop Science, 43:1235-1248.
Nei, M. and R.K. Chesser (1983) Estimation of fixation indices and gene diversities. Annals of
Human Genetics, 47:253-259.
Nei, M. and W. Li (1979) Mathematical model for studying genetic variation in terms of restriction
endonucleases. Proceedings of National Academy of Sciences (USA), 76:5269-5273.
Peakall, R. and P. E. Smouse (2006) GENALEX 6: genetic analysis in Excel. Population genetic
software for teaching and research. Molecular Ecology Notes, 6: 288-295.
Pritchard, J.K., M. Stephens, and P. Donnelly (2000) Inference of population structure using
multilocus genotype data. Genetics, 155:945-959.
Raymond, M., and F. Rousset (1995) GENEPOP (version 1.2): Population genetics software for exact
tests and ecumenicism. Journal of Heredity, 86:248-249.
Schneider, S., D. Roessli and L. Excoffier (2000) ARLEQUIN, version 2.00-software for population
genetics data analysis. Genetics and Biometry Laboratory, University of Geneva,
Switzerland.
Swofford, D.L. (2002) PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods),
Version 4. Sinauer Associates, Sunderland, MA.

Weir, B.S. and C.C. Cockerham (1984) Estimating F-statistics for the analysis of population
structure. Evolution, 38:1358-1370.
Wright, S. (1951) The genetical structure of populations. Annals of Eugenics, 15: 323-354.
Yeh, F.C. and T.J.B. Boyle (1997) Population genetic analysis of co- dominant and dominant markers
and quantitative traits. Belgian Journal of Botany, 129:157.

















RAPD and ISSR Analysis
Ritto Paul, Sayuj K.P and K. Nirmal Babu
Indian Institute of Spices Research, Marikunnu P.O., Calicut- 673012.
Principle
Many different methods and technologies are available for the isolation of genomic DNA. In general,
all methods involve disruption and lysis of the starting material followed by the removal of proteins
and other contaminants and finally recovery of the DNA. Removal of proteins is typically achieved
by digestion with proteinase K, followed by salting-out, organic extraction, or binding of the DNA to
a solid-phase support (either anion-exchange or silica technology). DNA is usually recovered by
precipitation using ethanol or isopropanol. The choice of a method depends on many factors: the
required quantity and molecular weight of the DNA, the purity required for downstream applications,
and the time and expense. Several of the most commonly used methods are detailed below, although
many different methods and variations on these methods exist. However, they usually lack
standardization and therefore yields and quality are not always reproducible. Reproducibility is also
affected when the method is used by different researchers, or with different sample types. The
separation of DNA from cellular components can be divided into four stages:
1. Disruption
2. Lysis
3. Removal of proteins and contaminants
4. Recovery of DNA
Standardized Protocol for DNA isolation for Spices
- Lyophilize 200-300 mg of fresh leaf material.
- Grind 20 mg of lyophilized leaf material to a fine powder with quartz sand using a pestle and
mortar.
- Transfer the powdered material to 700 µl of pre-warmed extraction buffer and 700 µl of 2X
CTAB buffer and incubate for 60 min at 60 °C with occasional stirring.
- Extract with an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1).
- Centrifuge at 10,000 rpm for 15 min at room temperature (20 °C).
- Separate the aqueous phase and transfer to a fresh tube.
- Add 2 µl of RNase A (10 mg/ml) to a final concentration of 50 µg/ml and incubate for 30 min
at 37 °C.
- Extract with an equal volume of chloroform:isoamyl alcohol (24:1) and centrifuge at 10,000 rpm
for 10 min.
- To the aqueous phase add 0.6 volumes of ice-cold isopropanol and incubate at -20 °C for
30-60 min.
- Centrifuge at 10,000 rpm for 10 min at 4 °C. Wash the DNA pellet obtained with 70%
ethanol and 10 mM ammonium acetate.
- Dry the DNA pellet and dissolve in 100 µl of water or low-concentration TE buffer.
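
The protocol gives centrifugation speeds in rpm, which only translate into a force once the rotor radius is known. The sketch below uses the standard conversion RCF = 1.118 x 10^-5 x r (cm) x rpm^2. The 8 cm radius is a hypothetical value chosen for illustration, not one stated in this manual; take the real value from the rotor documentation.

```python
def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


# Hypothetical rotor radius of 8 cm (assumption, not from the protocol).
rcf = rpm_to_rcf(10_000, 8.0)
print(f"10,000 rpm at r = 8 cm is roughly {rcf:.0f} x g")
```

Reporting the force alongside the rpm value makes the protocol easier to reproduce on a different centrifuge.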


Quantification of DNA
i. Agarose gel electrophoresis
- Attach tape to the ends of the gel tray. Position the well-forming comb and ensure that the gel
tray is horizontal.
- Prepare a 0.8% agarose gel by adding 0.8 g agarose to 100 ml of 1X TAE and gently boil the
solution in a microwave oven with occasional mixing until all agarose particles are completely
dissolved. Allow it to cool to 60 °C and add ethidium bromide to 0.1-0.5 µg/ml. Pour the agarose
onto the gel tray and allow the gel to set.

- Remove the comb and tape. Place the gel in the electrophoresis tank and pour in 1X TAE until the
gel is fully immersed.
- Load the DNA sample with 6X loading dye into the wells. Load a standard marker in one well.
- Carry out the electrophoresis at 5-6 V/cm of gel until the dye is 4-5 cm from the wells.
- Visualize the DNA bands on a UV transilluminator or in a gel documentation system.

ii. DNA quantification by UV spectroscopy
- Take 5 µl of the DNA sample in a quartz cuvette. Make up the volume to 1 ml with distilled
water.
- Measure the absorbance of the solution at wavelengths 260 and 280 nm.
- Calculate the ratio A280/A260.
- A good DNA preparation gives a ratio of about 0.55 or lower (equivalent to an A260/A280 of about
1.8 or higher). The ratio is dimensionless.
- Calculate the DNA concentration using the relationship for double-stranded DNA, 1 O.D. at 260 nm
= 50 µg/ml, and multiply by the dilution factor (200 for 5 µl made up to 1 ml). This estimate is
influenced by contaminating substances such as RNA and very low molecular weight DNA in the solution.
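
The calculation in the last two steps can be sketched as follows. This is a minimal illustration that assumes the 5 µl in 1 ml dilution and the 50 µg/ml per O.D. factor given above; the absorbance readings are made-up values and the function names are hypothetical.

```python
def dna_concentration_ug_per_ml(a260, sample_ul=5.0, final_ul=1000.0, factor=50.0):
    """Stock DNA concentration (ug/ml) from A260 of the diluted sample."""
    dilution = final_ul / sample_ul  # 1000 / 5 = 200-fold
    return a260 * factor * dilution


def purity_ratios(a260, a280):
    """Return (A260/A280, A280/A260); both are dimensionless."""
    return a260 / a280, a280 / a260


# Made-up readings for illustration only.
a260, a280 = 0.10, 0.055
conc = dna_concentration_ug_per_ml(a260)
r_260_280, r_280_260 = purity_ratios(a260, a280)
print(f"Concentration: {conc:.0f} ug/ml")
print(f"A260/A280 = {r_260_280:.2f}, A280/A260 = {r_280_260:.2f}")
```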

Randomly Amplified Polymorphic DNA (RAPD) analysis
Randomly amplified polymorphic DNAs (RAPDs) are well suited to the high-throughput
systems required for plant genetic analysis because of their simplicity, speed, low cost, requirement
for only small quantities of genomic DNA and the relative abundance of the markers in the genome.
This is a PCR-based technique in which a single PCR primer ten nucleotides in length finds homologous
sequences in the genome by chance and amplifies several regions of the genome, provided that the
primer binding sites lie within a distance that can be amplified by Taq DNA polymerase and are in the
correct orientation. RAPDs are dominant markers, which cannot differentiate homozygotes from
heterozygotes.
The primer used in the RAPD reaction has an arbitrarily defined base sequence. In this marker
system the investigator has no idea to which gene or repeated sequence in the plant genome, if any,
the primer may have homology. Any band resolved after the RAPD reaction in an ethidium bromide
stained agarose gel or a silver stained polyacrylamide gel can be used as raw data for comparison
of plant genomes.
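
As an illustration of how band data can be compared, the sketch below assumes that each band position is scored as present (1) or absent (0) for every genotype and computes three common similarity coefficients for binary data (Jaccard, Sorensen-Dice and simple matching). The two profiles are made-up examples and the function name is hypothetical; dedicated packages would normally be used for real data sets.

```python
def band_similarity(a, b):
    """Jaccard, Dice and simple matching similarity for two 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same band positions")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    neither = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    mismatch = len(a) - both - neither
    if both + mismatch == 0:
        raise ValueError("no bands present in either profile")
    return {
        "jaccard": both / (both + mismatch),
        "dice": 2 * both / (2 * both + mismatch),
        "simple_matching": (both + neither) / len(a),
    }


# Made-up band profiles for two genotypes (8 band positions).
genotype_1 = [1, 1, 0, 1, 0, 1, 1, 0]
genotype_2 = [1, 0, 0, 1, 1, 1, 0, 0]
print(band_similarity(genotype_1, genotype_2))
```

Jaccard and Dice ignore shared absences, which is often preferred for dominant markers because a shared missing band is weak evidence of similarity.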

Inter Simple Sequence Repeat (ISSR) analysis
The Inter-Simple Sequence Repeat marker (ISSR, anchored microsatellite) uses simple
sequence repeats anchored at the 5' or 3' end by a short arbitrary sequence as PCR primers
(Zietkiewicz et al., 1994). This generates multilocus markers. It is a simple and quick method that
combines most of the advantages of microsatellites (SSRs) and amplified fragment length
polymorphism (AFLP) with the universality of random amplified polymorphic DNA (RAPD).
ISSR markers are highly polymorphic and are useful in studies on genetic diversity,
phylogeny, gene tagging, genome mapping and evolutionary biology. ISSRs are ideal markers for
genetic mapping and population studies because of their abundance and the high degree of
polymorphism between individuals within a population of closely related genotypes.

Optimization of reaction conditions should precede the actual RAPD and ISSR analysis
to get repeatable results. The following optimizations are essential:
- Template DNA concentration.
- Taq DNA polymerase concentration.
- Mg2+ ion concentration.
- Primer concentration.
- Primer annealing temperature.
- Primers suitable for detection of polymorphic loci in the taxa to be analysed.

Material and reagents:
Amplification was performed in a total volume of 25 µl including 2.5 µl of 10X Taq DNA
polymerase buffer, 50 ng of total cellular DNA, 150 µM dNTP mix, 1.5 mM MgCl2, 0.4 µM primer
and 1 U Taq DNA polymerase. PCR amplification was performed in a thermocycler as follows: an
initial step of 94 °C for 5 min, followed by 35 cycles, each one including 45 s at 94 °C for
denaturation, 45 s at 37 °C (RAPD analysis) or 50 °C (ISSR analysis) for annealing and 2 min
at 72 °C for elongation. A 10 min step at 72 °C is programmed as a final extension.
Amplification products were separated by electrophoresis on 1.5% agarose gel in 1X
TAE buffer (pH 8.0) and stained with ethidium bromide as described by Sambrook et al. (1989).
Amplified products were scored as present (1) or absent (0) to form a binary matrix.
Data were analysed based on the Jaccard, Sorensen-Dice and simple matching similarity
coefficients for binary data via SIMQUAL of the Numerical Taxonomy and Multivariate Analysis
System program package for PC (NTSYS-pc ver. 2.021i) (Rohlf, 1993). UPGMA
dendrograms can be constructed based on the analysis of the data.

Figure showing the ISSR profiling of black pepper

References
Rohlf, F.J. (1999). NTSYS-pc Numerical Taxonomy and Multivariate Analysis System. Version
2.02i. Exeter Software, Setauket, New York.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual, Vol.
I. 2nd edition. Cold Spring Harbor Laboratory Press.
Zietkiewicz, E., Rafalski, A. and Labuda, D. (1994). Genome fingerprinting by simple sequence
repeats (SSR)-anchored polymerase chain reaction amplification, Genomics, Vol. 20, pp.
176-83.

Microsatellite (simple sequence Repeats) Profiling
Anucyriac, Anupama K. Rittopaul, Rahul P.R
Indian Institute of Spices Research, Marikunnu P.O., Calicut- 673012.
Simple sequence repeats (SSRs), also called microsatellites, are stretches of DNA consisting of
tandemly repeating mono-, di-, tri-, tetra- or penta-nucleotide units that are distributed throughout
the genomes of all prokaryotes and eukaryotes analysed to date (Powell et al., 1996; Zane et al.,
2001). SSR loci harbour considerable length variation and are extremely abundant. Such
polymorphism most likely originates from slippage events during DNA replication
(Schlotterer & Tautz 1992). SSRs are individually amplified by polymerase chain reaction from total
genomic DNA, using a pair of oligonucleotide primers that are specific to the DNA flanking the SSR
sequence and hence define the microsatellite locus. Amplification products obtained from different
individuals can be resolved on gels to reveal polymorphism. The amplified products usually exhibit
high levels of length polymorphism, which results from variation between alleles in the number of
tandemly repeating units at the locus (Tautz, 1989; Weber and May, 1989). Microsatellites have
proven to be an extremely valuable tool for genome mapping in many organisms (Schuler et al. 1996;
Knapik et al. 1998), but their applications span different areas ranging from ancient and forensic
DNA studies to population genetics and conservation/management of biological resources (Jarne &
Lagoda 1996). The advantages of microsatellites are that they are relatively abundant with uniform
genome coverage, highly variable, codominant, robust and reproducible, easy to detect by PCR,
represent sequence tagged sites and require only a small amount of starting DNA. Their high
information content, which is directly related to the effective number of alleles at each locus, and
the ease of automating the PCR assays for identifying simple sequence repeat polymorphisms make
SSRs ideal genetic markers. However, generating SSR markers is considerably more difficult than
generating other markers, because cloning and sequence information are necessary.
The traditional method for isolation of SSRs involves creation of a small-insert genomic
library, screening of the library for the presence of microsatellites, sequencing of the positive
clones, primer design, locus-specific analysis and identification of polymorphisms (Rafalski et al.,
1996). A further class of isolation methods is based on selective hybridization, which appears to be
extremely popular for isolation of microsatellites (Zane et al. 2001). The basic protocol involves
restriction digestion of the DNA into small fragments, selective hybridization using biotinylated
oligonucleotides, capture of the microsatellite-containing regions using magnetic beads, cloning of
the DNA fragments, sequencing of positive clones and primer designing (Armour et al., 1994; Kijas et
al., 1994; Glenn et al. 2005).

ATTTGTATTT TACAACACCT CACATGCTCA GTTATTTGGT TCATATGCAA
Forward Primer
GTCTCGGTTT TGGTCTCTGC TCAGAAAAAG AGAGAGAGAG AGAGAGAGAG
Reverse Primer

AGAGAGAGAG AGAGAGAGAA GAAATTTGCA GTTAATTGTC AAGTAGAAGT
Fig. 2. Soybean library-derived microsatellite (AG)20
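
The repeat in Fig. 2 can be located computationally. The sketch below is a minimal illustration that searches the figure's sequence for the longest uninterrupted run of a given motif; the function name and the choice to search for one motif at a time are assumptions made for this example, and dedicated SSR-finding tools handle arbitrary motifs and imperfect repeats.

```python
import re

# Sequence copied from Fig. 2 (spaces removed by the code below).
FIG2_SEQUENCE = """
ATTTGTATTT TACAACACCT CACATGCTCA GTTATTTGGT TCATATGCAA
GTCTCGGTTT TGGTCTCTGC TCAGAAAAAG AGAGAGAGAG AGAGAGAGAG
AGAGAGAGAG AGAGAGAGAA GAAATTTGCA GTTAATTGTC AAGTAGAAGT
"""


def longest_repeat_run(sequence, motif):
    """Return (start_index, repeat_count) of the longest tandem run of motif."""
    seq = "".join(sequence.split()).upper()
    best_start, best_count = -1, 0
    for match in re.finditer(f"(?:{re.escape(motif)})+", seq):
        count = len(match.group(0)) // len(motif)
        if count > best_count:
            best_start, best_count = match.start(), count
    return best_start, best_count


start, count = longest_repeat_run(FIG2_SEQUENCE, "AG")
print(f"Longest (AG)n run: n = {count}, starting at position {start + 1}")
```

Primers are then designed in the unique sequence flanking the run, so that the amplicon length varies with the number of repeat units.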


Because of their sensitivity to minor genetic differences, PCR-based markers such as AFLPs and
microsatellites are likely to remain key molecular tools for some time to come.

Protocol for developing microsatellite profiles
Microsatellites can be amplified by PCR with specifically designed primers, if available for
the crop in question, and can be resolved on either acrylamide or high-quality agarose gels. Both
radioactive and non-radioactive methods can be used.
A simple method using non-radioactive PCR and polyacrylamide gel electrophoresis is given below.

PCR Amplification: Select the DNA of the population that is to be studied. Prepare a PCR for each
of the genotypes as follows:

Component                              Per reaction    x 10
10x PCR buffer (without MgCl2)         2.5 µl          25 µl
MgCl2 (25 mM)                          2 µl            20 µl
dNTPs (2.5 mM each)                    1 µl            10 µl
Forward primer (5 µM)                  2.5 µl          25 µl
Reverse primer (5 µM)                  2.5 µl          25 µl
Sterile H2O                            13.4 µl         134 µl
Taq polymerase, 0.5 unit (5 U/µl)      0.1 µl          1 µl

Mix thoroughly, distribute 24 µl into each PCR tube and add
DNA: 15-25 ng, 1 µl (20 ng/µl) to each tube. Total reaction volume: 25 µl.
Follow a standard PCR programme:

Initial denaturation     94 °C for 2 min         1 cycle
Denaturation             94 °C for 30 seconds    35 cycles
Annealing                50-60 °C for 30 seconds (denaturation, annealing and
Primer extension         72 °C for 1 min          primer extension together)
Final extension          72 °C for 5 min         1 cycle
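
Scaling the per-reaction volumes to a master mix is simple multiplication, and a short script avoids arithmetic slips when the number of genotypes changes. The sketch below uses the per-reaction volumes from the table above; the function name and the optional overage parameter (extra mix to allow for pipetting loss) are assumptions added for illustration.

```python
# Per-reaction volumes in microlitres, taken from the table in this section.
PER_REACTION_UL = {
    "10x PCR buffer (without MgCl2)": 2.5,
    "MgCl2 (25 mM)": 2.0,
    "dNTPs (2.5 mM each)": 1.0,
    "Forward primer (5 uM)": 2.5,
    "Reverse primer (5 uM)": 2.5,
    "Sterile H2O": 13.4,
    "Taq polymerase (5 U/ul)": 0.1,
}


def master_mix(n_reactions, overage=0.0):
    """Scale each component; overage=0.1 prepares 10% extra (assumed option)."""
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


mix = master_mix(10)
for name, vol in mix.items():
    print(f"{name:32s} {vol:7.2f} ul")
print(f"{'Total':32s} {sum(mix.values()):7.2f} ul (24 ul per tube + 1 ul DNA)")
```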

Electrophoresis
Resolve the amplification product in 3% Metaphor Agarose gel in 1x TBE or using 6-8%
polyacrylamide gel in 1X TBE. Use the profile for analysis.
SSR Advantages:
- Co-dominant (more informative when dealing with heterozygotes)
- Highly variable (important for species with narrow gene pools)
- Widely used
- Excellent for use in marker assisted selection, fingerprinting and marker assisted
backcrossing
Neutral Polyacrylamide Gel Electrophoresis
These gels are used for the separation and purification of fragments of double-stranded
DNA. The fragments migrate through non-denaturing polyacrylamide gels at rates that are inversely
proportional to the log10 of their size. The mobility is also affected by their base composition and
sequence, so that duplex DNAs of exactly the same size can differ in mobility by up to 10%. Monomers
of acrylamide are polymerized into long chains in a reaction initiated by free radicals. In the presence
of N,N'-methylenebisacrylamide, these chains become cross-linked to form a gel. The porosity of
the resulting gel is determined by the length of the chains and the degree of cross-linking that occurs
during the polymerization reaction.
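
Because migration is related to log10 of fragment size, the size of an unknown band is usually estimated from a standard curve fitted to a size marker. The sketch below fits a straight line of log10(size) against migration distance by least squares. The ladder distances are made-up numbers, the function names are hypothetical, and the linear relationship is only an approximation over a limited size range.

```python
import math


def fit_standard_curve(sizes_bp, distances_mm):
    """Least-squares fit of log10(size) = slope * distance + intercept."""
    ys = [math.log10(s) for s in sizes_bp]
    n = len(ys)
    mean_x = sum(distances_mm) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size_bp(distance_mm, slope, intercept):
    """Fragment size predicted by the fitted standard curve."""
    return 10 ** (slope * distance_mm + intercept)


# Made-up 100 bp ladder migration distances (mm), for illustration only.
ladder_bp = [500, 400, 300, 200, 100]
ladder_mm = [21.0, 26.5, 33.0, 43.0, 60.0]
slope, intercept = fit_standard_curve(ladder_bp, ladder_mm)
print(f"Band at 38 mm is about {estimate_size_bp(38.0, slope, intercept):.0f} bp")
```

Since base composition can shift mobility by up to 10%, sizes estimated this way are approximate and allele calls are best made relative to reference alleles run on the same gel.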
Materials
1. TBE 10X (500 ml)
Trizma base - 54 g
Boric acid - 27.5 g
0.5 M EDTA, pH 8.0 - 20.0 ml
2. 40% Acrylamide/bisacrylamide (29:1) solution
Acrylamide - 38.62 g
Bisacrylamide - 1.38 g
Add water to obtain a final volume of 100 ml; store at 4 °C.
3. 10% Ammonium persulfate (APS)
Dissolve 0.1 g of APS in 1 ml distilled water. Store at 4 °C.
4. KOH/Methanol solution (10% w/v)

This solution is for cleaning the glass plates used to cast sequencing gels. It is prepared by
dissolving 5 g of KOH pellets in 100 ml of methanol. Store the solution at room temperature in a
tightly capped glass bottle.
5. TEMED
Electrophoresis-grade TEMED is available from commercial suppliers.
6. Ethidium bromide (10 mg/ml), 1% stock
Add 1 g of ethidium bromide to 100 ml of water. Stir on a magnetic stirrer for several hours to
ensure that the dye has dissolved. Wrap the container in aluminum foil or transfer the solution to a
dark bottle and store at room temperature.
7. Loading dye (6X)
Sucrose (40%) or glycerol (30%) = 4 g or 3 g
Bromophenol blue (0.25%) = 0.025 g
Xylene cyanol (0.25%) = 0.025 g
Make up to 10 ml with distilled water.

Methods
Assembling the apparatus and preparing the Gel solution
1. If necessary, clean the glass plates and spacers with KOH/methanol.
2. Wash the glass plates and spacers in warm detergent solution and rinse them well, first in tap
water and then in deionized water. Hold the plates by the edges or wear gloves, so that oils
from the hands are not deposited on the working surface of the plates. Rinse the plates
with ethanol and set them aside to dry.
3. Assemble the glass plates with spacers:
a. Lay the larger (or unnotched) plate flat on the bench and arrange the spacers at each side
parallel to the two edges.
b. Lay the inner (notched) plate in position, resting on the spacer bars.
c. Clamp the plates together with binder or bulldog paper clips and bind the entire length of
the two sides and the bottom of the plates with gel-sealing tape to make a watertight seal.
4. Taking into account the size of the glass plates and the thickness of the spacers, calculate the
volume of gel required. Prepare the gel solution with the desired polyacrylamide percentage.
Add the following to a beaker to prepare 60 ml of 8% polyacrylamide gel and swirl to
mix.
40% Acrylamide solution - 12.0 ml
10X TBE - 6.0 ml
10% APS - 300.0 µl
TEMED - 125.0 µl
Distilled water - 41.58 ml
5. Pour the gel solution between the assembled plates, avoiding air bubbles and filling almost to
the top.
6. Once the solution is filled, insert the comb at the top of the gel without creating air bubbles.
Allow 60 min for the gel to polymerize.
7. After polymerization is complete, surround the comb and the top of the gel with paper towels
that have been soaked in 1X TBE. Then seal the entire gel in Saran Wrap and store it at 4 °C
until needed (it may be stored for 1-2 days in this state before use).
8. When ready to proceed with electrophoresis, carefully pull the comb from the polymerized
gel and remove the gel-sealing tape from the bottom of the gel.
9. Remove any excess polyacrylamide from around the comb and top of the glass plates with a
razor blade. Clean the plates with paper towels.
10. Add 1X TBE to the bottom chamber as per the capacity of the chamber.
11. Fit the gel assembly into the apparatus and fill the upper tank with the required quantity of 1X
TBE buffer. Remove any air bubbles from the top of the gel.

12. Add tracking dye: 2 µl of 6X dye for 10 µl of PCR product, and mix well.
13. Flush the wells with 1X TBE buffer. Load the samples into the wells as per the type of comb
used. Generally 1-2 µl of the sample is loaded. If the concentration of PCR product is high,
a smaller quantity should be loaded for better resolution of DNA fragments.
14. Each gel must be loaded with DNA size markers (50 bp or 100 bp as per requirement).
15. Start electrophoresis at a constant voltage of 80 V for 5 h (low voltage with longer duration
helps in the finer separation of closely sized markers).
16. Run the gel until the marker dyes have migrated the desired distance. Turn off the power,
disconnect the leads, and discard the electrophoresis buffer from the reservoirs.
17. Detach the glass plates. Use a spacer or plastic wedge to lift a corner of the upper glass plate.
Check that the gel remains attached to the lower plate. Pull the upper plate smoothly away.
Remove the spacers.
18. Take out the gel and stain it for 5 minutes in a tray containing 20 µl of ethidium bromide
(1.0% stock solution) in 1 litre of distilled water.
19. Shake the tray constantly on a horizontal shaker to keep the solution uniform.
20. Take out the gel and destain it in double-distilled water for 20 minutes.
21. After destaining, analyse the gel using a Gel Doc imaging system.
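
The 60 ml recipe in step 4 follows from C1V1 = C2V2 and can be recomputed for other gel volumes or percentages. The sketch below is a minimal illustration that assumes APS and TEMED are scaled in proportion to the gel volume (300 µl and 125 µl per 60 ml, as in step 4); that proportional scaling and the function name are assumptions, not instructions from the manual.

```python
def gel_recipe(total_ml, gel_percent, stock_percent=40.0, tbe_stock_x=10.0,
               aps_ul_per_ml=5.0, temed_ul_per_ml=125.0 / 60.0):
    """Volumes for a non-denaturing polyacrylamide gel in 1X TBE."""
    acrylamide_ml = total_ml * gel_percent / stock_percent  # C1V1 = C2V2
    tbe_ml = total_ml / tbe_stock_x                         # 10X -> 1X
    aps_ul = aps_ul_per_ml * total_ml
    temed_ul = temed_ul_per_ml * total_ml
    water_ml = total_ml - acrylamide_ml - tbe_ml - (aps_ul + temed_ul) / 1000.0
    return {
        "acrylamide_stock_ml": acrylamide_ml,
        "tbe_10x_ml": tbe_ml,
        "aps_10pct_ul": aps_ul,
        "temed_ul": temed_ul,
        "water_ml": water_ml,
    }


for name, value in gel_recipe(60.0, 8.0).items():
    print(f"{name:20s} {value:8.2f}")
```

For 60 ml of 8% gel this reproduces the volumes listed in step 4 (12 ml acrylamide stock, 6 ml 10X TBE and about 41.58 ml water).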

References
Armour, J.A.L., Neumann, R., Gobert, S., Jeffreys, A.J., 1994. Isolation of human simple repeat
loci by hybridization selection. Human Mol. Gen. 3, 599-605.
Creste, S., Tulmann, N.A., Figueira, A., 2001. Detection of Single Sequence repeat polymorphisms in
denaturing Polyacrylamide Sequencing Gels by Silver Staining. Plant Molecular Biology
Reporter 19, 299-306.
Glenn, T.C., Schable, N.A., 2005. Isolating microsatellite DNA loci. In: Methods in Enzymology
395, Molecular Evolution: Producing the Biochemical Data, Part B (eds Zimmer EA,
Roalson EH). Academic Press, San Diego.
Jarne, P., Lagoda, P.J.L., 1996. Microsatellites, from molecules to populations and back. Trends in
Ecology and Evolution 11, 424-429.
Jones CJ, Edwards KJ, Castaglione S, Winfield MO, Sale F, Van de Wiel C, Bredemeijer G, Buiatti
M, Maestri E, Malcevshi A, Marmiroli N, Aert R, Volckaert G, Rueda J, Linacero R,
Vazquez A and Karp A 1997. Reproducibility testing of RAPD, AFLP and SSR markers in
plants by a network of European laboratories. Mol Breed 3: 381-390.
Karp, A., Seberg, O. and Buiatti, M. (1996) Molecular techniques in the assessment of botanical
diversity, Ann. Bot. 78, 143-149.
Kijas, J.M., Fowler, J.C., Garbett, C.A., Thomas, M.R., 1994. Enrichment of microsatellites from the
citrus genome using biotinylated oligonucleotide sequences bound to streptavidin-coated
magnetic particles. Biotechniques 16, 656-662.
Lu, Z.X., 1998. Construction of a genetic linkage map and identification of AFLP markers for
resistance to root-knot nematodes in peach rootstocks, Genome 41, 199-207.
Lynch, M. and Walsh, B. 1998. Genetic Analysis of Quantitative Traits, Sinauer.
Patterson, A.H., Tanksley, S.D., Sorrels, M.E., 1991. DNA markers in plant improvement. Advances
in Agronomy. Vol. 46. Academic Press. pp 40-90.
Powell W, Gordon CM and Provan J. 1996. Polymorphism revealed by simple sequence repeats.
Elsevier Publishers 1 (7): 215.
Powell, W., 1996. The comparison of RFLP, RAPD, AFLP and SSR (microsatellite) markers for
germplasm analysis, Mol. Breed. 2, 225-238.
Rafalski, J.A., 1996. Generating and using DNA markers in plants. In: Analysis of Non-mammalian
Genomes: A Practical Guide (Birren E and Lai E eds.) Academic Press.

Rosendahl, S. and Taylor, J.W. (1997) Development of multiple genetic markers for studies of genetic
variation in arbuscular mycorrhizal fungi using AFLP, Mol. Ecol. 6, 821-829.
Sambrook, J. and Russel, D.W. 2001. Molecular Cloning: A Laboratory Manual, third ed. CSH
Laboratory Press, Cold Spring Harbor, New York.
Semblat, J.P., 1998. High-resolution DNA fingerprinting of parthenogenetic root-knot nematodes
using AFLP analysis, Mol. Ecol. 7, 119-125.
Schlotterer C, Tautz D 1992. Slippage synthesis of simple sequence DNA. Nucleic Acids Research
20, 211-215.
Swapna, M., Sivaraju, K., Sharma, R.K., Singh, N.K., Mohapatra, T., 2010. Single-Strand Conformational
Polymorphism of EST-SSRs: A potential Tool for Diversity Analysis and varietal
Identification in Sugarcane.
Tautz, D., 1989. Hypervariability of simple sequences as a general source for polymorphic DNA
markers. Nucleic Acids Res 17(16): 6463-6471.
Vos P, Hogers R, Bleeker M, Reijans M, Van der Lee T, Hornes M, Frijters A, Pot J, Peleman J,
Kuiper M and Zabeau M 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids
Res 23: 4407-4414.



































Multilocus Sequence Typing of bacteria
A. Kumar
IARI, New Delhi

Background
Ralstonia solanacearum Yabuuchi (Smith), previously known as Pseudomonas solanacearum Smith,
is the causative agent of bacterial wilt of crop plants and is regarded as one of the important bacterial
plant pathogens owing to its aggressiveness, broad host range, wide geographical distribution and
long persistence in soil and water associated environments. The pathogen is known to infect over 450
plant species, including many economically important crops such as vegetables, spices, herbaceous
plants, shrubs, and trees (Wicker et al, 2007). The genetic diversity of Ralstonia solanacearum is
analysed by several phenotypic and genotypic methods. The phenotypic assay includes biovar
characterization, which is based on the substrate utilization ability of the bacterium, and host range
based race determination (Hayward 1964, Buddenhagen 1962). Five biovars and as many races have been
found among the global collection of Ralstonia solanacearum (Table 1 & 2).
Though popular, these techniques are not enough for deciphering the population biology of the strains
across geographical locations. Besides, these two independent systems of classification are not in
agreement with each other. Perhaps these anomalies prompted researchers to devise finer tools to
reveal the actual diversity that exists in populations of Ralstonia solanacearum. Fegan and Prior (2005)
devised phylotyping, which classifies Ralstonia solanacearum into four phylotypes that reflect the
geographical origin of the strains: phylotypes I and II are composed of Asian and American strains,
respectively, whereas phylotype III members are African, and phylotype IV isolates are from
Indonesia, Japan and Australia.

Molecular tools for genetic diversity analysis: Many techniques based on the electrophoretic
mobility of genomic fragments are in use for analysing the population structure of R.
solanacearum isolates worldwide. Genotypic tools based on comparing the electrophoretic patterns of
fragments generated by PCR or restriction digestion (to name a few: ISSR, RAPD, Rep-PCR) among
strains were the most popular choice in the late 1990s. These techniques, which exploit randomly
amplified fragments, turned out to be non-portable tools because of their inherently poor
reproducibility. Sequence-based discrimination of strains, such as Multilocus Sequence Typing (MLST)
and Comparative Genome Hybridization (CGH), which uncovers allelic variants in conserved
housekeeping and virulence genes, is portable across laboratories. Sequence data can be compared
readily between laboratories, so a typing method based on the sequences of gene fragments
from a number of different housekeeping loci is portable. The multilocus sequence typing approach
uses sequences of internal gene fragments and assigns a different allele number to each distinct
sequence at each locus, so that each isolate receives an allelic profile called its Sequence Type
(ST). Using this approach, Castillo and Greenberg (2007) analysed the evolutionary forces operating
on R. solanacearum populations by MLST of five housekeeping and three virulence-related genes.
R. solanacearum was found to be a diverse pathogen, showing high levels of nucleotide polymorphism
and a number of unique alleles in the chromosome and in the megaplasmid. So far about 27 to 33 STs
have been identified for the eight genes by MLST-based analysis.

Methodology
Isolation of the R. solanacearum isolates
Bacterial wilt affected plant samples were collected from the field and processed for isolation of
the bacterium. Thoroughly washed stem cuttings of wilted plants were allowed to ooze in a clean glass
of water for a few minutes, and the ooze was plated onto CPG agar amended with 2,3,5-triphenyl
tetrazolium chloride and incubated at 28 °C for two to three days. The typical colonies of R.
solanacearum, indicated by their fluidal appearance with a spiral pink centre, were purified by
repeated streaking on fresh plates.
Preparation of bacterial cells for DNA isolation
A single colony of R. solanacearum was inoculated into broth and incubated for about 24-36 h for
isolation of total genomic DNA.

Isolation of genomic DNA from R. solanacearum
DNA isolation
1. The density of the bacterial suspension is adjusted to an OD of 1.0 at 600 nm.
2. Spin down at 14000 rpm at room temperature for 2 min.
3. The supernatant is discarded and the pellet is washed three times with sterile distilled water.
4. To the pellet, 550 µl of TE buffer + lysozyme is added, mixed well and incubated for 30 min at 37 °C.
5. After incubation, 76 µl of 10% SDS + Proteinase K is added.
6. The contents are mixed by flipping the tube and incubated for 15 min at 65 °C.
7. After incubation, 100 µl of 5 M NaCl is added and the contents are mixed by flipping the tube.
8. Then 80 µl of CTAB/NaCl is added, mixed and incubated for 10 min at 65 °C.
9. After incubation, 660 µl of chloroform + isoamyl alcohol is added.
10. The contents are mixed by flipping the tube for about 30 s.
11. Then centrifuge for 5 min at 14000 rpm at room temperature.
12. After centrifugation the aqueous fraction is carefully transferred to a new 1.5 ml tube without
touching the white middle layer (interface). This step is repeated twice.
13. An equal volume of isopropanol is added and the tube is inverted to mix.
14. Then centrifuge for 15 min at 14000 rpm at room temperature.
15. The supernatant is gently drained and the pellet is mixed with 0.5 ml of ice-cold 70% ethanol.
16. Centrifuge for 15 min at 14000 rpm at room temperature.

17. After this the supernatant is carefully removed and the remaining ethanol is evaporated in the
laminar flow for about one hour.
18. 25 µl of 10:1 TE is added to each tube to dissolve the DNA and the tubes are kept at 4 °C
overnight.
19. The DNA from the two tubes is pooled in one tube, giving a total of 50 µl, and RNase is added
at a concentration of 200 µg/ml to remove the contaminating RNA.
20. The tubes are incubated for 30 min at 30 °C. The DNA is then stored at -20 °C.

Quality analysis and quantitation of genomic DNA
1. 5 µl of stock DNA is diluted 10 times by adding 45 µl of MQ water.
2. The quantity of DNA is measured using a Biophotometer.
3. Quality is assessed by gel electrophoresis.
4. DNA concentration is adjusted to 200 ng per µl of water.
5. Proceed with PCR amplification of genes.
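
Steps 1 and 4 involve two small calculations: back-calculating the stock concentration from the 10-fold diluted reading, and working out how much water brings the stock to 200 ng/µl. The sketch below shows both; the reading of 80 ng/µl and the 45 µl of remaining stock are made-up example values, and the function names are hypothetical.

```python
def stock_concentration(reading_ng_per_ul, dilution_factor=10.0):
    """Concentration of the undiluted stock from a reading on the dilution."""
    return reading_ng_per_ul * dilution_factor


def water_to_add_ul(stock_ng_per_ul, stock_volume_ul, target_ng_per_ul=200.0):
    """Water needed to dilute the stock to the target (C1V1 = C2V2)."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    final_volume_ul = stock_ng_per_ul * stock_volume_ul / target_ng_per_ul
    return final_volume_ul - stock_volume_ul


# Made-up example: the 10-fold dilution reads 80 ng/ul; 45 ul of stock remain.
stock = stock_concentration(80.0)
print(f"Stock: {stock:.0f} ng/ul")
print(f"Add {water_to_add_ul(stock, 45.0):.0f} ul water to reach 200 ng/ul")
```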

Multilocus Sequence Typing
The steps involved in sequence typing are given in Fig. 1. Briefly, the selected genes are
amplified, purified and sequenced. The sequence reads are assembled and compared with the database
to assign alleles. The combination of allele numbers (the allelic profile) characterizes each strain
of the bacterium in question. The allele numbers are further compared among strains in order to
trace strain migration, as is done in molecular epidemiology.
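
The allele and sequence type assignment described above can be sketched in a few lines. This is a toy illustration only: the eight locus names come from this chapter, but the short sequences, the in-memory dictionaries and the numbering scheme are assumptions. In practice allele and ST numbers are assigned by a curated MLST database rather than locally.

```python
LOCI = ("ppsA", "gyrB", "adk", "gdhA", "gapA", "hrpB", "fliC", "egl")


def allele_number(locus, sequence, allele_db):
    """Look up the allele number for a sequence, registering it if new."""
    known = allele_db.setdefault(locus, {})
    if sequence not in known:
        known[sequence] = len(known) + 1
    return known[sequence]


def sequence_type(isolate, allele_db, st_db):
    """Return (allelic profile, ST number) for one isolate."""
    profile = tuple(allele_number(l, isolate[l], allele_db) for l in LOCI)
    if profile not in st_db:
        st_db[profile] = len(st_db) + 1
    return profile, st_db[profile]


# Toy isolates with made-up 4-base "gene fragments".
isolate_a = {locus: "ACGT" for locus in LOCI}
isolate_b = dict(isolate_a, egl="ACGA")   # differs at one locus
isolate_c = dict(isolate_a)               # identical to isolate_a

alleles, sts = {}, {}
for name, iso in [("A", isolate_a), ("B", isolate_b), ("C", isolate_c)]:
    profile, st = sequence_type(iso, alleles, sts)
    print(f"Isolate {name}: profile {profile} -> ST{st}")
```

Isolates with identical sequences at every locus share an ST, and a single base change at any one locus yields a new allele number and therefore a new ST.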

Choice of loci: For the diversity analysis, five housekeeping genes that reside in the
chromosome (ppsA, phosphoenol pyruvate synthase; gyrB, DNA gyrase, subunit B; adk, adenylate
kinase; gdhA, glutamate dehydrogenase oxidoreductase; and gapA, glyceraldehyde 3-phosphate
dehydrogenase oxidoreductase) and three plasmid-borne virulence-related genes (hrpB, regulatory
transcription regulator; fliC, encoding flagellin protein; and egl, endoglucanase precursor) are
considered. The details of the genes, their proteins and the conserved lengths are given in Table 3.






















Fig.1.MLSTworkflow

PCR Amplification: For PCR amplification, the reaction mixture (50 µl) contained 50-100 ng of
template genomic DNA, 1× PCR buffer, 3 mM MgCl2, 6% DMSO, 50 µM of each dNTP, 10 pmol of
each primer (Table 1), and 1 U of Taq DNA polymerase. DNA was amplified using an initial
denaturation at 96 °C for 9 min, followed by 35 cycles of 95 °C for 30 s, annealing at the
appropriate temperature for 1 min and extension at 72 °C for 2 min. Reactions were completed with a
final extension step of 10 min at 72 °C. All PCR products were electrophoresed through a 1.0%
agarose gel and visualized under UV light after ethidium bromide staining.
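
The component volumes for one 50 µl reaction can be worked out as in the sketch below. The final concentrations are those stated above; the stock concentrations (10× buffer, 25 mM MgCl2, 100% DMSO, 10 mM of each dNTP, primers at 10 pmol/µl, Taq at 5 U/µl, template at 200 ng/µl) are assumptions made for illustration and must be replaced with the values of the reagents actually in use.

```python
REACTION_VOLUME_UL = 50.0

# name -> (final concentration, ASSUMED stock concentration), both in the same unit
CONCENTRATION_COMPONENTS = {
    "PCR buffer (x)": (1.0, 10.0),
    "MgCl2 (mM)": (3.0, 25.0),
    "DMSO (%)": (6.0, 100.0),
    "dNTPs, each (uM)": (50.0, 10000.0),
}


def reaction_volumes(template_ng: float = 100.0,
                     template_ng_per_ul: float = 200.0) -> dict:
    """Return the volume (µl) of every component for one 50 µl reaction."""
    volumes = {
        name: final / stock * REACTION_VOLUME_UL
        for name, (final, stock) in CONCENTRATION_COMPONENTS.items()
    }
    volumes["forward primer"] = 10.0 / 10.0   # 10 pmol from an assumed 10 pmol/µl stock
    volumes["reverse primer"] = 10.0 / 10.0
    volumes["Taq polymerase"] = 1.0 / 5.0     # 1 U from an assumed 5 U/µl stock
    volumes["template DNA"] = template_ng / template_ng_per_ul
    water = REACTION_VOLUME_UL - sum(volumes.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    volumes["water"] = water
    return volumes


for name, volume in reaction_volumes().items():
    print(f"{name:20s} {volume:6.2f} µl")
```

Water is computed last so that the volumes always add up to the reaction volume, whatever stocks are entered.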

Elution, Purification and Sequencing: The amplicon was eluted and purified using a gel elution
kit according to the instructions given. The eluted product was sequenced in both directions and the
sequences were assembled using DNA Baser software. Sequencing is carried out on each DNA strand
with BigDye Terminator Ready Reaction Mix under the following conditions: an initial denaturation
at 96 °C for 9 min, followed by 35 cycles of 95 °C for 30 s, annealing at the appropriate temperature
for 1 min and extension at 72 °C for 2 min. Reactions were completed with a final extension step of
10 min at 72 °C. Unincorporated dye terminators were removed by precipitation with 95% alcohol.

Sequence analysis: Sequences are carefully analysed and a sequence type is assigned to each
strain by comparing the data sets with www.pamdb.org. The relationship of a strain to the existing
collection of strains can be determined with the eBURST programme (http://eburst.mlst.net).

Handling sequence data: The sequencing machine gives a chromatogram indicating
the quality of the sequence reads (Fig. 2). The sequence reads are carefully checked for base-calling
errors using any one of the chromatogram viewers (e.g. DNA Baser, BioEdit, Chromas). The
sequence thus obtained is called the raw sequence (Fig. 3). For each gene, two such sequences are
obtained, known as the forward sequence and the reverse sequence respectively. The forward and
reverse sequences are assembled using any one of the programmes available in the public domain
(e.g. DNA Baser). The assembled sequences are called contigs (large contiguous sequences). These
long sequences (contigs) are used to determine the allelic variations (Fig. 4).

[Forward read: >A01_CaRS_MepFLIC_FLICF_copy.ab1 - Forward Sequence; reverse read:
>A01_CaRS_MepFLIC_FLICR_copy.ab1 - Reverse Sequence]
Fig. 3. FASTA file format of the forward and reverse sequence reads obtained from the
sequencing machine

[Contig: >FliC-CaRs-Mep (393)-Contigs]
Fig. 4. Assembled sequence for an allele, fliC

The assembled sequences are compared with the database (www.pamdb.org) for determination of the
consensus sequence, where the allelic differences can be resolved. The consensus sequence of allele
19 of Ralstonia solanacearum is furnished below for illustration (Fig. 5).

>CaRs-Mep (318)
GCCGACTCGTACCTGGGCCAGGTTGAAAACAACCTGCAACGTATGCGCCAACTG
GCTGTGGAATCCAACAACGGCGGTCTGTCGGCAGCCGACCAGACCAACCTGGAC
AAGGAATACCAACAGCTGGCAACGGCTAACAAGAACATCGAAACCAACGCCAAC
TACAACGGCAACAAGCTGTTCGACGGCTCGGTGGCTTCGACGACCTTCCAATATG
GCCAGAACGCAGCCACGGACGTGACCACGGTCACCAACGTCAACATGTCGACCT
TCGGCACGCTGACCGGCACGAGCGTGACCAGCGCTGCCAACGCGACC
Fig. 5. Allele 19 of gene fliC belonging to Ralstonia solanacearum infecting Zingiberaceae
members

The string of allele numbers (integers) obtained for the housekeeping and virulence genes of a strain
is called the sequence type, which is specific to a strain of the bacterium. For example, the allele
numbers obtained for a cardamom strain of Ralstonia solanacearum are ppsA-10, fliC-19, hrpB-27,
gdhA-24, adk-1, gyrB-26 and egl-25. The combination of integers (10, 19, 27, 24, 1, 26 and 25) serves
as the input data for establishing strain relationships with the eBURST programme, which is based on
the eBURST algorithm and is dedicated to the analysis of microbial MLST data.
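
A sequence type can be handled in code as an ordered tuple of allele numbers. The sketch below uses the cardamom strain profile given above; the second strain and the helper names are hypothetical and serve only to show how two profiles are compared locus by locus, which is the kind of comparison eBURST builds on. It is not a reimplementation of eBURST.

```python
LOCI = ("ppsA", "fliC", "hrpB", "gdhA", "adk", "gyrB", "egl")

# Allele numbers of the cardamom strain quoted in the text.
cardamom_strain = {"ppsA": 10, "fliC": 19, "hrpB": 27, "gdhA": 24,
                   "adk": 1, "gyrB": 26, "egl": 25}

# Hypothetical second strain, invented for illustration only.
hypothetical_strain = dict(cardamom_strain, fliC=20)


def allelic_profile(alleles: dict) -> tuple:
    """Return the allele numbers in the fixed locus order."""
    return tuple(alleles[locus] for locus in LOCI)


def differing_loci(strain_a: dict, strain_b: dict) -> list:
    """List the loci at which two strains carry different alleles."""
    return [locus for locus in LOCI if strain_a[locus] != strain_b[locus]]


print(allelic_profile(cardamom_strain))
print(differing_loci(cardamom_strain, hypothetical_strain))
```

Two strains whose profiles differ at exactly one locus are single-locus variants of each other; identical profiles mean the same sequence type.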

Phylogenetic analysis using MLST data: The allelic sequences thus obtained from the strains
are pooled to construct concatenated sequences, which serve as input data for establishing phylogeny.
A concatenated sequence is simply the sequences of all the loci joined in a fixed order (ppsA +
fliC + hrpB + gdhA + adk + gyrB + egl) to give one long sequence. An example of a concatenated
sequence constructed for a strain of Ralstonia solanacearum obtained from cardamom is furnished
below (Table 3, Fig. 6). This long sequence is used in the phylogenetic analysis of the bacterium in
question.

>R_solanacearum__CaRs-Mep_ (911) [ppsA + fliC + hrpB + gdhA + adk + gyrB + egl]
GACGAAGACGTGGTCGAGCTGGCCAAGTACGCCGTCATCATCGAGAAGCACTAC
GGTCGCCCGATGGACATCGAGTGGGGTAAGGACGGCAAGGACGGCAAGATCTAC
ATCCTGCAGGCCCGCCCCGAGACGGTGAAGAGCCAGTCGGTCGGCAAGGTCGAG
CAGCGCTTCCGCCTGAAGGGCTCGGCGCCGGTGCTGACCACCGGCCGCGCGATCG
GCCAGAAGATCGGTACGGGCCCCGTGCGCGTGATCAACGATCCGGCCGAAATGG
AGCGCGTGCAGCCGGGCGACGTGCTGGTCGCCGACATGACCGACCCGAACTGGG
AGCCGGTGATGAAGCGCGCCTCGGCCATCGTCACCAACCGTGGCGGCCGCACCT
GTCACGCCGCCATCATCGCGCGTGAGCTGGGCGTGCCGGCCGTGGTCGGCTGCGG
CGACGCCACCGACCTGCTGAAGGACGGCACGCTGGTCACCGTGTCCTGCGCCGA
GGGCGACGAAGGCAAGATCTACGACGGCCTGCTCGAGACGGAAATCACCGAAGTGCGC
CGCGGCGAGATGCCGCCGATCGACGTCAAGATCATGATGAACGTCGGCAA
CCCGCAGCTGGCCTTCGAGTTCGCGCAGATCCCGAACGGCGGCGTGGGCCTGGCC
CGCCTCGAGTTCATCATCAACAACAACATCGGCGTCCACCCGAAGGCGATCCTCG
ACTACCCGCAAGCCGACTCGTACCTGGGCCAGGTTGAAAACAACCTGCAACGTAT
GCGCCAACTGGCTGTGGAATCCAACAACGGCGGTCTGTCGGCAGCCGACCAGAC
CAACCTGGACAAGGAATACCAACAGCTGGCAACGGCTAACAAGAACATCGAAAC
CAACGCCAACTACAACGGCAACAAGCTGTTCGACGGCTCGGTGGCTTCGACGAC
CTTCCAATATGGCCAGAACGCAGCCACGGACGTGACCACGGTCACCAACGTCAA
CATGTCGACCTTCGGCACGCTGACCGGCACGAGCGTGACCAGCGCTGCCAACGC
GACCGTGCTGGCGATGGCCGATGCCTCGCTGCTGCTCGAGTGCGATGAAGAAGC
GGAAGAAGGCTTCCGCCTGGCGCAGCGCCTGATCCGCCATTCGGATGACCAGCTG
CGCGTGGTGTCGTGCCGCAATACCGGCTGGCAGGCACTGCTGCGCGATCGCTACG
CCGCGGCGGCGAGCTGCTTCTCGCGCATGGCCGAAGACGATGGCGCGACCTGGA
CCCAGCAGGTCGAGGGCCTGATCGGCCTGGCGCTGGTGCATCACCAGCTCGGCCA
GCAGGATGCCTCCGACGACGCGCTGCGGGCGGCGCGCGAGGCCGCAGACGGCCG
CAGCGATCGCGGCTGGCTGGCCACCATCGATCTGATCATCTACGAATTCGCCGTG
CAGGCCGGCATCCGCTGCTCCAACCGCCTGCTCGAGCATGCGTTCTGGCAATCGG
CCGAAATGGGCGCGACCCTGCTGGCCAACCACGGCGGCCGCAACGGTTGGACGC
CGACCGTATCGCAGGGCGTACCGATGCCGGCGCTGATCCAGCGCCGCGCCGAAT
ACCTCAGCCTGCTGCGCCGCATGGCCGACGGGGACCGCGCGGCAATCGACCCGC
TGATGGCGACCCTCAACCACTCGCGCAAGCTCGGCAGCCGCCTGCTGATGCAGAC
CAAGGTGGAAGTCGTGCTGGCCGCGCTGAGCGGCGAGCAGTACGACGTCGCCGG
CCGCGTCTTCGACCAGATCTGCAACCGCGAGACCACCTACCGCGCGCGCCGCTGG
AATTTCGACTTCCTCTACTGCCGCGCCAAGATGGCCGCCCAGCGCGGCGACTCGG
TCAAGAACGCGGCCGTCAACGTGCCGTACGGCGGCGCCAAGGGCGGCGTCCGCG
TCGATCCGCGCAAGCTGTCGTCGGGCGAACTCGAGCGCCTGACCCGCCGCTACAC
CAGCGAGATCGGCATCATCATCGGCCCGAACAAGGACATCCCGGCGCCGGACGT
GAACACCAACGCGCAGATCATGGCGTGGATGATGGACACGTACTCCATGAACGA
AGGCGCCACCGCCACCGGCGTGGTGACCGGCAAGCCGATCGCGCTGGGCGGCAG
CCTGGGCCGCCGCGAGGCGACCGGCCGCGGCGTGTTCGTGGTCGGCAGCGAGGC
TGCACGCAATCTGGGCATCGACGTCAAGGGTGCGCGCATCGTGGTGCAAGGCTTC
GGCAACGTCGGCAGCGTGGCCGCCAAGCTGTTCCAGGATGCCGGCGCCAAGGTG
ATCGCGGTGCAGGACCACAAGGGCATCGTGTTCAACGGCGCGGGCCTGGACGTC
GACGCGCTGATCCAGCACGTGGACCATAACGGCAGCGTCGACGGCTTCAAGGCC
GAGACCCTGTCGGCGGACGATTTCTGGGCGCTGGAATGCGAATTCCTGATCCCGG
CCGCGCTCGAAGGCCAGATCACCGGCAAGAACGCGCCCCAAATCAAGGCAAAAA
TTGTCGTTGAAGGTGCAAACGGCCCCACGACGCCCGAAGCGGACGACATCCTGC
GCGATCGCGGCATCCTGGTCTGCCCGGACGTGATCGCCAATGCCGGCGGCGTCAC
GGTGAGCTATTTCGGCATTCCGCAGATCTCCACCGGCGACATGCTGCGCGCCGCC
GTCAAGGCCGGCACCCCGCTGGGCATCGAAGCCAAGAAGGTGATGGACGCCGGC
GGCCTGGTGTCCGACGACATCATCATCGGCCTGGTGAAGGACCGCCTGCAGCAGT
CCGACTGCAAGAACGGCTACCTGTTCGACGGCTTCCCGCGCACCATCCCCCAGGC
CGAAGCCATGAAGGATGCCGGCGTGCCGATCGACTACGTGCTGGAAATCGACGT
GCCGTTCGACGCCATCATCGAGCGCATGAGCGGCCGCCGCGTGCACGTGGCCTCG
GGCCGGACCTATCACGTCAAGTACAACCCGCCCAAGAACGAGGGCCAGGACGAC
GAAACCGGCGATCCGCTGATCCAGCGCGACGACGACAAGGAAGAAACCCCTGAC
CGGCCTGCGCGCCGCGATGACGCGCGTCATCAACAAGTACATCGCCGACAACGA
GATCGCCAAGAAGGCCAAGGTCGAAACCTCCGGCGACGACATGCGCGAAGGCCT
GACCTGCGTGCTGTCGGTGAAGGTGCCCGAGCCCAAGTTCAGCTCGCAGACCAA
GGACAAGCTCGTTTCGTCCGAAGTGCGCCTGCCGGTGGAAGAAGTCGTGGCCAA
GGCGCTGACGGACTTCCTGCTGGAGACGCCCAACGACGCCAAGATCATCTGCGG
CAAGATCGTTGAAGCCGCGCGTGCCCGCGAAGCCGCCCGCAAGGCCCGCGAGAT
GACGCGCCGCAAGGGCGTGCTCGACGGCATGGGCCTGCCCGGCAAGCTGGCCGA
CTGCCAGGAGAAAGACCCGGCACTGTCCGAACTGTTCATCGTCGAGGGTGACTCCGCAG
GCGGCTCGGCCAAGCAGGGCCGCGACCGCAAGTTCCAGGCGATCCTGCCG
CTCAAGGGCAAGATCCTGAACGTGGAGCGCGCGCGCTTCGACAAGATGCTCTCC
AGCCAGGAAGTGCTCACGCTCATCACCGCCATGGGCACCGGCATCGGCAAGGAC
GACTACAACCTCGACAAGCTGCGCTATCACCGCATCATCATCATGACCGACGCGG
ACGTGGACGGCTCGCACATCCGCACGCTGCTGCTGACGTTCTTCTACCGCCAGAT
GCCCGAGATCATCGAGCGCGGCCACGTGTACATCGCCCAGCCGCCGCTGTACAA
GATCAAGCACGGCAAGGAAGAGCGCTACATCAAGGACGACAACGAGATGGCCG
CGTACCTGATGCGCCAGGCGCTCGACACCGCCATCCTGGTGCGCGCCGACGGCAC
CACCCTCAGTACGGCGGCCGCTACCGACACCACGACCCTGAAGACGGCCGCCAC
CACCTCGATCTCGCCGTTGTGGCTCACCATCGCCAAGGACAGCGCGGCGTTCACG
GTGAGCGGCACGCGCACGGTGCGCTATGGCGCCGGCAGCGCGTGGGTGGCGAAG
AGCATGTCCGGCACAGGCCAGTGCACCGCCGCCTTCTTTGGCAAGGATCCGGCGG
CCGGTGTCGCCAAGGTATGCCAGGTGGCGCAGGGCACGGGCACCCTGCTGTGGC
GCGGCGTCAGCCTGGCCGGCGCCGAGTTCGGGGAGGGCAGCCTGCCGGGCACCT
ACGGGAGCAACTACATCTATCCGTCCGCCGACAGCGCGACCTACTACAAGAACA
AGGGCATGAACCTCGTGCGCCTGCCGTTCCGCTGGGAGCGGCTGCAGCCCACGCT
CAACCAGGCGCTCGACGCGAACGAGCTGTCGCGCCTGACCGGGTTCGTCAACGC
CGTGACGGCGGCCGGCCAGACGGTGCTGCTCGATCCGCACAACTACGCGCGCTA
CTACGGCAACGTGATCGGCTCGAGCGCGGTGCCCAACAGCGCGTACGCCGATTTC
TGGCGGCGCGTGGCCACCCAGTTCAAGGGCAATGCCCGCGTCATCTTCGGGCTGA
TGAACGAGCCCAATTCGATGCCGACCGAGCAGTGG
Fig.6. Concatenated sequence obtained for a strain of Ralstonia solanacearum
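
Building such a concatenated sequence amounts to joining the per-locus allele sequences in the fixed order given in the FASTA header. The sketch below shows the idea; the short sequences are hypothetical placeholders (not real alleles) and the function name is an assumption.

```python
LOCUS_ORDER = ("ppsA", "fliC", "hrpB", "gdhA", "adk", "gyrB", "egl")


def concatenate_loci(allele_sequences: dict, order=LOCUS_ORDER) -> str:
    """Join the allele sequences of one strain in a fixed locus order."""
    missing = [locus for locus in order if locus not in allele_sequences]
    if missing:
        raise ValueError(f"no sequence for loci: {missing}")
    return "".join(allele_sequences[locus].upper() for locus in order)


# Hypothetical placeholder sequences, far shorter than real alleles.
toy_strain = {
    "ppsA": "GACG", "fliC": "GCCG", "hrpB": "AAAT", "gdhA": "CCCT",
    "adk": "GGGA", "gyrB": "TTTC", "egl": "ACGT",
}

concatenated = concatenate_loci(toy_strain)
print(f">toy_strain [{' + '.join(LOCUS_ORDER)}]")
print(concatenated)
```

Keeping the locus order identical for every strain is what makes the concatenated sequences comparable position by position in a later alignment.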

Selected reading
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb
Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Buddenhagen, I, L Sequeira and A Kelman. 1962. Designation of races of Pseudomonas
solanacearum. Phytopathology 52:726. (Abstract)
Castillo, Jose A., Greenberg, Jean T. 2007. Evolutionary dynamics of Ralstonia solanacearum. Appl.
Environ. Microbiol. 73:1225-1238
Nalvo F. Almeida, Shuangchun Yan, Rongman Cai, Christopher R. Clarke, Cindy E. Morris, Norman
W. Schaad, Erin L. Schuenzel, George H. Lacy, Xiaoan Sun, Jeffrey B. Jones, Jose A.
Castillo, Carolee T. Bull, Scotland Leman, David S. Guttman, João C. Setubal, and Boris A.
Vinatzer 2010. PAMDB, A Multilocus Sequence Typing and Analysis Database and Website
for Plant-Associated Microbes. Phytopathology 100:3, 208-215
Fegan M, Prior P (2005) How complex is the Ralstonia solanacearum species complex? In: Allen
C, Prior P, Hayward AC (eds) Bacterial wilt disease and the Ralstonia solanacearum species
complex. APS Press, St. Paul, pp 449-461
Feil EJ, Li BC, Aanensen DM, Hanage WP, Spratt BG. 2004. eBURST: inferring patterns of
evolutionary descent among clusters of related bacterial genotypes from multilocus sequence
typing data. J Bacteriol. Mar;186(5):1518-30
Hayward, AC. 1964. Characteristics of Pseudomonas solanacearum. J. App. Bacteriol. 27:265-277
Hayward AC (1991) Biology and epidemiology of bacterial wilt caused by Pseudomonas
solanacearum. Annu Rev Phytopathol 29:65-87
Kumar, A., Sarma, Y. R., and Anandaraj, M. 2004. Evaluation of genetic diversity of Ralstonia
solanacearum causing bacterial wilt of ginger using REP-PCR and PCR-RFLP. Curr. Sci.
87:1555-1561.
Prior P, Fegan M (2005) Recent developments in the phylogeny and classification of Ralstonia
solanacearum. Acta Hortic 695:127-136
Spratt BG, Hanage WP, Li B, Aanensen DM and Feil EJ. (2004) Displaying the relatedness among
isolates of bacterial species -- the eBURST approach. FEMS Microbiol Lett. Dec
15;241(2):129-34
Wicker E, Grassart L, Coranson-Beaudu R, Mian D, Guilbaud C, Fegan M, Prior P (2007) Ralstonia
solanacearum strains from Martinique (French West Indies) exhibiting a new pathogenic
potential. Appl Environ Microbiol 73:6790-6801
Yabuuchi E, Kosako Y, Yano I, Hotta H, Nishiuchi Y (1995) Transfer of two Burkholderia and an
Alcaligenes species to Ralstonia gen. nov.: proposal of Ralstonia pickettii (Ralston, Palleroni
and Doudoroff 1973) comb. nov., Ralstonia solanacearum (Smith 1896) comb. nov. and
Ralstonia eutropha (Davis 1969) comb. nov. Microbiol Immunol 39:897-904






Rolling circle amplification-RACE (RCA-RACE)
K.Johnson George, and I.P. Vijesh Kumar
Indian Institute of Spices Research, Marikunnu P.O., Calicut- 673012.

Isolation of full-length gene transcripts is important to determine the protein coding region and study
gene structure. However, isolation of novel gene sequences is often limited to expressed sequence
tags (ESTs), i.e. short cDNA fragments that predominantly represent the 3′ end of the transcript.
Rapid amplification of cDNA ends (RACE) is today by far the most popular approach for obtaining
full-length cDNA when only part of the transcript's sequence is known. Since its original description
in 1988 by Frohman et al., numerous modifications and improvements of the method have been
developed; together they form a collection of PCR-based cloning procedures that extend a known cDNA
fragment toward the 3′ (3′ RACE) or the 5′ (5′ RACE) cDNA end. The original method is based on
attachment of an anchor sequence to one end of the cDNA, which can be used as a primer-binding
template in PCR together with a second, gene-specific primer from the known part of the gene.
Polidoros et al. (2006) developed an improved inverse-RACE method, which uses CircLigase
(Epicentre Biotechnologies, Madison, WI, USA) for cDNA circularization, followed by rolling circle
amplification (RCA) of the circular cDNA with Φ29 DNA polymerase (New England Biolabs,
Ipswich, MA, USA). In this way, a large amount of PCR template is produced, allowing the
simultaneous isolation of the unknown 3′ and 5′ ends of a virtually unlimited number of transcripts
after a single reverse transcription reaction. Figure 1 illustrates this method, named RCA-RACE. The
process takes advantage of the ability of CircLigase to circularize single-stranded cDNA
molecules via an intramolecular link. This ATP-dependent ligase can circularize single-stranded DNA
(ssDNA) templates that have a 5′-phosphate and a 3′-hydroxyl group and are longer than 30
nucleotides. According to the manufacturer, under standard reaction conditions the enzyme makes
essentially no linear or circular concatemers, since it catalyzes only intramolecular ligations. In
addition, although CircLigase is influenced by the ssDNA sequence, high concentrations of the
enzyme can effectively circularize difficult templates (www.epibio.com/pdftechlit/222pl085.pdf).
The circularized cDNA is then amplified in an RCA reaction using Φ29 DNA polymerase and
random primers. This generates enough template for the cloning of rare
transcripts, as well as for high-throughput cloning of cDNA ends for large numbers of genes from
scarce tissue, which cannot be performed effectively with standard RACE methodologies. A variation
of the technique is famRCA-RACE (Kalivas et al., 2010), used for isolating a family of
homologous cDNAs (Fig. 1).
FIGURE 1. The family rolling circle amplification-rapid amplification of cDNA ends (famRCA-RACE)
method (degenerate primers are used for isolation of members of a family of homologous genes
present in the mRNA preparation). In step 1, messenger RNA is reverse transcribed into cDNA using
an oligo(dT) primer harboring a 5′-phosphorylated adaptor (circle). After RNaseH treatment, the
resulting cDNA is circularized in step 2 using CircLigase. The circular cDNA is then amplified by
RCA using Φ29 DNA polymerase (gray oval) and random hexamer primers (small squares attached
to gray ovals) into multicopy concatemers. For each transcript family of interest, an aliquot of the RCA
reaction serves as a template in an inverse PCR using degenerate primers to obtain the
transcript's 5′ and 3′ ends simultaneously (step 3). Degenerate primers are designed facing outward
(arrows) on conserved regions (thicker regions on concatemers). An agarose gel showing the range of
PCR products cloned to isolate the genes is presented in step 3.

Fig. 1.
Protocol:
Plant Material: Piper colubrinum challenge inoculated with Phytophthora capsici for isolating a full
length gene viz., WRKY expected to be upregulated during the interaction. The following steps may
be followed.
1. Extract total RNA using Spectrum Plant Total RNA kit (Sigma-Aldrich).
2. Synthesize first-strand cDNA using 3 µg RNA in a reaction containing 0.5 µg oligo(dT)-
adaptor primer [5′-GGCCACGCGTCGACTAGTAC(T)18-3′] phosphorylated at the 5′ end.
Add 0.5 mM dNTPs, 10 mM dithiothreitol (DTT), 1× first strand buffer, and 200 U Moloney
murine leukemia virus (MMLV) reverse transcriptase. The reaction (total volume 50 µl) is
to be incubated at 37 °C for 1 h, followed by heat inactivation of MMLV reverse transcriptase
at 70 °C for 15 min.
3. Incubate the reaction at 37 °C for 20 min after the addition of 2 U RNaseH. Purify the reaction
using the QIAquick PCR purification kit.
4. Circularize half of the purified cDNA (15 µl) using CircLigase (final con. 5 U/µl; Epicentre),
1× reaction buffer (Epicentre), 50 µM ATP and 1 µl of 50 mM MnCl2 (final con. 2.5 mM/µl) in
a total reaction volume of 50 µl and incubate at 60 °C for 1 hour. Inactivate the enzyme by
heating at 80 °C for 10 minutes. Purify the mixture using the QIAquick PCR purification
kit (Qiagen).
5. Rolling Circle Amplification (RCA): RCA reactions may be performed in 50 µL volume
containing a 15-µL aliquot of the circularized cDNA, 1 mM dNTPs, 200 µg/ml BSA (New
England Biolabs), 1× Φ29 DNA polymerase reaction buffer (NEB), 10 U Φ29 DNA
polymerase and either 10 µM random hexamers or 1 µM InVUP
primer (5′-GTACTAGTCGACGCGTGGCC-3′), both modified by the addition of two
phosphorothioate linkages on the 3′ end. Incubate at 30 °C for 21 hours and heat-inactivate the
enzyme at 60 °C for 10 minutes. Verify RCA products by electrophoresis.
6. Inverse-PCR: Perform the inverse-PCR reaction using 0.5 µL of neat or serially diluted
(10⁻², 10⁻⁴ and 10⁻⁶) RCA reaction as template, along with 0.4 µM each of gene-specific
forward and reverse primers [to be designed based on the gene of interest] and
DyNAzyme II DNA polymerase (Finnzymes, Espoo, Finland). The cycling parameters to be
followed are 94 °C for 3 min, followed by 35 cycles of 94 °C for 30 s, 56 °C for 45 s and 72 °C for
1.5 min (depending on the expected size of the product), and a final extension step of 72 °C
for 10 min.
7. Run the PCR products on an agarose gel stained with ethidium bromide.
8. Clone the large fragment obtained using the TOPO-TA cloning kit (Invitrogen).
9. Screen the clones for the insert and sequence them.
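
Two sequence relationships in the protocol above can be checked with a few lines of Python. The sketch first confirms that the InVUP primer is the reverse complement of the adaptor part of the oligo(dT)-adaptor primer, and then shows the orientation of outward-facing primers for inverse PCR. The known fragment and the function names are hypothetical, and the primer picking is deliberately naive: it ignores melting temperature, GC content and secondary structure, which real primer design must consider.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# Sequences quoted in the protocol (adaptor without the (T)18 tail).
ADAPTOR = "GGCCACGCGTCGACTAGTAC"
INVUP = "GTACTAGTCGACGCGTGGCC"
print(reverse_complement(ADAPTOR) == INVUP)


def outward_primers(known_fragment: str, primer_length: int = 20) -> tuple:
    """Pick two primers that point away from each other on the known fragment.

    The forward primer is the last `primer_length` bases (sense strand, extends
    toward the unknown 3' end); the reverse primer is the reverse complement of
    the first `primer_length` bases (extends toward the unknown 5' end).
    """
    if len(known_fragment) < 2 * primer_length:
        raise ValueError("known fragment too short for two non-overlapping primers")
    forward = known_fragment[-primer_length:].upper()
    reverse = reverse_complement(known_fragment[:primer_length])
    return forward, reverse


# Hypothetical known cDNA fragment, for illustration only.
known = "ATGCGTACCGGATTCAGCTTAGGCAT"
print(outward_primers(known, primer_length=8))
```

On a circular or concatemeric template, primers oriented this way amplify across the junction, so a single product contains both unknown ends.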
References:
Alexios N Polidoros, Konstantinos Pasentsis and Athanasios S. Tsaftaris (2006) Rolling circle
amplification-RACE: a method for simultaneous isolation of 5′ and 3′ cDNA ends from amplified
cDNA templates. BioTechniques 41:35-42.
Apostolos Kalivas, Konstantinos Pasentsis, Anagnostis Argiriou, Nikos Darzentas and Athanasios S.
Tsaftaris (2010) famRCA-RACE: A Rolling Circle amplification RACE for isolating a family of
homologous cDNAs in one reaction and its application to obtain NAC genes transcription factors
from Crocus (Crocus sativus) flower. Preparative Biochemistry and Biotechnology 40:177-187.
Frohman MA, M K Dush and G R Martin (1988). Rapid production of full-length cDNAs from rare
transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci.
USA 85:8998-9002.














Protocols in development and analysis of mutants for functional genomics
Ramesh S. Bhat
Email: bhatramesh12@gmail.com
Department of Biotechnology
University of Agricultural Sciences, Dharwad 580 005
Conventional mutagenesis
Commonly used chemical mutagens are ethylmethane sulfonate, diepoxybutane, N-methyl-N-nitrosourea
and sodium azide. Irradiation mutagens include fast neutrons, gamma rays, X-rays
and accelerated ions (Bhat et al., 2007).
Raising mutant populations
1. After treatment with an appropriate mutagen, M1 generation is grown. Progenies of selfed M1
plants are used to grow M2 generation.
2. M2 plants are used to prepare pooled DNA samples for reverse genetics screening, while their
seeds are inventoried.
3. Forward genetics screening (phenotypic analysis) is normally performed on M3 plants.
4. For assaying quantitative traits, it is particularly important to advance the lines to M4 or
beyond because of the need to evaluate phenotypes in replicated trials.
5. For the purpose of identifying mutated genes, it is better to aim for a moderate to high
mutation density in the genome so that fewer mutants are needed to achieve genome
coverage.
6. However, too high a dose presents practical problems. At high doses, lethality and sterility of
M1 plants make it difficult to produce an appropriately large population in a single attempt.
7. Producing a useful mutant population therefore is often a trade-off between the need to
produce high-density mutations and the practicality of keeping a vigorous population without
too many deleterious effects and background mutations.
Forward genetics
Phenotyping
1. Mutant populations harbor a large amount of genetic variability that can be revealed when the
mutants are subjected to appropriate phenotypic screening.
2. Morphological mutants can be identified based on phenotypic categories.
3. Conditional mutants are studied with appropriate experimental conditions.
Map-based cloning
DNA sequence responsible for the trait is identified by walking down the chromosome
using genetic markers. Availability of the genome sequence will hasten gene identification
considerably. It can be used to identify any gene, given an adequate map of the region of the
chromosome in which it is located. The gene is first mapped to a specific region of a given
chromosome by genetic crosses. The gene is next localized on the physical map (in a library) of this
region of the chromosome. Candidate genes in the segment of the chromosome identified by physical
mapping are then isolated from mutant and wild type individuals and sequenced to identify mutations
that would result in a loss of gene function. An example of map-based cloning is the cloning of a
fertility restorer gene, Rf-1, in rice (Komori et al., 2004).
Detecting genomic changes using genome-wide chips
Single-feature polymorphisms (SFP) are detected using oligonucleotide (oligo) chips
containing 24-mer oligos representing genes to detect deletions (Borevitz et al., 2003; Chang et al.,
2003; Wang et al., 2004). Genes/probes that generate hybridization signals below those of the
wild-type cultivar (based on a significant t-test) are considered candidate genes. Genome coverage
can be increased with newer oligoarrays such as the 44K Agilent oligoarray genome chip or the 51K
Affymetrix GeneChip. The difficulty posed by large deletions or multiple mutations across the
genome can be overcome by pooling DNA from segregants with common
phenotypes. This also masks irrelevant mutations.
Differential cDNA screening
Differential screening is useful for the identification of differentially expressed cDNAs. A recent
resurgence in its popularity has come about through the development of DNA microarrays
(Meldrum, 2000). Alternatively, a subtracted cDNA library is generated by enriching for differentially
expressed clones and removing sequences that are common to the two sources.
Arbitrarily primed PCR uses pairs of short arbitrary primers to amplify pools of partial cDNA
sequences. If the same primer combinations are used to amplify cDNAs from two different sources,
the products can be fractionated side by side on a sequencing gel, and differences in the band
patterns reveal differentially expressed genes. In the differential display PCR technique
(Liang and Pardee, 1992), the antisense primer is an oligo-dT primer with a specific two-base
extension, which thus binds at the 3′ end of the mRNA. Conversely, in arbitrarily primed PCR
(Welsh et al., 1992), the antisense primer is arbitrary and can in principle anneal anywhere in the
mRNA. In both methods an arbitrary sense primer is used, allowing the amplification of partial
cDNAs from pools of several hundred mRNA molecules. Following electrophoresis, differentially
expressed cDNAs can be excised from the gel and characterized further.
In the PCR subtraction method (Lisitsyn and Wigler, 1993), sequences common to the two
sources are eliminated prior to amplification. cDNAs from the two sources are prepared, digested with
a restriction enzyme and amplified. The amplified products from one source (tester) are then annealed
to specific linkers that provide annealing sites for a unique pair of primers. These linkers are not
added to the driver cDNA. A large excess of driver cDNA is then added to the tester cDNA and the
populations are mixed. Driver/driver fragments possess no linkers and cannot be amplified, while
driver/tester fragments possess only one primer annealing site and will only be amplified in a linear
fashion. However, cDNAs that are present only in the tester will possess linkers on both strands and
will be amplified exponentially, and can therefore be isolated and cloned.
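
The logic of PCR subtraction described above reduces to counting how many strands of a reannealed duplex carry a linker. The following sketch encodes only that rule; the function name and the string labels are assumptions made for illustration.

```python
def amplification_mode(strand_a: str, strand_b: str) -> str:
    """Classify a reannealed duplex by the origin of its two strands.

    Only tester strands carry linkers, so the number of tester strands equals
    the number of primer annealing sites available on the duplex.
    """
    strands = (strand_a, strand_b)
    for origin in strands:
        if origin not in ("tester", "driver"):
            raise ValueError(f"unknown strand origin: {origin}")
    primer_sites = strands.count("tester")
    return {2: "exponential", 1: "linear", 0: "none"}[primer_sites]


for duplex in (("tester", "tester"), ("tester", "driver"), ("driver", "driver")):
    print(duplex, "->", amplification_mode(*duplex))
```

Because the driver is in large excess, sequences common to both sources end up mostly in tester/driver or driver/driver duplexes, and only tester-specific sequences remain as tester/tester duplexes.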
Global gene expression profiling can be made with large scale sequencing of random clones
from cDNA libraries. Further improvement in expression profiling has been made with serial analysis
of gene expression (SAGE) (Velculescu et al., 1995) and massively parallel signature sequencing
(MPSS) (Brenner et al., 2000).
Reverse genetics
PCR screening
Small to medium-sized deletions in genomes are detected through PCR analysis (Jansen et
al., 1997). This method identifies smaller than expected amplicons due to the presence of a deletion
(Li et al., 2001; Li and Zhang, 2002). Primers flanking a genomic region containing a target gene are
designed in such a way that the product generated by the wild-type allele is difficult to PCR amplify
because of its large size. When a deletion reduces the length of the region flanked by the primers, the
fragment with such deletion can often be amplified with higher efficiency. As a result, such smaller
product can be detected even if the DNA from the individual allele carrying the deletion is mixed
with DNA from many wild-type individuals.
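
The size reasoning behind this deletion screen can be illustrated as below. All numbers are hypothetical: the primer span, the deletion size and the maximum length the PCR can amplify (which in practice depends on the polymerase and extension time) are assumptions chosen only to show the principle.

```python
def amplicon_length(primer_span_bp: int, deletion_bp: int = 0) -> int:
    """Length of the PCR product between the flanking primers."""
    if not 0 <= deletion_bp < primer_span_bp:
        raise ValueError("deletion must be smaller than the primer span")
    return primer_span_bp - deletion_bp


def is_amplified(length_bp: int, max_amplifiable_bp: int) -> bool:
    """True if the product is short enough to amplify under the chosen conditions."""
    return length_bp <= max_amplifiable_bp


# Hypothetical example: primers 8 kb apart, conditions that amplify up to 3 kb.
PRIMER_SPAN_BP = 8000
MAX_AMPLIFIABLE_BP = 3000

wild_type = amplicon_length(PRIMER_SPAN_BP)
mutant = amplicon_length(PRIMER_SPAN_BP, deletion_bp=6000)

print("wild type:", wild_type, "bp, amplified:", is_amplified(wild_type, MAX_AMPLIFIABLE_BP))
print("deletion: ", mutant, "bp, amplified:", is_amplified(mutant, MAX_AMPLIFIABLE_BP))
```

In a pool, a band therefore appears only when at least one individual carries a deletion large enough to bring the product under the amplifiable limit.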
TILLING
Targeting Induced Local Lesions in Genomes (TILLING) is a high-throughput reverse-genetic
technique for gene identification (Bhat et al., 2007). It is employed to discover point
mutations in mutant libraries created via traditional chemical mutagenesis. The TILLING approach
makes use of DNA strand mismatches formed between mutant and wild-type DNA. DNA from
individual M2 plants is isolated, pooled and arrayed in 96-well plates. Primers are designed to bracket
a 1-kb region that most likely contains a deleterious mutation in a target gene. The primers are then
used to amplify the gene of interest, followed by denaturing and reannealing of the DNA to allow
formation of homo- and heteroduplexes in the DNA pool. Originally, denaturing high-performance
liquid chromatography (HPLC) was used to detect the presence of a DNA mismatch, but now it is
detected by enzymatic cleavage of PCR-amplified heteroduplexed DNA (Xin et al., 2008) and band
visualization using fluorescent end-labeling and denaturing polyacrylamide gel electrophoresis. A
modified procedure of TILLING, called EcoTILLING, is applied to identify natural allelic variants
(Comai et al., 2004).
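
When a heteroduplex is cut at the mismatch, the two resulting fragments together add up to the full amplicon length, which is how the position of the mutation is read from the gel. A minimal sketch of that arithmetic follows; the 1000 bp amplicon matches the 1-kb region mentioned above, while the mismatch position and the function name are hypothetical.

```python
def cleavage_fragments(amplicon_length_bp: int, mismatch_position_bp: int) -> tuple:
    """Sizes of the two fragments produced by cutting an amplicon at a mismatch.

    The position is counted from the end carrying the forward primer.
    """
    if not 0 < mismatch_position_bp < amplicon_length_bp:
        raise ValueError("mismatch must lie inside the amplicon")
    return mismatch_position_bp, amplicon_length_bp - mismatch_position_bp


# Hypothetical example: a point mutation 300 bp into a 1000 bp amplicon.
left, right = cleavage_fragments(1000, 300)
print(left, right, left + right)
```

A pool containing only homoduplexes gives no cleavage products, so only the full-length band is seen.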
Gene tagging and trapping
Gene tagging is the most popular mutagenesis strategy in functional genomics. A piece of known DNA
(tag) is randomly inserted into the genome to cause loss-of-function or gain-of-function of the tagged
gene (Guiderdoni et al., 2007; Johnson et al., 2007; Zhu et al., 2007; Upadhyaya et al., 2010). These
tags can be modified into traps by recombining reporters to gain additional information on the
expression pattern of the gene.
Insertional inactivation tagging with T-DNA
1. The construct may have gene trap feature (uni-directional or bidirectional), selection
markers/reporters and plasmid rescue cassette (Guiderdoni et al., 2007; Upadhyaya et al.,
2010).
2. Take up high throughput transformation, select the transgenics and confirm.
3. Check the copy number of T-DNA, select those with single copy.
4. Look for gene trap based on reporter gene expression.
5. Look for novel/mutant phenotype.
6. Isolate the flanking sequence tag using plasmid rescue or TAIL-PCR (Liu et al., 1995).
7. Identify the tagged gene.
8. Study the co-segregation between the tagged gene and mutant phenotype.
9. Validate gene function by complementation, RNAi etc.
Activation tagging with T-DNA
1. The construct shall have enhancer or strong promoter at one or both the ends of T-DNA along
with selection markers/reporters and plasmid rescue cassette (Johnson et al., 2007).
2. Take up high throughput transformation, select the transgenics and confirm.
3. Check the copy number of T-DNA, select those with single copy.
4. Look for novel/mutant phenotype.
5. Isolate the flanking sequence tag using plasmid rescue or TAIL-PCR. Identify the tagged
gene.
6. Study the co-segregation between the tagged gene and mutant phenotype.
7. Validate gene function by over-expression.
Insertional inactivation tagging with transposable element
1. The Ds construct may have gene trap feature (uni-directional or bidirectional), T-DNA
selection markers/reporters, Ds tracer, Ds excision marker, plasmid rescue cassette, T-DNA
gene trap counter selector, T-DNA repeat counter selector (Upadhyaya et al., 2006; Zhu et
al., 2007).
2. Ac construct can have T-DNA selection markers/reporters, transposase coding region, Ac
reporter.
3. Develop independent T-DNA/Ds lines and Ac lines by high throughput transformation. Select
Ds lines with single copy T-DNA/Ds and not showing T-DNA gene trap and repeats.
4. Alternatively develop double transformants by co-transformation (with Ds and Ac constructs)
or super-transformation. Select lines with single copy Ds.
5. Ds tagged mutants can be developed by
a. Independent Ds and Ac lines are crossed. F1s are mutagenic (contain both Ds and Ac).
Stable Ds tagged mutants are identified by screening F2s by Ds excision marker and
Ds tracer. Ac reporter is used to make sure that such plants are free from Ac.
b. Double transformants (T1) are mutagenic. Mutants are identified in T2 and T3.
c. Transient expression of transposase (TET) can be taken up by co-cultivating the calli
derived from plants showing single copy T-DNA/Ds with Ac construct. Mutants are
identified in T1 and T2.
6. In the mutant, look for gene trap based on reporter gene expression.
7. Look for novel/mutant phenotype.
8. Isolate the flanking sequence tag using plasmid rescue or TAIL-PCR. Identify the tagged
gene.
9. Study the co-segregation between the tagged gene and mutant phenotype.
10. Validate gene function by complementation, RNAi, Ds reversion etc.
Activation tagging with transposable element
1. The Ds construct may have enhancer or strong promoter at one or both the ends of Ds, T-
DNA selection markers/reporters, Ds tracer, Ds excision marker, plasmid rescue cassette, T-
DNA gene trap counter selector, T-DNA repeat counter selector (Johnson et al., 2007).
2. Ac construct can have T-DNA selection markers/reporters, transposase coding region, Ac
reporter.
3. Develop independent T-DNA/Ds lines and Ac lines by high throughput transformation. Select
Ds lines with single copy T-DNA/Ds and not showing T-DNA gene trap and repeats.
4. Alternatively develop double transformants by co-transformation (with Ds and Ac constructs)
or super-transformation. Select lines with single copy Ds.
5. Ds tagged mutants can be developed by
a. Independent Ds and Ac lines are crossed. F1s are mutagenic (contain both Ds and Ac).
Stable Ds tagged mutants are identified by screening F2s by Ds excision marker and
Ds tracer. Ac reporter is used to make sure that such plants are free from Ac.
b. Double transformants (T1) are mutagenic. Mutants are identified in T2 and T3.
c. Transient expression of transposase (TET) can be taken up by co-cultivating the calli
derived from plants with single copy T-DNA/Ds with Ac construct. Mutants are
identified in T1 and T2.
6. In the mutant, look for gene trap based on reporter gene expression.
7. Look for novel/mutant phenotype.
8. Isolate the flanking sequence tag using plasmid rescue or TAIL-PCR. Identify the tagged
gene.
9. Study the co-segregation between the tagged gene and mutant phenotype.
10. Validate gene function by Ds reversion and over-expression etc.
Reverse genetics with tagged mutants
In the reverse genetics approach, one starts with a computer predicted gene from the genome
sequence and searches for an insertion mutant in that gene. Oligonucleotide primers from the
insertional element and from the gene of interest are used for PCR amplification. Appropriately
pooled DNA samples are used for high throughput screening for this often rare event in such
populations. Once a mutation in the gene of interest has been identified, homozygotes are isolated and
the phenotype is confirmed.
Trans-activation
Used to activate a gene in a target tissue and cell type, where it is usually active. The
trans-activation system makes use of the enhancer trapping efficiency of the yeast GAL4
transcriptional activator (Johnson et al., 2005; Johnson et al., 2007).
1. GAL4 construct will have a minimal promoter-driven Gal4 and a reporter gene driven by
upstream activating sequence (UAS) within T-DNA containing a selection marker.
2. Driver lines expressing GAL4 are produced by high throughput transformation. Tissue and
cell specific expression of GAL4 is determined by reporter gene expression.
3. Endogenous responder lines are produced by high throughput transformation using T-DNA
carrying single or multiple UAS elements along with selection markers.
4. Cell or tissue specific activation tagged mutants are generated by crossing endogenous
responders with selected GAL4 enhancer traps (driver lines).
5. The effect of gene activation in a mutant is studied for assigning the gene function.
RNA silencing
RNAi provides a new, reliable reverse genetic method to investigate gene function (Curtin et
al., 2007). Several platforms have been developed for delivering gene silencing in plants (Watson et
al., 2005; Curtin et al., 2007). A sense-strand transgene expresses the same mRNA as the target
gene, whereas antisense transgenes express RNA complementary to the target mRNA. In
amplicon transgenes, the cDNA of a virus, driven by a constitutive promoter (such as CaMV 35S), is
recombined with a target gene of interest. Hairpin RNA (hpRNA) is expressed from an inverted
repeat construct consisting of a promoter, a targeted sense sequence, a spacer region, a
complementary targeted antisense sequence, and a transcription terminator. The use of an intron
instead of a nonspecific DNA spacer has been shown to increase the silencing efficiency of these
hairpins. In direct repeat induced PTGS (driPTGS), RNA encoding multiple-copy direct repeats of the
target gene is expressed. Silencing efficiency of direct-repeat transgenes can be further improved by
increasing the number of repeats to three or four. To overcome the problem of multiple cloning
steps, the Gateway system can be used. In another strategy, called constructs with 3' inverted repeat, a
fragment of the target gene is fused at the 3' end with an inverted repeat arrangement of a nontarget
sequence. A transgene strategy that is similar to SHUTR uses a transgene of which the 5' end
contains both an inverted repeat and a direct repeat of the 5'-UTR of an ethylene biosynthetic gene,
followed by the coding sequence of the target gene.
Artificial miRNAs (amiRNAs) are designed to target a specific gene, or a group of related genes, using
naturally occurring miRNA precursor sequences as a backbone. The original miRNA sequence and its
complementary strand (miRNA*) are substituted with amiRNA and amiRNA* sequences,
respectively. The initial stem-loop structures of the natural miRNA precursor are maintained, and
the composition of the amiRNA sequences closely imitates that of the natural miRNAs. Other
important parameters include a preference for uridine at the 5' terminus and an adenine at the tenth
base of the amiRNA, as these nucleotides are highly conserved in natural miRNA populations as well
as in highly efficient siRNAs. A mismatch corresponding to the 5' end of the amiRNA sequence in
the amiRNA/amiRNA* molecule is included to increase the likelihood that the amiRNA strand is
preferentially incorporated into the RISC complex. To avoid the possibility of transitive RNA
silencing, triggered by a perfectly matching amiRNA hybridizing to the target and acting as a primer
for RDR6, one to three mismatches are incorporated into the 3' end of the amiRNA.
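
The two positional preferences described above (uridine at the 5' terminus, adenine at the tenth base) can be screened with a few lines of Python. This is only a minimal sketch: the function name and the example sequence are hypothetical, and the check covers just these two rules, not the mismatch or precursor-structure requirements.

```python
def check_amirna(seq: str) -> dict:
    """Check a candidate amiRNA (written 5'->3') against two rules from the text.

    Accepts the RNA or DNA alphabet; T is treated as U.
    """
    s = seq.upper().replace("T", "U")
    if not s or set(s) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G and U/T")
    return {
        # Preference for uridine at the 5' terminus
        "uridine_at_5prime": s[0] == "U",
        # Preference for adenine at the tenth base (index 9)
        "adenine_at_position_10": len(s) >= 10 and s[9] == "A",
    }


# Hypothetical 21-nt candidate, for illustration only
candidate = "UGAUCGAUCAGCUAGCUAGCA"
print(check_amirna(candidate))
```

Candidates that fail either check would normally be redesigned rather than used as they are.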
Virus-Induced Gene Silencing
It is a transient PTGS of plant genes by recombinant viruses carrying a near-identical
sequence (Baulcombe, 1999; Burch-Smith et al., 2004). A 300- to 800-nt exogenous sequence is
inserted into a specific location within the cDNA of PVX or TRV without the loss of infectivity of the
RNA transcript. Recombinant virus is allowed to infect the plant. Viral infections can be established
with naked viral RNA without the presence of coat proteins. In vitro-synthesized RNA transcripts,
from a plasmid containing a cDNA encoding a complete virus genome, can also initiate virus
infections. Also, the viral cDNA can be cloned into the T-DNA of Agrobacterium, which is delivered
to a plant by agroinfiltration and expressed by the CaMV 35S promoter to initiate infections.
Microprojectile bombardment can also be used for introducing DNA, and sometimes RNA, into cells.
Satellite virus-induced silencing system (SVISS) uses the vectors derived from RNA and DNA
satellite viruses. The target sequence is inserted into the satellite and coinoculated with its respective
helper virus, either TMV or tomato yellow leaf curl (TYLCV) geminivirus.
Gene targeting
Targeted mutagenesis is a powerful reverse genetic tool for generating specific and precise
DNA sequence alterations that enable a greater understanding of gene function (Iida et al., 2007).
Recently, zinc finger nucleases (ZFNs), which are the chimeric proteins composed of a synthetic zinc
finger-based DNA binding domain and a DNA cleavage domain, are being used to create double
strand breaks at specific sites. Such cleavages are then repaired by error-prone non-homologous end
joining (NHEJ). Hence this mode can be successfully used for gene targeting (Lloyd et al., 2005;
Osakabe et al., 2010; Zhang et al., 2010).
FOX hunting
FOX hunting system (full-length cDNA overexpressor gene hunting system) is an alternative
method of producing gain-of-function mutants for the ectopic expression of plant genes (Ichikawa et
al., 2006). A gene is over-expressed using DNA or cDNA of its own or from another system. Generally
it is done with a constitutive promoter, leading to ectopic expression. The mutants developed are called
gain-of-function mutants. In the FOX hunting system, each cDNA from a normalized full-length cDNA
library is introduced into a plant, and the transgenic plant is observed for mutant phenotype (Nakamura et al.,
2007; Kondou et al., 2009).
References:
Baulcombe, D. C., 1999, Curr. Opin. Plant Biol., 2 (2): 109-113.
Bhat, R. S., et al., 2007. In: Upadhyaya N. M., ed. New York, USA. Springer Life Sciences, pp. 149-
180.
Borevitz, J. O., et al., 2003, Genome Res., 13 (3): 513-523.
Brenner, S., et al., 2000, Nat. Biotechnol., 18 (6): 630-634.
Burch-Smith, T. M., et al., 2004, Plant J., 39 (5): 734-746.
Comai, L., et al., 2004, Plant J., 37 (5): 778-786.
Curtin, S. J., et al., 2007. In: Upadhyaya N. M., ed. New York. Springer Life Sciences, pp. 291-332.
Guiderdoni, E., et al., 2007. In: Upadhyaya N. M., ed. New York. Springer Life Sciences, pp. 181-
222.
Ichikawa, T., et al., 2006, Plant J., 48 (6): 974-985.
Iida, S., et al., 2007. In: Upadhyaya N. M., ed. New York. Springer Life Sciences, pp. 273-289.
Jansen, G., et al., 1997, Nat. Genet., 17 (1): 119-121.
Johnson, A. A. T., et al., 2005, Plant J., 41 (5): 779-789.
Johnson, A. A. T., et al., 2007. In: Upadhyaya N. M., ed. New York. Springer Life Sciences, pp. 333-
353.
Komori, T., et al., 2004, Plant J., 37 (3): 315-325.
Kondou, Y., et al., 2009, Plant J., 57 (5): 883-894.
Li, X., et al., 2001, Plant J., 27 (3): 235-242.
Li, X., et al., 2002, Funct Integr Genomics, 2 (6): 254-258.
Liang, P., et al., 1992, Science, 257 (5072): 967.
Lisitsyn, N., et al., 1993, Science, 259 (5097): 946.
Liu, Y. G., et al., 1995, Plant J., 8 (3): 457-463.
Lloyd, A., et al., 2005, Proc. Natl. Acad. Sci. U.S.A., 102 (6): 2232-2237.
Meldrum, D., 2000, Genome Res., 10 (9): 1288.
Nakamura, H., et al., 2007, Plant Mol. Biol., 65 (4): 357-371.
Osakabe, K., et al., 2010, Proc. Natl. Acad. Sci. U.S.A., 107 (26): 12034-12039.
Upadhyaya, N. M., et al., 2010. In: Pereira A., ed. New York. Springer Life Sciences, pp. 147-177.
Upadhyaya, N. M., et al., 2006, Theor. Appl. Genet., 112 (7): 1326-1341.
Velculescu, V. E., et al., 1995, Science, 270 (5235): 484.
Wang, G. L., et al., 2004, Theor. Appl. Genet., 108 (3): 379-384.
Watson, J. M., et al., 2005, FEBS Lett., 579 (26): 5982-5987.
Welsh, J., et al., 1992, Nucleic Acids Res., 20 (19): 4965.
Xin, Z., et al., 2008, BMC Plant Biol., 8 (1): 103.
Zhang, F., et al., 2010, Proc. Natl. Acad. Sci. U.S.A., 107 (26): 12028-12033.
Zhu, Q. H., et al., 2007. In: Upadhyaya N. M., ed. New York. Springer, pp. 223-271.

Quantitative RT-PCR
Prasath D., Johnson K. George, Vijesh Kumar I.P.
Indian Institute of Spices Research, Calicut 673 012, Kerala
Introduction

The real-time polymerase chain reaction uses fluorescent reporter dyes to combine DNA
amplification and detection steps in a single tube format. The increase in fluorescent signal recorded
during the assay is proportional to the amount of DNA synthesised during each amplification cycle.
Individual reactions are characterised by the cycle fraction at which fluorescence first rises above a
defined background fluorescence, a parameter known as the threshold cycle (Ct) or crossing point
(Cp). Consequently, the lower the Ct, the more abundant the initial target. This correlation permits
accurate quantification of target molecules over a wide dynamic range, while retaining the sensitivity
and specificity of conventional end-point PCR assays. The homogeneous format eliminates the need
for postamplification manipulation and significantly reduces hands-on time and the risk of
contamination. Real-time PCR is often abbreviated to qPCR, although that abbreviation is not
universally accepted.
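
The inverse relation between Ct and starting template can be illustrated with a small calculation. The sketch below assumes ideal amplification in which the product doubles every cycle (a per-cycle factor of 2.0). That factor is an assumption for illustration, not a value given in this manual, and real assays should use a measured efficiency. The function name and the Ct values are hypothetical.

```python
def relative_abundance(ct_sample: float, ct_reference: float,
                       efficiency: float = 2.0) -> float:
    """Fold difference in starting template, sample versus reference.

    efficiency is the per-cycle amplification factor; 2.0 means perfect
    doubling and is an assumed default, not a measured value.
    """
    return efficiency ** (ct_reference - ct_sample)


# A sample crossing the threshold 3 cycles earlier than the reference
# started with 2**3 = 8 times more target under the doubling assumption.
print(relative_abundance(ct_sample=22.0, ct_reference=25.0))
```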
Real-Time chemistries allow for the detection of PCR amplification during the early phases
of the reaction. Measuring the kinetics of the reaction in the early phases of PCR provides a distinct
advantage over traditional PCR detection. Traditional methods use Agarose gels for detection of PCR
amplification at the final phase or end-point of the PCR reaction.

There are three main chemistries in general use:
Intercalating dyes, such as SYBR-Green, which fluoresce upon light excitation when bound to
double stranded DNA. These are cheap, easily added to legacy assays and amplification products can
be verified by the use of melt curves. They can lack specificity and fluorescence varies with amplicon
length. In general, they are one Ct or so more sensitive than probe-based assays.
Fluorophores attached to primers, e.g. Invitrogen's Lux or Promega's Plexor primers. These are
relatively inexpensive and amplification products can be verified by melt curves. Specificity depends
on the primers and specific, usually company-specific design software needs to be used for optimal
performance. This is not necessarily a bad thing (indeed the Plexor software is very useful), but it is
not always possible to change primer design parameters.
Hybridisation-probe based methods, e.g. hydrolysis (TaqMan) or Molecular Beacons. These are
the most specific, as products are only detected if the probes hybridise to the appropriate
amplification products. There are many variations on this theme, with melt curve analysis possible for
some chemistries. Their main disadvantages are cost, complexity and occasional fragility of probe
synthesis. There are potential problems associated with the fact that probe-based assays do not report
primer dimers that can interfere with the efficiency of the amplification reaction.
The 5' Nuclease Assay: In the 5' nuclease assay, an oligonucleotide called a TaqMan probe is
added to the PCR reagent master mix. The probe is designed to anneal to a specific sequence of
template between the forward and reverse primers. The probe sits in the path of the enzyme as it starts
to copy DNA or cDNA. When the enzyme reaches the annealed probe, the 5' exonuclease activity of
the enzyme cleaves the probe.
SYBR Green Dye: SYBR Green chemistry is an alternative method used to perform real-time PCR
analysis. SYBR Green is a dye that binds the minor groove of double stranded DNA. When SYBR
Green dye binds to double stranded DNA, the intensity of fluorescent emissions increases. As more
double stranded amplicons are produced, the SYBR Green dye signal increases. SYBR Green dye
will bind to any double stranded DNA molecule, while the 5' nuclease assay is specific to a
pre-determined target. The increase in reporter signal is captured by the Sequence Detection instrument
and displayed by the software. The Figure below shows an increase in the reporter signal over time.
The increase in reporter signal is proportional to the amount of product being produced for a
given sample. When the fluorescent reporter signal increases to a detectable level, it can be captured
and displayed as an Amplification Plot. The Amplification Plot contains valuable information for the
quantitative measurement of DNA or RNA. The threshold line is the level of detection, or the point at
which a reaction reaches a fluorescent intensity above background. The threshold line is set in the
exponential phase of the amplification for the most accurate reading. The cycle at which the sample
reaches this level is called the Cycle Threshold, Ct. These two values, the threshold and the Ct, are
very important for data analysis using the 5' nuclease assay.
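
How a Ct value is read off an amplification plot can be sketched as follows. The example assumes one fluorescence reading per cycle and uses simple linear interpolation between the two readings that bracket the threshold. Instrument software may use a different fitting method, and the function name and readings are hypothetical.

```python
from typing import Optional, Sequence


def threshold_cycle(fluorescence: Sequence[float],
                    threshold: float) -> Optional[float]:
    """Fractional cycle at which the signal first reaches the threshold.

    fluorescence[0] is the reading after cycle 1. Returns None if the
    threshold is never reached.
    """
    if fluorescence and fluorescence[0] >= threshold:
        return 1.0
    for i in range(1, len(fluorescence)):
        lo, hi = fluorescence[i - 1], fluorescence[i]
        if lo < threshold <= hi:
            # lo was read after cycle i, hi after cycle i + 1
            return i + (threshold - lo) / (hi - lo)
    return None


# Hypothetical readings, threshold set in the exponential phase
readings = [1.0, 1.0, 2.0, 4.0, 8.0, 16.0]
print(threshold_cycle(readings, threshold=3.0))
```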

Protocol
Reaction Set up
1. Gently vortex and briefly centrifuge all solutions after thawing.
2. Prepare a reaction master mix by adding the following components (except template DNA).
3. The total reaction volume is usually 25 µl; prepare the reaction as follows:

Reagent                              Conc. required   Volume
SYBR Green Qiagen master mix (2X)    2X               12.5 µl
Forward primer                       10 pM            1 µl
Reverse primer                       10 pM            1 µl
Template cDNA (diluted cDNA)         -                5 µl
Nuclease-free water                  -                5.5 µl
Total                                                 25 µl
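
When several samples are run together, the master mix is usually scaled up for all reactions at once. The sketch below is illustrative only: it takes the per-reaction volumes of master mix, primers and template from the table, computes nuclease-free water by difference so that each reaction reaches the stated 25 µl total, and adds a 10% pipetting excess. The excess and the function name are assumptions, not part of the protocol.

```python
# Per-reaction volumes in µl, taken from the table (template added separately)
PER_REACTION_UL = {
    "SYBR Green master mix (2X)": 12.5,
    "Forward primer": 1.0,
    "Reverse primer": 1.0,
}
TEMPLATE_UL = 5.0
TOTAL_UL = 25.0


def master_mix(n_reactions: int, overage: float = 0.1) -> dict:
    """Volumes (µl) of each master mix component for n reactions.

    Water is computed by difference so that mix plus template equals
    TOTAL_UL. overage is an assumed pipetting excess (0.1 = 10%).
    """
    water = TOTAL_UL - TEMPLATE_UL - sum(PER_REACTION_UL.values())
    per_reaction = {**PER_REACTION_UL, "Nuclease-free water": water}
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in per_reaction.items()}


for name, volume in master_mix(10).items():
    print(f"{name}: {volume} µl")
```

Each tube then receives 20 µl of the mix plus 5 µl of diluted cDNA.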

Reaction Conditions

95 °C    5 min     1 cycle
95 °C    30 sec
58 °C    30 sec    35 cycles
72 °C    30 sec
Hold at 4 °C

Loop mediated isothermal amplification (LAMP)
A. I. Bhat
Indian Institute of Spices Research, Calicut 673 012, Kerala

Loop mediated isothermal amplification (LAMP) is a single tube technique for the
amplification of DNA. It may be combined with a reverse transcription step to allow the detection of
RNA. LAMP is a novel approach to nucleic acid amplification which uses a single temperature
incubation thereby obviating the need for expensive thermal cyclers. Detection of amplification
product can be by photometry for turbidity caused by increasing quantity of magnesium
pyrophosphate in solution or with addition of SYBR green, a color change can be seen without
equipment (Notomi et al., 2000; Mori et al., 2001; Nagamine et al., 2002). Also in-tube detection of
DNA amplification is possible using manganese loaded calcein which starts fluorescing upon
complexation of manganese by pyrophosphate during in vitro DNA synthesis. LAMP has the
potential to be used as a simple screening assay in the field. Due to its simplicity, ruggedness, and
low cost, LAMP could provide major advantages compared to PCR without compromising on the
sensitivity. In LAMP, the target sequence is amplified at a constant temperature of 65 °C using either
two or three sets of primers and a polymerase with high strand displacement activity. Due to the
specific nature of the action of these primers, the amount of DNA produced in LAMP is considerably
higher than PCR based amplification. Hence, LAMP can also be quantitative. LAMP has been
successfully used for the detection of several fungal, bacterial and viral pathogens of plants (Notomi et
al., 2000; Mori et al., 2001; Nagamine et al., 2002; Fukuta et al., 2003; 2004; Nie, 2005; Tomlinson et
al., 2010).

Designing primers

FIP (Forward Inner Primer): consists of the F2 region (at the 3' end) that is complementary to
the F2c region, and the same sequence as the F1c region at the 5' end.
F3: Forward Outer Primer consists of the F3 region that is complementary to the F3c region.
BIP (Backward Inner Primer): consists of the B2 region (at the 3' end) that is complementary to
the B2c region, and the same sequence as the B1c region at the 5' end.
B3: Backward Outer Primer consists of the B3 region that is complementary to the B3c region.
FL (Forward Loop): sequences complementary to the single stranded loop region between the F1
and F2 regions on the 5' end of the dumbbell-like structure.
BL (Backward Loop): sequences complementary to the single stranded loop region between the
B1 and B2 regions on the 5' end of the dumbbell-like structure.

General considerations for primer design:
The distance between the 5' ends of F2 and B2 should be 120-180 bp, and the distance between
F2 and F3, as well as between B2 and B3, should be 0-20 bp. The distance for the loop forming regions
(5' of F2 to 3' of F1, and 5' of B2 to 3' of B1) should be 40-60 bp. GC content should be about 50-60%
for GC rich and normal sequences, and about 40-50% for AT rich sequences. Primers should be
designed so that they do not easily form secondary structures. The 3' end sequence should not be
AT rich or complementary to other primers. If restriction enzyme sites exist on the target sequence
outside the primer regions, they can be used to confirm the amplified products.
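
The composition of the inner primers and the GC guideline can be sketched in Python. The example assumes that each region pair (F1 and F2, or B1 and B2) is supplied 5'->3' in the same sense as the corresponding F2 or B2 primer portion, so that the "1c" part is the reverse complement of region 1. The function names and sequences are hypothetical, and dedicated primer design software should be preferred for real assays.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """GC content of a sequence, in percent."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def inner_primer(region_1: str, region_2: str) -> str:
    """Build FIP from (F1, F2) or BIP from (B1, B2).

    The 5' part is the complementary '1c' sequence and the 3' part is the
    '2' sequence, as in the primer definitions above.
    """
    return revcomp(region_1) + region_2.upper()


# Hypothetical regions, for illustration only
f1 = "GATTACAGGCTTACGGATCC"
f2 = "ACGTTGCAAGGCTTCAGT"
fip = inner_primer(f1, f2)
print(fip, f"GC = {gc_percent(fip):.1f}%")
```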

Performing LAMP
The general procedure for performing a LAMP reaction includes isolation of nucleic acid, amplification
and detection. In order to perform amplification, six primers (FIP, F3, BIP, B3, F-Loop and B-Loop),
a DNA polymerase with strand displacement activity, substrates (deoxynucleotide triphosphates), and
the reaction buffer are required. The procedure simply consists of incubating the template sample and
the above reagents at a constant temperature of 60-65 °C for 15 minutes to 1 hour. The presence
of amplified product can be detected in a short time, which makes LAMP a simple and rapid gene
amplification method. Both simple detection and real-time detection of the reaction are possible.
Various detection methods include:

Visual methods:
- The turbidity of magnesium pyrophosphate, a by-product of the amplification reaction, is
produced in proportion to the amount of amplified products. Since LAMP amplification can
produce extremely large amount of amplified products, white turbidity can be visually
observed. From this feature, the presence of turbidity can indicate the presence of target gene
and visual detection can be achieved.
- If the tube containing the amplified products in the presence of fluorescent intercalating dye
(ethidium bromide, etc.) is illuminated with a UV lamp, the fluorescence intensity increases.
From this feature, the presence of fluorescence can indicate the presence of target gene and
visual detection can be achieved.

Detection by electrophoresis
- LAMP products are run on a 2% agarose gel. Electrophoresis pattern of LAMP amplified
product is not a single band but a ladder pattern because LAMP method can form amplified
products of various sizes consisting of alternately inverted repeats of the target sequence on
the same strand.


Procedure
DNA isolation
Samples used: Piper yellow mottle virus infected black pepper. The procedure is as follows:
1. Grind 100 mg of leaf tissue in 500 µl extraction buffer (100 mM Tris-HCl (pH 8.0), 4 mM
EDTA, 1.4 mM NaCl, 2% CTAB, 1% PVP, 0.5% β-mercaptoethanol) using a chilled mortar
and pestle and collect the filtrate in an Eppendorf tube.
2. Incubate in a water bath at 65 °C for 30 min.
3. Allow the homogenate to cool to room temperature, add an equal volume of
phenol:chloroform:isoamyl alcohol (25:24:1) and mix well.
4. Centrifuge at 2500 g for 10 min at room temperature.
5. Collect the supernatant in a new tube, add 0.1 volume of 10% CTAB and an equal volume of
chloroform:isoamyl alcohol (24:1), and mix well.
6. Centrifuge at 2500 g for 10 min at room temperature.
7. Collect the supernatant in a new tube, add 0.1 volume of 3 M sodium acetate (pH 5.2) and an
equal volume of ice-cold isopropanol.
8. Mix well and incubate on ice for 30 min.
9. Centrifuge the mixture at 10,000 rpm for 15 min at 4 °C.
10. Discard the supernatant. Add about 500 µl of 70% ethanol to the pellet and centrifuge for 5
min at 12,000 rpm.
11. Discard the supernatant and air dry the pellet.
12. Dissolve the pellet in 100 µl of HPLC grade water and store the DNA at -20 °C.

LAMP reaction mix
Thermopol buffer (10x)       2.5 µl
MgSO4 (50 mM)                4.0 µl
dNTP mix (10 mM)             3.5 µl
F3 primer (10 µM)            0.5 µl
B3 primer (10 µM)            0.5 µl
FIP primer (100 µM)          0.5 µl
BIP primer (100 µM)          0.5 µl
F-Loop primer (100 µM)       0.25 µl
B-Loop primer (100 µM)       0.25 µl
Betaine (5 M)                5.0 µl
Bst polymerase (8 U/µl)      1.0 µl
Water                        5.5 µl
Template                     1.0 µl
Total                        25.0 µl

Incubate the above reaction mix at 65 °C for 60 min, followed by 80 °C for 10 min.
Run the products (10 µl) on a 1.2% agarose gel at 130 V for 45 min along with a marker. A positive reaction
is identified by the presence of multiple bands of different sizes.
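
The final concentration of each component in the 25 µl reaction follows from the dilution relation C_final = C_stock x V_added / V_total. The sketch below applies it to the reaction mix table. It assumes that the stock concentrations read as mM for MgSO4 and dNTPs and µM for primers, since the units in the table are partly garbled. The function name is hypothetical, and magnesium contributed by the reaction buffer is not counted.

```python
def final_conc(stock: float, volume_ul: float, total_ul: float = 25.0) -> float:
    """Final concentration after dilution, in the same unit as the stock."""
    return stock * volume_ul / total_ul


# name: (stock concentration, unit, volume added in µl); units are assumed
MIX = {
    "MgSO4 (added)": (50.0, "mM", 4.0),
    "dNTP mix": (10.0, "mM", 3.5),
    "F3 / B3 primer (each)": (10.0, "µM", 0.5),
    "FIP / BIP primer (each)": (100.0, "µM", 0.5),
    "Loop primer (each)": (100.0, "µM", 0.25),
    "Betaine": (5.0, "M", 5.0),
}

for name, (stock, unit, volume) in MIX.items():
    print(f"{name}: {final_conc(stock, volume):g} {unit}")
```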

Selected references
Fukuta, S., Iida, T., Mizukami, Y., Ishida, A., Ueda, J., Kanbe, M., and Ishimoto, Y. 2003.
Detection of Japanese yam mosaic virus by RT-LAMP. Arch. Virol. 148:1713-1720.
Fukuta, S., Ohishi, K., Yoshida, K., Mizukami, Y., Ishida, A., and Kanbe, M. 2004. Development of
immunocapture reverse transcription loop-mediated isothermal amplification for the detection
of Tomato spotted wilt virus from chrysanthemum. J. Virol. Methods 121:49-55.
Mori, Y., Nagamine, K., Tomita, N., and Notomi, T. 2001. Detection of loop-mediated isothermal
amplification reaction by turbidity derived from magnesium pyrophosphate formation.
Biochem. Biophys. Res. Commun. 289:150-154.
Mori, Y., Kitao, M., et al. 2004. Real-time turbidimetry of LAMP reaction for
quantifying template DNA. J. Biochem. Biophys. Methods 59:145-157.
Nagamine, K., Hase, T., and Notomi, T. 2002. Accelerated reaction by loop-mediated isothermal
amplification using loop primers. Mol. Cell. Probes 16:223-229.
Nie, X. 2005. Reverse transcription loop-mediated isothermal amplification of DNA for detection of
Potato virus Y. Plant Dis. 89:605-610.
Notomi, T., Okayama, H., Masubuchi, H., Yonekawa, T., Watanabe, K., Amino, N., and Hase, T.
2000. Loop-mediated isothermal amplification of DNA. Nucleic Acids Res. 28:e63.
Tomlinson, J.A., Dickinson, M.J. and Boonham, N. 2010. Rapid detection of Phytophthora ramorum
and P. kernoviae by two minute DNA extraction followed by isothermal amplification and
amplicon detection by generic lateral flow device. Phytopathology, 100: 143-149.


Two Dimensional Gel Electrophoresis
R. Viswanathan,
Sugarcane Breeding Institute, Coimbatore 641007
P. R. Rahul,
Division of Crop Improvement, IISR, Calicut 673012
Introduction
Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) is a method of protein
separation, by which proteins in a mixture are separated according to their isoelectric point (pI) in the
horizontal direction (isoelectric focusing [IEF]) and molecular weight in the vertical direction
(sodium dodecyl sulfate polyacrylamide gel electrophoresis [SDS-PAGE]). 2D-PAGE is used for the
isolation/separation/purification of proteins and further characterization with mass spectrometry and
identification of specific proteins. The isoforms of a protein can easily be isolated with 2D-PAGE.
2.1. Sample Preparation
Appropriate sample preparation is absolutely essential for excellent 2-D results. In general, it
is advisable to keep sample preparation as simple as possible. A sample with low protein and high salt
concentration, for example, could be diluted normally and analyzed or desalted, then concentrated by
lyophilization or precipitated with TCA and ice-cold acetone, then re-solubilized with rehydration
solution. The composition of sample solution is particularly critical for 2-D because solubilization
treatments for the first-dimension separation must not affect the protein pI, nor leave the sample in a
highly conductive solution. The protocol described herein has been used in the Plant Pathology lab,
Sugarcane Breeding Institute, Coimbatore. The protocol has been standardized for use with sugarcane
leaf and cane samples. Suitable methodologies may be adapted for different crops based on
experimental results using different methods of protein extraction.
Sample preparation includes the following steps:
1. Take one gram of fresh tissue, grind in liquid nitrogen to a fine powder.
2. Resuspend the powder in an ice-cold solution of 10% w/v trichloroacetic acid (TCA) in
acetone with 0.07% w/v dithiothreitol (DTT) for at least 1 h at -20 °C.
3. Centrifuge for 30 min at 12,000 rpm and discard the supernatant.
4. Rinse the pellet thrice with acetone containing 0.07% w/v DTT for 1 h at -20 °C.
5. Lyophilize the pellet for two hours to remove any traces of acetone.
6. Solubilize the resulting lyophilized powder in lysis buffer (7 mM urea, 4% CHAPS, 14 mM
DTT, and 0.2% Ampholyte) for 1 h at 37 °C.
7. Centrifuge at 12,000 rpm for 15 min.
8. Collect the supernatant in a fresh tube.
9. Quantify the protein concentration using the Bradford method (Bradford, 1976).

Note:
- The samples must be stored at -80 °C if the procedure is stopped at any step during sample preparation.
- All reagents and buffers should be prepared with ultrapure chemicals, and ddH2O should be used in
all the steps.
- DTT should be added freshly wherever applicable.
2.2. Rehydration
Dissolve 100 µg of protein in rehydration buffer containing 8 M urea, 2% w/v CHAPS, 18 mM
DTT, 0.5% w/v IPG buffer pH 4-7 and a trace of bromophenol blue. Then carry out the steps
below.
1. Clean the Immobiline strip tray (Ettan IPGphor) and wipe it with paper towels and Kimwipes.
2. Take the strips from -80 °C and remove the plastic cover carefully.
3. Apply the sample on the strip tray and carefully place the strip over the sample, ensuring that the
entire length of the strip touches the sample.
4. Cover the strip tray with coverfluid and close the tray.
5. Leave the sample tray at room temperature for 18 h.
Note : Care should be taken to wear gloves while handling the strips and ensure that the gel side of
the strip faces down.
2.3. First Dimension Isoelectric focusing (IEF)
1. After rehydration, wet the pre-made IEF strips with HPLC-grade water.
2. Dry the strips slightly between two pieces of Whatman paper to remove water.
3. Make sure that the square end of each strip is at the cathode (Black/-) and the pointed end at
the anode (Red/+). Also check that the anode and cathode ridges are in the correct orientation.
4. Electrophorese for 24-36 hr or 45000 Vhrs, using the following sequence of settings:
Voltage    Amps      Wattage   Time
500 V      100 mA    33 V      1 hr
1000 V     110 mA    70 V      1 hr
2950 V     140 mA    32 V      24 hr
5. Bromophenol blue migrates towards the anode within 1 hr from the start of the electrophoresis.
Note: If by the next day the bromophenol blue has not disappeared (strip becomes colorless),
continue running until the dye disappears.
2.4. Second Dimension
2.4.1. Equilibration
Equilibrate the focused strips twice for 15 min in 20 ml equilibration solution as follows:
1. First equilibration in solution containing 6 M urea, 30% glycerol, 2% SDS, 2%
DTT, 50 mM TRIS-HCl buffer (pH 8.8).
2. Second equilibration in solution containing 2.5% iodoacetamide in place of DTT.
2.4.2. Second-Dimension SDS-PAGE:
Perform second-dimension electrophoresis with a 1 mm thick, 12% SDS-polyacrylamide gel
in a BioRad Protean xi vertical slab gel electrophoresis unit.
Casting of acrylamide gels (70 ml) and Reagents Preparation:
Acrylamide gel %          12 %       15 %
40 % Acrylamide           21 ml      26.5 ml
1.5 M Tris HCl pH 8.8     17.5 ml    17.5 ml
ddH2O                     31.5 ml    26.25 ml
10% (w/v) APS*            0.35 ml    0.35 ml
TEMED                     17.5 µl    17.5 µl
*Ammonium persulphate (APS): 0.1 g per ml of ddH2O. Prepare freshly each time.
Electrophoresis buffer (25 mM Tris, 190 mM glycine and 0.1% SDS)
1% agarose in electrophoresis buffer.
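The casting table is written for 70 ml of gel mix. When a different volume is needed, every component scales by the same factor. The sketch below shows that scaling and also checks the final acrylamide percentage from the stock dilution; the dictionary and function names are illustrative only.

```python
# Minimal sketch: scale the 70 ml casting recipe to another volume.
# Volumes are in ml and copied from the casting table (17.5 µl TEMED = 0.0175 ml).
RECIPE_70_ML = {
    "12%": {"40% acrylamide": 21.0, "1.5 M Tris-HCl pH 8.8": 17.5,
            "ddH2O": 31.5, "10% APS": 0.35, "TEMED": 0.0175},
    "15%": {"40% acrylamide": 26.5, "1.5 M Tris-HCl pH 8.8": 17.5,
            "ddH2O": 26.25, "10% APS": 0.35, "TEMED": 0.0175},
}


def scale_recipe(gel_percent, final_volume_ml):
    """Return component volumes (ml) for the requested nominal gel volume."""
    factor = final_volume_ml / 70.0
    return {name: round(volume * factor, 5)
            for name, volume in RECIPE_70_ML[gel_percent].items()}


def final_acrylamide_percent(stock_ml, total_ml, stock_percent=40.0):
    """Final acrylamide concentration after diluting the stock."""
    return stock_percent * stock_ml / total_ml


if __name__ == "__main__":
    for name, volume in scale_recipe("12%", 35.0).items():
        print(f"{name}: {volume} ml")
    print("Check:", final_acrylamide_percent(21.0, 70.0), "% acrylamide")
```

Add APS and TEMED last, since they start polymerization.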
After casting the gel, perform the following steps:
1. Add 1x running buffer powder and 2 liters of ddH2O to the electrophoresis tank.
Allow the buffer to mix and cool for at least 3 hr before running the gels.
2. Rinse top of gel with 1x electrophoresis buffer.
3. Align the acidic end of the strip with the left end of the gel and lower the strip carefully using a
pair of forceps. Make sure that the strip lies flat on the surface of the gel by gently pressing down
with two rulers. Eliminate bubbles between the gel surface and the strip.
4. Overlay 1-2 ml of agarose and allow it to solidify (10 min).
5. Place the 2-D gels into the electrophoresis tank by sliding gel plate sets between the rubber gaskets.
Lubricate the gasket with the running buffer prior to inserting the plates. Make sure gaskets are
not folded and that they form a smooth seal along the entire length of each set of gel plates.
6. Place the lid over the gel tank, ensuring the electrodes make good connection.
Plug leads into power supply.
7. Run the gel at a constant current of 8 mA for 18-20 hr.
8. Turn off the power supply and remove the gels from the unit.
9. Gels can be stained by either Coomassie brilliant blue (CBB) staining or silver staining.
Note: Do not let the agarose solidify for more than 15 min, as the proteins will begin to diffuse. If
the agarose solution fails to set fully within the first 10 min, place the gel plate at 4 °C for
approximately 2 to 4 min.
2.5. Gel Staining
I. Coomassie brilliant blue staining
Reagents:
Fixing solution : 7% acetic acid in 50% methanol prepared with ddH2O
Dye solution : 0.1% Coomassie brilliant blue R 250 in fixing solution
Destaining solution : 5% acetic acid in 20% methanol prepared with ddH2O
1. After SDS-PAGE, transfer the gel to a plate containing the fixing solution and shake for at least
1 h.
2. Pour out the fixing solution, replace it with the dye solution and incubate for 20 min.
3. Destain the gel with destaining solution, changing to fresh solution until the background is
clear.
4. Wash the gel thrice with ddH2O for 5 min.
5. Acquire the image of the gel in a densitometer (Bio-Rad or GE Healthcare).
6. Gels can be stored in ddH2O at 4 °C for several months.
II. Silver Staining
Reagents
Fixing solution : 10% acetic acid in 40% ethanol prepared with ddH2O.
Sensitizing solution* : 30% ethanol, 0.2% sodium thiosulphate and 6.8% sodium acetate in ddH2O.
Silver nitrate solution* : 0.25% silver nitrate in ddH2O.
Developing solution* : Add 0.015% formaldehyde to 2.5% sodium carbonate prepared in ddH2O
just before use.
Stop solution : 5% acetic acid in ddH2O.
Note: * - To be prepared fresh.
Staining procedure (with gentle shaking all through)
1. Place the gel in a tray containing fixing solution and agitate on a shaker for at least 1 hr. Ensure
that the fixer solution covers the gel completely.
2. Drain the fixer solution from the tray.
3. Add sensitizing solution and agitate for 30 min.
4. Drain the sensitizing solution from the tray.
5. Wash the gel thrice in ddH2O for 5 min.
6. Add silver nitrate solution and agitate for 20 min.
7. After 20 min, drain the silver nitrate solution into an appropriate waste beaker.
8. Wash the gel twice in ddH2O for 1 min.
9. Add developing solution to the gel and agitate until the solution turns yellow or a brown "smoky"
precipitate appears. Then pour off the developer, add fresh developer as needed and continue in this
manner until the desired intensity of spots is achieved.
10. Drain the developing solution, add stop solution and agitate for 10 min.
11. Wash the gel thrice in ddH2O for 5 min.
12. Acquire the image of the gel with a densitometer.
13. Store the gels in ddH2O at 4 °C.
Note: Gels can be stored at 4 °C in ziplock bags for up to two years.
2.6. Image Analysis
Analyze the gel image using IMAGE MASTER software (GE Healthcare) or other suitable software
and mark protein spots for excision.
2.7. Excision of Protein Spots (for sequencing by Mass spectrometry)
1. Assign the spot(s) in the gel that are to be sequenced.
2. Cut out the protein spot with a pipette tip.
3. Transfer the gel piece to a microfuge tube.
4. Chop up the gel piece with pipette tip.
5. Add a solution of 50% methanol/10% acetic acid to the gel pieces.
6. Incubate for 30 min.
7. Spin down and discard the supernatant.
8. The sample is ready for Mass Spectrometry Sequencing.
2.8. Useful Links and References:
Ramesh Sundar, A., Nagarathinam, S., Ganesh Kumar, V., Rahul, P.R., Raveendran, M., Malathi, P.,
Ganesh Kumar, A., Rakwal, R., Viswanathan, R. (2010) Sugarcane proteomics: Establishment of a
protein extraction method for 2-DE in stalk tissues and initiation of sugarcane proteome reference
map. Electrophoresis 31: 1959-1974.
Commercial 2-D Electrophoresis and Proteomics Sites
Amersham Biosciences
http://www5.amershambiosciences.com/APTRIX/upp00919.nsf/Content/proteomics_HomePage
BioRad Proteomics Workstation
http://www.proteomeworks.bio-rad.com/
Genomic Solutions
http://www.genomicsolutions.com/
2-D Analysis Software Sites
Nonlinear Dynamics: http://www.phoretix.com/
PDQuest: http://proteomeworks.bio-rad.com/html/tech5.html
Flicker for 2D gel analysis: http://www-lecb.ncifcrf.gov/flicker/
NCI/NCRDC LMMB Image Processing Section (GELLAB software):
http://www-lecb.ncifcrf.gov/lemkin/gellab.html
Compugen (Z3 software): http://www.2dgels.com/
Expasy Index to 2D PAGE databases and services:
http://www.expasy.ch/ch2d/2d-index.html
HSC 2DE Gel Protein Databases list:
http://www.harefield.nthames.nhs.uk/nhli/protein.html
Phosphoprotein Database: http://www-lecb.ncifcrf.gov/phosphoDB/
Cambridge Proteomics Facility:
http://www.bio.cam.ac.uk/proteomics/index.html
Rice 2D Database: http://semele.anu.edu.au/2d/2d.html
COMPLUYEAST-2D PAGE Database: http://babbage.csc.ucm.es/2d/2d.html
Danish Centre for Human Genome Research 2D PAGE Databases (Aarhus):
http://proteomics.cancer.dk/
Siena-2DPAGE: http://www.bio-mol.unisi.it/2d/2d.html
PMMA-2D Page at Purkyne Military Medical Academy, Czech:
http://www.pmma.pmfhk.cz/
HP-2D PAGE (Max Delbruck Center, Berlin):
http://www.mdc-berlin.de/~emu/heart
MitoDat: Mendelian Inheritance and the Mitochondrion:
http://www-lmmb.ncifcrf.gov/mitoDat/
SWISS-2DPAGE at Geneva University Hospital:
http://www.expasy.ch/ch2d/ch2d-top.html
Proteome BioKnowledge Library: http://www.proteome.com/YPDhome.html
Yeast Proteome Map: http://www.ibgc.u-bordeaux2.fr/YPM/
Melanie: http://us.expasy.org/melanie
Sources of Information and Methods on 2-D Electrophoresis and Proteomics
Australian Proteome Analysis Facility: http://www.proteome.org.au/
The Tubingen Proteome Project: http://www.uni-tuebingen.de/uni/kxm/Proteome/
University of Aberdeen Protein Lab and Proteomics Facility:
http://www.abdn.ac.uk/~mmb023/proteome/index.htm
The EXPASY Swiss 2D-PAGE http://www.expasy.ch/
The Harefield Hospital in London provides links to worldwide databases, upcoming meetings, 2-D gel
analysis software, and more:
http://www.harefield.nthames.nhs.uk/nhli/protein/
The laboratory of Dr. James R. Jefferies, parasitology Group, Institute of Biological Sciences,
University of Wales at Aberystwyth, Ceredigion, Wales, UK:
http://www.aber.ac.uk/~mpgwww/Proteome/Tut_2D.html#Section 1
Proteomics tools for mining sequence databases in conjunction with Mass Spectrometry experiments:
http://prospector.ucsf.edu/
Websites for theoretical and technical procedures on 2-D gel electrophoresis
- http://www5.amershambiosciences.com/applic/upp00738.nsf/vLookupDoc/172581038-R140/$file/80-6429-60AB_Version_May_2002.pdf
- http://www.bio-rad.com/LifeScience/pdf/Bulletin_2651.pdf
- http://proteomics.cancer.dk/procedures/procedure.html
- http://www.aber.ac.uk/parasitology/Proteome/Tut_2D.html
- http://ca.expasy.org/ch2d/protocols























Bioinformatics- Data mining tools, Identification of microsatellite sites,
EST analysis and Annotation
S. J. Eapen
Indian Institute of Spices Research, Calicut 673 012, Kerala

In Silico Analysis - Annotation of ESTs, SSR and SNP identification

SSR identification
Exercise 1. Collection of ESTs.
1. Go to the NCBI site (http://www.ncbi.nlm.nih.gov/), select the EST database (dbEST) and type
Citrus macrophylla in the text box.
2. Observe the results and download the FASTA-format file for analysis.
3. Selecting FASTA as the display format provides the sequences in FASTA format.
4. Selecting the File option downloads all the ESTs in a single file.
Exercise 2. Assembling the ESTs
1. Go to the CAP3 website (http://mobyle.pasteur.fr/cgi-bin/MobylePortal/portal.py?form=cap3) and
type your e-mail ID.
2. Paste the EST sequences in the text box or upload the file by selecting the file option.
3. Click the run button.
4. CAP3 will assemble the EST sequences into contigs and singletons.
5. Once the analysis is complete, save the contigs file and the singletons file.
6. These contigs are used for further analysis.
Exercise 3. Detecting SSRs in the EST contigs
1. Go to the WEBTROLL (Tandem Repeat Occurrence Locator) website
(http://wsmartins.net/webtroll/troll.html).
2. In WEBTROLL, upload the contigs file for the analysis of SSRs (mono, di, tri, tetra, penta
repeats).
3. Click the "troll it" button.
4. Observe the results. Clicking the Design primer button lets you design primers for a repeat.
5. Collect the repeats and tabulate them in Excel.
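What an SSR search does can be illustrated with a short regular-expression scan: for each motif length from mono- to pentanucleotide, look for a motif repeated at least a minimum number of times in tandem. The sketch below is not the WEBTROLL algorithm, and the minimum repeat counts are illustrative assumptions rather than values from this manual.

```python
import re

# Minimal sketch of perfect-SSR detection with regular expressions.
# Keys are motif lengths, values are minimum repeat counts (assumed thresholds).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (motif, repeat_count, start, end) tuples with 1-based coordinates."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_count - 1 copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            repeat_count = len(match.group(1)) // motif_len
            hits.append((motif, repeat_count, match.start() + 1, match.end()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    contig = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TT"
    for motif, count, start, end in find_ssrs(contig):
        print(f"({motif}){count} at {start}-{end}")
```

The output lines can be pasted into a spreadsheet to build the repeat table asked for in step 5.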
Exercise 4. Detecting SSRs using MISA (MIcroSAtellite identification tool)
1. Go to the MISA website (http://pgrc.ipk-gatersleben.de/misa/), download misa.pl and misa.ini
and put them in a folder on a LINUX system.
2. Download the FASTA file and put it in the same folder.
3. On the command line, type misa.pl <FASTAfile>
4. Collect the output files <FASTAfile>.misa and <FASTAfile>.statistics and interpret your results.

SNP identification

Exercise 5. SNP discovery using ESTs
1. This analysis is carried out only on LINUX.
2. A customized version of the AUTOSNP program is used for detecting SNPs.
3. CAP3, which is integrated into AUTOSNP, is used for making contigs.
4. In command mode, type ./cap3snp.pl -f piper.fasta for a FASTA file, or
./cap3snp.pl -a piper.ace for an ACE file.
5. AUTOSNP processes the input file and writes the results as HTML files.
6. Count the detected SNPs and classify the DNA changes (indel, transition, transversion) for
interpretation.
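The classification in step 6 follows a simple rule: a substitution between two purines (A, G) or between two pyrimidines (C, T) is a transition, a purine-pyrimidine exchange is a transversion, and a gap on either side is an indel. The sketch below applies that rule to a list of allele pairs; the "-" gap symbol and the function names are assumptions for illustration and do not reflect the AUTOSNP output format.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_change(ref, alt):
    """Classify one allele pair as indel, transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == "-" or alt == "-":
        return "indel"
    if ref == alt:
        return "no change"
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def summarize(changes):
    """Tally the classes for an iterable of (ref, alt) pairs."""
    return Counter(classify_change(ref, alt) for ref, alt in changes)


if __name__ == "__main__":
    observed = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "-"), ("T", "G")]
    for kind, count in summarize(observed).items():
        print(kind, count)
```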

Expressed Sequence Tags Analysis & Annotation

EST clustering and assembly:
Currently, the majority of the available coding sequence is in the form of expressed sequence tags
(ESTs), and the effort to discover the full-length cDNA of each human gene is hampered by the partial
nature of these data. There is significant value in consolidating gene sequences as they are
produced, in lieu of a yet-to-be-completed reference sequence. ESTs offer a rapid and inexpensive
route to gene discovery, reveal expression and regulation data, highlight gene sequence diversity and
splicing, and may identify more than half of the genes of an organism. Unfortunately, most EST data
remain unprocessed and thus do not yield the high-value consensus sequence information that they
contain. The low-quality sequence data can be much improved; obtaining quality information
requires pre-processing, clustering and post-processing of the results. The steps for EST processing
are given below.
Exercise 1. Collection of ESTs.
NCBI dbEST is a division of GenBank that contains sequence data and other information on
"single-pass" cDNA sequences, or "Expressed Sequence Tags", from a number of organisms. To
download all the EST sequences of an organism of interest, type its scientific name in the text box;
the available EST sequences for that organism are then listed.
1. Go to the NCBI site (http://www.ncbi.nlm.nih.gov/), select the EST database (dbEST) and type
Phytophthora capsici in the text box.
2. Observe the results and download the FASTA-format file for analysis.
(FASTA is a text-based format for representing nucleotide or peptide sequences, in which
bases or amino acids are represented by single-letter codes. Each sequence is preceded by a
header line that begins with '>' and gives a name and/or a unique identifier for the sequence,
optionally followed by comments.)
3. Selecting FASTA as the display format provides the sequences in FASTA format.
4. Selecting the File option downloads all the ESTs in a single file.
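The FASTA layout described in step 2 is easy to read programmatically, which is a quick way to check how many ESTs were downloaded and how long they are. The sketch below is a minimal parser written for illustration; the record names in the example are hypothetical, not real dbEST accessions.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {identifier: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first word after '>'; the rest is a comment.
            fields = line[1:].split()
            name = fields[0] if fields else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


if __name__ == "__main__":
    # Hypothetical example records.
    example = """>EST_0001 example clone A
ACGTTGCA
ggatcc
>EST_0002
TTGACA
"""
    ests = parse_fasta(example)
    print("Number of ESTs:", len(ests))
    for identifier, sequence in ests.items():
        print(identifier, len(sequence))
```

To use it on a downloaded file, read the file contents into a string and pass that string to `parse_fasta`.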
Exercise 2: Vector Screening
Downloaded ESTs may be contaminated with vector and poly(A) tail sequences, which must be
removed to avoid errors during annotation. The vector screening step shows whether your EST
sequences contain vector contamination.
1. Go to http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html to screen for vector
contamination.
2. Copy and paste your FASTA file into the text box.
3. Click the Run VecScreen button.
4. Look for similarities to vector sequences in the BLAST results. If similarities are found, delete
the matching sequence from the FASTA file.
5. Use the preprocessed file for further analysis.
Exercise 3: TrimEST
After vector screening, use the TrimEST tool to trim the poly(A) tails present in the ESTs.
1. Go to the website
http://inn.weizmann.ac.il/cgi-bin/EMBOSS/emboss.pl?_action=input&_app=trimest
to remove poly(A) tails.
2. Browse and choose your FASTA file, or paste your EST sequence.
3. Review the options, adjust them as needed and click Run trimEST. The output file opens
immediately.
4. For larger sequence sets, submit your e-mail ID so that you are notified by mail when the job
is finished.
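The idea behind poly(A) trimming can be sketched in a few lines: remove a run of A's from the 3' end and, for ESTs read from the opposite strand, a run of T's from the 5' end. This is a simplified illustration, not the EMBOSS implementation; it requires a perfect run and does not tolerate mismatches, and the minimum run length is an illustrative parameter.

```python
import re


def trim_poly_a(sequence, min_length=4):
    """Strip a 3' poly(A) run and a 5' poly(T) run of at least min_length bases."""
    sequence = sequence.upper()
    # 3' poly(A) tail.
    sequence = re.sub(r"A{%d,}$" % min_length, "", sequence)
    # 5' poly(T) stretch, i.e. a tail seen on the complementary strand.
    sequence = re.sub(r"^T{%d,}" % min_length, "", sequence)
    return sequence


if __name__ == "__main__":
    for est in ("GATTACAGGCAAAAAAAA", "TTTTTTGATC", "GATTACA"):
        print(est, "->", trim_poly_a(est))
```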
Exercise 4: TrimSeq
TrimSeq trims ambiguous bits off the ends of sequences. Specifically, it removes all gap
characters from the ends, removes X's and N's (in nucleic sequences) from the ends, optionally
removes *'s from the ends, and optionally removes IUPAC ambiguity codes from the ends (B and Z in
proteins; M, R, W, S, Y, K, V, H, D and B in nucleic sequences). It then optionally trims off
poor-quality regions from the ends, using a threshold percentage of unwanted characters in a window
that is moved along the sequence from the ends. The unwanted characters are X's and N's
(in nucleic sequences), optionally *'s, and optionally IUPAC ambiguity codes.

1. Go to the website http://mobyle.pasteur.fr/cgi-bin/MobylePortal/portal.py?form=trimseq.
2. Select the file option, browse and choose your file, and click the RUN button to trim the sequence.
3. Save the TrimSeq output file for further analysis.
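The first part of the trimming described above, removing gap characters, N's and X's (and optionally IUPAC ambiguity codes) from both ends of a nucleotide sequence, can be sketched as follows. This illustration does not reproduce the windowed quality trimming that TrimSeq also offers, and the function and parameter names are assumptions.

```python
GAP_CHARACTERS = "-."
UNKNOWN_BASES = "NX"
IUPAC_AMBIGUITY = "MRWSYKVHDB"  # nucleotide ambiguity codes listed above


def trim_ends(sequence, strict=False):
    """Remove unwanted characters from both ends of a nucleotide sequence.

    With strict=True, IUPAC ambiguity codes at the ends are removed as well.
    """
    unwanted = GAP_CHARACTERS + UNKNOWN_BASES
    if strict:
        unwanted += IUPAC_AMBIGUITY
    return sequence.upper().strip(unwanted)


if __name__ == "__main__":
    print(trim_ends("NNNNACGTRACGTNN--"))
    print(trim_ends("NRACGTYN", strict=True))
```

Only the ends are affected; an ambiguity code inside the sequence is left in place.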
Exercise 5. Repeat masking
Repeat masking is an optional step; it masks the repeated regions of ESTs, which may cause
problems for the clustering algorithm during EST assembly.
1. RepeatMasker screens DNA sequences in FASTA format against a library of repetitive
elements and returns a masked query sequence ready for database searches.
2. Go to the website http://repeatmasker.org and, under Services, click RepeatMasking.
3. Browse and choose your FASTA file, or paste your EST sequence.
4. Select wublast as the search engine and choose the DNA source.
5. The result output will be in HTML format.
6. This repeat-masked file is used for clustering.
Exercise 6. Clustering and Assembling the ESTs
A cluster is a set of fragmented EST data (DNA or protein) and, if known, gene sequence data,
consolidated, placed in correct context and indexed by gene, such that all expressed data concerning
a single gene are in a single index class and each index class contains the information for only one
gene. Clustering refers to assembling the sequences in the order in which they occur in the genome
of the organism.
1. Go to the CAP3 website (http://mobyle.pasteur.fr/cgi-bin/MobylePortal/portal.py?form=cap3) and
type your e-mail ID.
2. Paste the EST sequences in the text box or upload the file by selecting the file option.
3. Click the run button.
4. CAP3 will assemble the EST sequences into contigs and singletons.
5. Once the analysis is complete, save the contigs file and the singletons file. These contigs and
singletons are used for further analysis.
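The core idea of assembly, joining two reads whose ends overlap, can be shown with a toy example. The sketch below only handles exact suffix-prefix matches between two sequences; CAP3 itself uses alignment that tolerates mismatches and considers many reads at once, so this is an illustration of the concept and not of the CAP3 algorithm. The minimum overlap length is an assumed parameter.

```python
def overlap_length(left, right, min_overlap=10):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_pair(left, right, min_overlap=10):
    """Merge two overlapping reads into one contig, or return None."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


if __name__ == "__main__":
    shared = "GATCCGATTACA"
    read_1 = "ACGTACG" + shared
    read_2 = shared + "GGTTCCA"
    print("Overlap:", overlap_length(read_1, read_2))
    print("Contig: ", merge_pair(read_1, read_2))
```

Reads for which no partner overlaps by at least the minimum length would remain as singletons.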
EST Annotation:
Genome or EST annotation is the process of attaching biological information to sequences. It
consists of two main steps:
1. Identifying elements on the genome, a process called gene prediction, and
2. Attaching biological information to these elements.
The basic level of annotation uses BLAST to find similarities and then annotates genomes based on
them. Nowadays, however, more and more additional information is added to the annotation
platform. This additional information allows manual annotators to resolve discrepancies between
genes that are given the same annotation. Some databases use genome context information, similarity
scores, experimental data and integration of other resources to provide genome annotations through
their Subsystems approach. Other databases (e.g. Ensembl) rely on curated data sources as well
as a range of different software tools in their automated genome annotation pipeline.
Structural annotation consists of the identification of genomic elements.
- ORFs and their localization
- gene structure
- coding regions
- location of regulatory motifs
Functional annotation consists of attaching biological information to genomic elements.
- biochemical function
- biological function
- regulation and interactions involved
- expression
These steps may involve both biological experiments and in silico analysis. A variety of software
tools have been developed to permit scientists to view and share genome annotations.
Steps
Exercise 1. The clustered EST sequences obtained from the EST clustering and assembly step are
used for annotation.
Exercise 2. Annotation of ESTs using BLAST
1. Go to the BLASTX site (searches a protein database using a translated nucleotide query)
http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Translations&PROGRAM=blastx&BLAST_PROGRAMS=blastx&PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on
and paste your contig sequences for the BLAST search.
2. Enter Phytophthora as the organism and keep the default parameters.
3. Observe the search results.
4. In the BLAST results, follow the Gene Ontology (GO) link and observe the function of the gene.
5. Prepare a table for the functional annotation.
6. Interpret your results.
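BLASTX searches a protein database with a translated nucleotide query, which means the query is translated in all six reading frames (three on each strand) before comparison. The sketch below shows what that translation step looks like using the standard genetic code; it is an illustration only and is not how BLAST is implemented. Codons containing ambiguous bases are written as "X".

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): amino_acid
               for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def translate(sequence):
    """Translate complete codons from the first base; '*' marks a stop codon."""
    sequence = sequence.upper()
    return "".join(CODON_TABLE.get(sequence[i:i + 3], "X")
                   for i in range(0, len(sequence) - 2, 3))


def six_frame_translation(sequence):
    """Return translations for frames +1, +2, +3, -1, -2 and -3."""
    frames = {}
    strands = (("+", sequence.upper()), ("-", reverse_complement(sequence)))
    for sign, strand in strands:
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCAAGTAA").items():
        print(frame, protein)
```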
Exercise 3. Annotation of ESTs using ESTEXPLORER
1. Go to the ESTEXPLORER web server and observe the interface for EST analysis
(http://estexplorer.els.mq.edu.au/estexplorer/main_page.php).
2. Select Phytophthora as the organism.
3. Check EST sequences and upload your data.
4. Tick PHASE I, PHASE II and PHASE III.
5. Provide your name and e-mail ID.
6. Click the Process data button.
7. The server provides a request ID, which is used to check the status of the job and view the results.













SEQUENCE-BASED MARKER DESIGNING
Rajesh M.K.¹ and Senthil Kumar R.²
¹ Central Plantation Crops Research Institute, Kasaragod 671124, Kerala
² Indian Institute of Spices Research, Appangala, Karnataka

Introduction
The advent of next-generation sequencing (NGS) has revolutionized genomic and
transcriptomic approaches to biology. These new sequencing tools are also valuable for the discovery,
validation and assessment of genetic markers in populations. Restriction enzymes have been a core
tool for marker discovery and genotyping for decades, ever since the development and use of RFLPs.
The diversity of restriction enzymes available (which vary in the length, symmetry or GC versus AT
bias of their recognition sites, and also in their methylation-sensitivity) makes them an extremely
versatile assay tool. Their flexibilities allow researchers to customize marker discovery approaches to
individual projects.

Methodology
1. Sequencing of the genome and comparison with the reference genome.
When a reference genome sequence is available, sequence reads produced by any of the
sequencing technologies can be aligned and positioned on a physical map. The higher the
quality of the reference genome assembly, the easier it is to impute missing genotypes, thus
reducing the coverage that is required to genotype each individual. Reference genomes can
also be used to design marker discovery experiments by simulating in silico the number of
markers produced by different enzymes. Challenges arise when a reference genome sequence is
not available, or when a reference sequence is available but is poorly assembled, comes from
a distantly related taxon, or is large and highly repetitive.

2. SHORE analysis of the two genomes
SHORE, for Short Read, is a mapping and analysis pipeline for short DNA sequences
produced on an Illumina Genome Analyzer. It is designed for projects whose analysis strategy
involves mapping of reads to a reference sequence. This reference sequence does not
necessarily have to be from the same species, since weighted and gapped alignments allow
for accuracy even in diverged regions. The reads of the newly sequenced genome are aligned
to the reference genome to detect SNPs.

3. Retrieval of sequences around the region of SNP
Sequences of about 500 bp around each SNP are retrieved from the two genome sequences.

4. Detection of restriction enzyme sites within the two sequences
The two sequences are compared to identify restriction enzyme sites that are present in only
one of them.

5. Design of primers for amplification of the sequence of interest
Primers are designed for amplification of the sequence of interest using primer design
software.
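Step 4 of the methodology can be sketched as a simple comparison: scan both retrieved sequences for the recognition sites of a panel of enzymes and report the enzymes whose sites differ between the two. The sketch assumes the two sequences are already aligned and of equal length (they differ only by substitutions), and the four-enzyme panel is a small illustrative selection; all four recognition sites are palindromic, so scanning one strand is sufficient.

```python
# Minimal sketch: find restriction sites that distinguish two aligned sequences.
# Illustrative enzyme panel; extend it as needed.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
}


def site_positions(sequence, site):
    """Return 1-based start positions of every occurrence of `site`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)
    return positions


def differential_sites(sequence_a, sequence_b, enzymes=ENZYMES):
    """Return {enzyme: (positions_in_a, positions_in_b)} where the two differ."""
    differences = {}
    for name, site in enzymes.items():
        positions_a = site_positions(sequence_a, site)
        positions_b = site_positions(sequence_b, site)
        if positions_a != positions_b:
            differences[name] = (positions_a, positions_b)
    return differences


if __name__ == "__main__":
    reference = "ACGTGAATTCAGGCTA"
    sample = "ACGTGAGTTCAGGCTA"  # A to G substitution at position 7
    print(differential_sites(reference, sample))
```

An enzyme reported here cuts only one of the two alleles, so primers flanking the site (step 5) would give fragments of different sizes after digestion.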

Annexure I
General Conversion Tables and Formulae

Common Conversions of Nucleic Acids and Proteins
Weight conversion
1 µg = 10^-6 g
1 ng = 10^-9 g
1 pg = 10^-12 g
1 fg = 10^-15 g

Spectrophotometric conversion
1 A260 unit of double-stranded DNA = 50 µg/ml
1 A260 unit of single-stranded DNA = 33 µg/ml
1 A260 unit of single-stranded RNA = 40 µg/ml
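These factors convert an absorbance reading at 260 nm into a concentration in µg/ml: multiply the A260 value by the factor for the nucleic acid type and by the dilution factor of the measured sample. A minimal sketch, with illustrative names:

```python
# µg/ml per A260 unit, from the spectrophotometric conversion list above.
A260_FACTOR = {"dsDNA": 50.0, "ssDNA": 33.0, "ssRNA": 40.0}


def concentration_ug_per_ml(a260, nucleic_acid="dsDNA", dilution_factor=1.0):
    """Concentration of the undiluted sample in µg/ml."""
    return a260 * A260_FACTOR[nucleic_acid] * dilution_factor


if __name__ == "__main__":
    # Example: dsDNA diluted 1:100 and reading A260 = 0.25.
    print(concentration_ug_per_ml(0.25, "dsDNA", dilution_factor=100))
```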

DNA molar conversions
1 µg of 1,000 bp DNA = 1.52 pmole (3.03 pmoles of ends)
1 pmole of 1,000 bp DNA = 0.66 µg

Protein molar conversion
100 pmoles of 100,000 dalton protein = 10 µg
100 pmoles of 50,000 dalton protein = 5 µg
100 pmoles of 10,000 dalton protein = 1 µg

Protein/DNA conversion
1 kb of DNA = 330 amino acids of coding capacity ≈ 3.7 x 10^4 dalton protein
10,000 dalton protein = 270 bp DNA
50,000 dalton protein = 1.35 kb DNA
100,000 dalton protein = 2.7 kb DNA
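The molar conversions above can be generalized to any fragment length or protein size. The sketch below assumes an average mass of 660 daltons per base pair and 110 daltons per amino acid residue; these averages are assumptions that reproduce the tabulated values to within rounding, and the function names are illustrative.

```python
DALTONS_PER_BASE_PAIR = 660.0   # assumed average for double-stranded DNA
DALTONS_PER_AMINO_ACID = 110.0  # assumed average residue mass


def dna_pmol(mass_ug, length_bp):
    """Picomoles of a double-stranded DNA fragment in mass_ug micrograms."""
    return mass_ug * 1e6 / (length_bp * DALTONS_PER_BASE_PAIR)


def dna_ug(pmol, length_bp):
    """Micrograms corresponding to pmol of a double-stranded DNA fragment."""
    return pmol * length_bp * DALTONS_PER_BASE_PAIR / 1e6


def protein_ug(pmol, protein_daltons):
    """Micrograms corresponding to pmol of a protein of the given mass."""
    return pmol * protein_daltons / 1e6


def coding_bp_for_protein(protein_daltons):
    """Approximate length of coding DNA (bp) for a protein of the given mass."""
    return 3 * protein_daltons / DALTONS_PER_AMINO_ACID


if __name__ == "__main__":
    print("1 µg of 1,000 bp DNA (pmol):", round(dna_pmol(1, 1000), 2))
    print("pmol of ends:", round(2 * dna_pmol(1, 1000), 2))
    print("1 pmol of 1,000 bp DNA (µg):", dna_ug(1, 1000))
    print("100 pmol of 50 kDa protein (µg):", protein_ug(100, 50000))
    print("Coding DNA for 10 kDa protein (bp):", round(coding_bp_for_protein(10000)))
```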