Seth C. Murray
Assistant Professor, Quantitative Genetics and Maize Breeding
TAMU Plant Breeding Roundtable, 09/10/10
Overview

BIG PICTURE: Why Crop Improvement and Genetic Diversity
- Understand Genetics for Crop Improvement
- Review of Genetic Variation - Focus on Gene (Point) Mutations

FOCUS: What is a (Molecular) Marker and How Does it Help Characterize Diversity?
- What are Morphological Markers?
- What are Molecular Markers?
  - Restriction Fragment Length Polymorphisms
  - Polymerase Chain Reaction - SSRs - SNPs - Sequence Based

FOCUS: What is a QTL and How Does it Help us to Characterize and Use Diversity?
- What is a Quantitative Trait Locus (QTL)?
- How do you perform QTL mapping?
- What is the difference between a QTL and a gene?

DISCUSSION: Using QTL for Crop Improvement
- Crop Improvement via Linked Loci
- Crop Improvement via specific genes
- Transgenics
Crop Landraces
Wild Species
[Figure: examples of gene (point) mutations: insertion, deletion, insertion, and a C→G transversion]
What is a Marker?
Webster's Dictionary defines a marker as: something that serves to identify, predict, or characterize [the GENETIC VARIATION present].
- Morphological (phenotypic) markers - A trait you can observe and/or measure as different between two individuals (must be heritable, genetic). (Example: corn mutants)
- Genetic (molecular, DNA) markers - A measurable DNA mutation which may or may not have an effect on the phenotype (also must be heritable, genetic).
Molecular markers are much more common than phenotypic markers. Most gene (point) mutations do not result in phenotypic changes.
www.cals.cornell.edu/.../images/mutant-corn.jpg
http://www.animalgenome.org/edu/QTL/Julius_notes/05_linkagemap.PDF
-Developing the first morphological (phenotypic) markers and linkage maps - Corn mutants - Chromosome 4 mutant linkage map
Burnham, Beadle (Nobel Prize in 1958), Rhoades, Emerson, McClintock (Nobel Prize in 1983)
PCR (Polymerase Chain Reaction)
- Very little DNA needed
- AFLPs
- SSRs
- Sequencing and SNPs
Restriction Digests for RFLPs DNA Strand Restriction Enzyme Cuts Specific DNA Patterns
G/AATTC 80kbp - kilobase pairs 50kbp G/AATTC 10kbp
Digested DNA
DNA standard
100kbp 50kbp 20kbp 10kbp
Restriction Fragment Probes Radioactive probe that binds to specific DNA sequence GGCCTTAATTCCGG
GGCCTTAATTCCGG 80kbp G/AATTC 50kbp G/AATTC 10kbp
Measurable Mutations!
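The restriction digest idea above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the two toy sequences are made up (they are not from the slides), the enzyme is assumed to cut G/AATTC after the first G, and the DNA is treated as a single linear strand. It shows how one point mutation inside a recognition site changes the fragment pattern, which is the "measurable mutation" an RFLP detects.

```python
import re

# Recognition sequence G/AATTC: the enzyme cuts between G and AATTC.
RECOGNITION_SITE = "GAATTC"
CUT_OFFSET = 1  # cut falls one base into the site


def digest(sequence, site=RECOGNITION_SITE, cut_offset=CUT_OFFSET):
    """Return fragment lengths after cutting a linear sequence at every site."""
    cut_positions = [m.start() + cut_offset for m in re.finditer(site, sequence)]
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


# Hypothetical toy alleles (invented for illustration only).
allele_a = "ATGC" * 5 + "GAATTC" + "CCGG" * 3 + "GAATTC" + "ACGT" * 2
# One point mutation (A -> C) destroys the first restriction site.
allele_b = allele_a.replace("GAATTC", "GACTTC", 1)

print("Allele A fragments:", digest(allele_a))
print("Allele B fragments:", digest(allele_b))
```

Allele A is cut twice and gives three fragments, while allele B has lost one site and gives two, so the two alleles would show different band patterns on a gel.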
Annealing
Extension
Multiple rounds of denaturation-annealing-extension are performed to create many copies of the template DNA between the two primer sequences.
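A rough sketch of why so little starting DNA is needed: if every cycle copied every template (an idealization, real reactions are less efficient and eventually plateau), the number of copies would double each round. The function name and the `efficiency` parameter below are hypothetical, chosen only for this illustration.

```python
def pcr_copies(initial_templates, cycles, efficiency=1.0):
    """Idealized copy number after a number of PCR cycles.

    efficiency=1.0 assumes perfect doubling every cycle;
    lower values model a fraction of templates being copied per cycle.
    """
    return initial_templates * (1 + efficiency) ** cycles


for n_cycles in (10, 20, 30):
    print(n_cycles, "cycles:", pcr_copies(1, n_cycles), "copies from one template")
```

Under the perfect-doubling assumption, a single template gives roughly a thousand copies after 10 cycles and roughly a billion after 30.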
[Figure: marker traces, two labeled "Stutter" and one labeled "NICE"]
seq.mc.vanderbilt.edu/DNA/images/mma.jpg www.epibio.com/f6_1/Fig2trace.gif
Agro 643 Molecular Markers
SNPs (Single Nucleotide Polymorphisms)
- Newest, most popular marker
- Detects a single base pair (bp) mutation only
- Must find the polymorphism first by sequencing
genecodes.com/.../Var_detail_report.gif bioinformatics.utmem.edu
[Figure: SNP genotype clusters labeled aa, Aa / aA, and AA]
www.biotech.uiuc.edu
Kbiosciences systems
http://www.kbioscience.co.uk/
pipeline
http://www.genomecenter.ucdavis.edu/dna_technologies/prices.html
Illumina Golden Gate Genotyping
- Bead Array, 96 SNPs (per sample)
- Bead Array, 384 SNPs (per sample)
- Bead Array, 768 SNPs (per sample)
- Bead Array, 1536 SNPs (per sample)
- BeadXpress, 96 SNPs (per sample)
- BeadXpress, 384 SNPs (per sample)
- 1536 SNP bead chip, 16 samples
- 1536 SNP bead chip, 32 samples
http://www.sequenom.com/
http://www.illumina.com/
Agro 643 MAS and Genomic Selection Genotyping Platforms
Dr. Patricia Klein will be speaking on her work in this area here on Oct. 1st!
Amber
Labate, J., K.R. Lamkey, M. Lee, and W.L. Woodman. 1999. Population genetics of increased hybrid performance between two maize populations under reciprocal recurrent selection. p. 127-137. In J. Coors and S. Pandey (ed.) Genetics and Exploitation of Heterosis in Crops, CIMMYT, Mexico City. 17-22 Aug. 1997. ASA, Madison, WI.
Agro 643 - Relationships and Genetic Diversity Measurements and Visualizations of Genetic Diversity
What is a Quantitative Trait Locus (QTL)?
A statistically significant locus (not necessarily a gene) that quantitatively affects a phenotype of interest, with physical boundaries defined by linked molecular markers.
[Figure: QTL scans from composite interval mapping and single marker analysis, showing QTL positions, genetic markers, and genotype classes AA, Aa, aa]
What is the plural of QTL? Quantitative Trait Loci (still abbreviated QTL), but "QTLs" is also used to draw attention to the fact that there is more than one.
QTL and QTL mapping
What do we need to map QTL?
- A controlled segregating population
  - Heritable variation in the population is necessary; phenotypic variation in the parents is NOT (think of transgressive segregation: parents with different genes for height can look the same phenotypically)
- Phenotypic data
- A molecular marker based linkage map
- Recombination and linkage disequilibrium
What is the mapping strategy? (simple overview)
- Separate the progeny by marker state and test the difference in phenotypic value for significance (t-test, ANOVA, regression)
- A significant difference is indicative of a marker linked to a QTL
- The difference between the mean values of the separated progeny classes is an estimate of the QTL effect
- Replicate and test across environments to:
  - Minimize error variance
  - Identify QTL that are consistently expressed
- QTL expressed in only one (rare) environment are of little use, except when preparing for a stress expected to become more common
Single marker QTL analysis (F2)
Simplest Case of a Perfect Marker: Basic Regression
- Code genotypic data (Parent 1 alleles = 0, Parent 2 alleles = 1)
- Missing genotypes get treated as the mean probability of both parents (0.5 for F2 or RILs, 0.75 for backcross 1)
- Create genetic map (not necessary for the most basic test)
- Prepare phenotypic data (BLUPs, location means, transform to normality)
- Regress phenotypes on genotypes (same result as t-test, ANOVA)
- A significant marker effect means the marker is likely linked to a QTL
- The estimate of the regression slope = estimate of the QTL effect
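The steps above can be sketched with a small regression written in plain Python. This is a minimal sketch, not a replacement for QTL software. It assumes one reading of the coding rule: each genotype is scored as the fraction of Parent 2 alleles (AA = 0, Aa = 0.5, aa = 1), which is consistent with the 0.5 value for missing F2 calls. The genotype calls and heights are invented for illustration, and the function names are hypothetical.

```python
from statistics import mean

# Assumed coding: fraction of Parent 2 alleles at the marker.
CODES = {"AA": 0.0, "Aa": 0.5, "aa": 1.0}
MISSING_F2 = 0.5  # expected code for a missing call in an F2


def code_genotypes(calls):
    """Convert genotype calls to numeric codes; None marks a missing call."""
    return [MISSING_F2 if call is None else CODES[call] for call in calls]


def single_marker_regression(x, y):
    """Least-squares regression of phenotype y on marker code x.

    Returns (intercept, slope, t statistic of the slope).
    Requires some residual variation, otherwise the t statistic is undefined.
    """
    n = len(x)
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = (residual_ss / (n - 2) / sxx) ** 0.5
    return intercept, slope, slope / se_slope


# Hypothetical F2 data: marker calls and plant height in cm.
calls = ["AA", "AA", "AA", "Aa", "Aa", "Aa", "Aa", None, "aa", "aa", "aa"]
heights = [150, 155, 148, 160, 166, 158, 163, 161, 172, 175, 169]

intercept, slope, t_stat = single_marker_regression(code_genotypes(calls), heights)
print(f"intercept={intercept:.1f} cm, slope={slope:.1f} cm, t={t_stat:.2f}")
```

With this 0 to 1 coding, the slope estimates the difference between the two homozygous classes (twice the additive effect, if the marker acts additively). The t statistic would be compared against a t distribution with n - 2 degrees of freedom to decide significance.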
[Figure: height (cm) by marker genotype class: AA, Aa, aa]
QTL and QTL mapping
Five primary types of QTL mapping, with increasing complexity and (theoretically) power:
- Single marker analysis
- Interval mapping (IM)
- Composite interval mapping (CIM)
- Multiple interval mapping (MIM)
- Bayesian (Hidden Markov Model)
- Others that are more rare
Variety of programs for QTL mapping (only free software):
- QTL Cartographer - Command line
- WinQTL Cartographer - Nicest GUI - Less up to date than QTL Cartographer
- MapQTL5 - Nice GUI
- PLABQTL - Command line
- R/QTL - Command line / Most flexible - Offers Bayesian (most technically complex)
R/QTL - for more, Brian Yandell keeps a great reference at: http://www.stat.wisc.edu/~yandell/statgen/reference/software.html
F2/ F3
Good
- Quick to create
- Can estimate both additive and dominance effects
Bad
- Lower power (more unknowns, especially with dominant markers)
- Not immortalized: the genetic map is only good for that generation
- Limited to no ability to replicate (environments, replicates)
- Limited recombination
Doubled Haploid
Good
- Quick to create
- Immortalized and easily replicated and shared
Bad
- Limited recombination
- Can be difficult and expensive
- Can only look at additive effects (no heterozygotes)
Backcross
Good
- Can be combined with trait introgression breeding
- Moderate recombination
Bad
- Difficult to replicate unless further inbred
- Cannot separate additive from dominance effects (no donor parent recessive homozygotes)
Population derived from an Elite x Elite cross (Only progeny must segregate)
- Improvement may come only from transgressive segregation
Agro 643 QTL Mapping Types of Populations
Population derived from an extreme low parent x extreme high parent cross (Note parents and progeny segregate)
[Figure: QTL mapping for biomass in stem and leaf tissue in College Station, TX, 2005 (Agro 643 QTL Mapping: QTL Verification, Multiple Traits). Parents: Rio and BTx623. Chromosomes 1 to 10. Traits: height (cm), flowering time, tillers, stand density, mean stem thickness, total dry biomass yield, sugar yield, Brix, grain starch, stem hemicellulose, stem lignin, stem crude protein, leaf cellulose, leaf hemicellulose, leaf lignin, leaf crude protein (stem and leaf composition as % solids).]
[Figure: QTL co-localization, linkage vs. pleiotropy. Parents: Rio and BTx623, College Station, TX, 2005. Chromosomes 1 to 10, same trait panel as above.]
QTL Meta-analysis Using 50 separate disease resistance QTL studies in maize to understand broad spectrum quantitative disease resistance
Wisser RJ, Balint-Kurti PJ, Nelson RJ (2006) The genetic architecture of disease resistance in maize: a synthesis of published studies. Phytopathology 96:120-129
QTL Meta-analysis and Candidate Genes
Leveraging 16 separate published QTL studies along with a sequenced genome helps to gain further detection power.
Wisser, R.J., Q. Sun, S.H. Hulbert, S. Kresovich, and R.J. Nelson. 2005. Identification and characterization of regions of the rice genome associated with broad-spectrum, quantitative disease resistance. Genetics 169:2277-2293.
[Figure: QTL detection power (%) versus heritability, N = 600. Utz and Melchinger, 1994]
Agro 643 QTL Mapping Sample Size and Power
Bernardo, 2004
[Figure: F1, selfed to homozygosity, giving RILs]

Perform statistical test for significance (Genotype vs. Phenotype) based on a null model:

Marker     Phenotype           Significance
RFLP 12    Height              0.0001***
AFLP 57    Grain Weight        0.051
SSR 26     Disease Resistant   0.0023**

Is this marker (AFLP 57) not important? Or did we not have enough data to reject the null hypothesis at p < 0.05?
Real Life Challenges?
In real life, if we only had five markers (L, M, N, O, P) across a chromosome, we would not capture a lot of what is going on, which can lead to reduced power and/or increased error!
[Figure: Chromosome X genotypes for individuals 1 to 9; F2s selfed to homozygosity to give RILs]
Sample Size and Power
Before asking what sample size we should use and how much detection power we expect to have, we should note the factors that influence this.
1) What is the experimental goal?
2) What is the heritability of the trait?
3) How many QTL are involved? The more QTL to detect, the more individuals and markers you will need.
4) How large a QTL effect do you want to be able to find? To detect smaller and smaller QTL effects we need an exponentially larger population because of the associated error.
5) What are the effects of the trait? Dominant, additive, over-dominant: this will affect the population you use and hence the sample size.
6) Is there any reason to believe there is epistasis? Yes! Do you want to detect it? You probably do not have the resources to.
7) Is there any reason for using a smaller than optimum sample size? Yes! Time to create the population, and money to genotype and phenotype the population.
Bernardo, R. 2004. What proportion of declared QTL in plants are false? Theor. Appl. Genet. 109:419-424.
Note that this was a simulation of an F2 population (1 environment) with 150 individuals, 100 markers, multiple regression for detection, no permutation test and α = 0.05. When the author changed any of these things the results were not so dire.
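A back-of-the-envelope sketch of why an uncorrected 0.05 threshold is risky with 100 markers. This does not reproduce the Bernardo simulation; it simply assumes every marker test is independent and that no true QTL exists. Markers on the same chromosome are linked, so the independence assumption overstates the problem somewhat, which is one reason permutation tests are used in practice. The function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of significant tests when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_markers = 100
alpha = 0.05
print("Expected false positives:", expected_false_positives(n_markers, alpha))
print("P(at least one false positive):",
      round(prob_at_least_one_false_positive(n_markers, alpha), 3))
```

Under these assumptions, about five markers would be declared significant by chance alone, and at least one false positive is nearly certain.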
Agro 643 QTL Mapping General
                                     Null hypothesis is True    Null hypothesis is False
Reject the Null Hypothesis           Type 1 Error!              (correct)
Fail to Reject the Null Hypothesis   (correct)                  Type 2 Error!

Type III error: provides the right answer to the wrong question (discrepancy between the research focus and the research question)
Stability in QTL
Most journals would not accept a QTL study with fewer than three environments. A major reason for this has to do with stability. If a QTL is only detected in one environment, it suggests it may only be useful in that one environment. A good example is photoperiod response. Suppose two flowering time QTLs are identified: one expressed only in northern latitudes (photoperiod sensitivity) and one expressed in all environments (true flowering time). Introgression of the photoperiod sensitivity QTL is likely to decrease yield stability, whereas introgressing a true flowering time QTL is likely to make the plant behave predictably.
QTL Verification
- Locus effect quantification: How large is the difference between alleles?
- Pleiotropy: Would unmeasured traits be affected? Are there negative effects?
- QTL x Environment interaction: Is there a year or environment effect? How large?
- QTL x QTL interaction: Is there epistasis that may make some QTL more or less valuable?
- Underlying gene(s): Can we, and do we want to, identify these?
Approaches for Verification
- Compare multiple traits
- Compare in multiple environments
- Develop and use independent populations
- Fine Mapping (discussed later)
- Create Near Isogenic Lines (discussed later)
- Association mapping verification (discussed later)
- Cloning & Transformation (discussed later)
MARKER A MARKER B
Identified QTL
Backcross NILs
Are there other alleles that would accumulate even more sugar?
Forward Genetics: Phenotypic Variation → QTL → Gene → Functional Polymorphism
Reverse Genetics: Gene → Functional Polymorphism → Phenotypic Variation
Cloning the gene is when we know the DNA sequence of the gene CAUSING the morphological (phenotypic) difference.
We do this by finding and mapping molecular markers closer and closer to our morphological marker.
This lets us do many neat things for both crop improvement and evolution studies but is A LOT of work! Example: Cloning the First Domestication Gene - Tomato fw2.2
Doebley JF, Gaut BS, Smith BD (2006) The molecular genetics of crop domestication. Cell. 29;127(7): 1309-21
Li, J., M. Thomson, and S.R. McCouch. 2004. Fine mapping of a grain-weight quantitative trait locus in the pericentromeric region of rice chromosome 3. Genetics 168:2187-2195.
150 plants
1000 plants
9000 plants!
Thomson, M. J., J. D. Edwards, E. M. Septiningsih, S. E. Harrington and S. R. McCouch, 2006 Substitution mapping of dth1.1, a flowering-time quantitative trait locus (QTL) associated with transgressive variation in rice, reveals multiple sub-QTL. Genetics 172:2501-2514.
Pedigree
Yu, J. et al. Genetics 2008;178:539-551 Copyright 2008 by the Genetics Society of America
Technology Needed for MAS (and Genetic Fingerprinting)
MARKERS x GENOTYPES = DATA POINTS
Most of the applications we have discussed so far (gene / polymorphism discovery) involve genotyping many markers on a small number of genotypes to cover the genome.
QTL mapping: 100-1,000 markers x 100-500 individuals = 10,000 to 500,000 data points
Association mapping: 100-1,000,000 markers x 100-7,000 individuals = 10,000 to 7,000,000,000 data points
Once the subset of useful/ important markers has been established, we now want to evaluate these over many individuals. This requires different technology to be cost efficient.
MAS: 1-100 markers x 100-10,000 individuals = 100 to 1,000,000 data points
In general this is a need only for plant and animal breeders, biotechnologists, and some people who do gene diversity studies. The technology market is therefore smaller than the one serving human geneticists and evolutionary biologists.
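The MARKERS x GENOTYPES = DATA POINTS arithmetic above can be checked with a short script. The ranges are the ones quoted for each application; the function and dictionary names are hypothetical. Note that the product of the lower bounds for MAS (1 marker x 100 individuals) is 100.

```python
def data_point_range(markers, individuals):
    """Return (min, max) data points from (low, high) marker and individual ranges."""
    return markers[0] * individuals[0], markers[1] * individuals[1]


# (markers low, high), (individuals low, high) as quoted in the text
scenarios = {
    "QTL mapping": ((100, 1_000), (100, 500)),
    "Association mapping": ((100, 1_000_000), (100, 7_000)),
    "MAS": ((1, 100), (100, 10_000)),
}

for name, (markers, individuals) in scenarios.items():
    low, high = data_point_range(markers, individuals)
    print(f"{name}: {low:,} to {high:,} data points")
```

The contrast is in the shape of the data rather than the total: discovery work needs many markers on few individuals, while MAS needs few markers on many individuals.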
Once we find a marker linked to our trait of interest (e.g. disease resistance), we can use this marker to make selections rather than screen all of the plants for disease resistance. This is called Marker Assisted Selection (MAS).
!!! NOTE: This marker is unlikely to be the point mutation or the gene that gives the disease resistance. It is only LINKED to the disease resistance gene of interest. Thus: WE DO NOT KNOW WHICH GENE CAUSES THE DISEASE RESISTANCE WITH THE MARKER, BUT WE CAN MAKE SELECTIONS FOR DISEASE RESISTANT PLANTS BASED ON THE MARKER.
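Because the marker is only linked to the gene, recombination can separate the two. A rough sketch of the consequence, under simplifying assumptions that are not from the slides: selection is on a single marker only, the marker and the gene start on the same chromosome, and each generation of selection involves one meiosis with recombination fraction r between marker and gene. The function name is hypothetical.

```python
def linkage_retained(recombination_fraction, generations=1):
    """Probability that a marker-selected plant still carries the target gene
    after the given number of meioses, selecting on the marker alone."""
    if not 0.0 <= recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must be between 0 and 0.5")
    return (1.0 - recombination_fraction) ** generations


for r in (0.01, 0.05, 0.20):
    print(f"r={r}: 1 generation {linkage_retained(r):.3f}, "
          f"3 generations {linkage_retained(r, 3):.3f}")
```

The closer the marker is to the gene (smaller r), the more reliably marker-based selection picks plants that actually carry the resistance.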