Guest lecture: Dr Dan Neafsey (Group leader, Malaria Genome Sequencing and Analysis,
Broad Institute)
Scribe notes: Shannon Tunney 15, Lloyd Mccarthy 15, Meng Lai Nicole Wong 14,
Devanshi Patel 13
Neutrality: the difference between populations is selectively neutral (i.e. no selection)
Topics:
1. Understanding neutral variation in populations
2. Model-based tests of selection
3. Empirical tests of selection
Understanding neutral variation in populations
Population Genetics: the study of the frequency and interaction of alleles and
genes in populations
Population Genomics: large-scale comparison of DNA sequences of populations
o 1960s: the field was largely theoretical, with very little data available
o Present day: sequencing is now easy and affordable, leading to huge data
sets and the ability to use genetic variation clinically, which is useful for health
services and bioengineering
o There are more than 1200 human genome-wide association studies on over
200 traits and diseases, involving 100,000s of subjects; data has made a
big difference
Mutation + Selection = Evolution
o Objective of tests: what is the relative contribution of each to maintaining
variation in a population?
H0: Variation profile is neutral
H1: Variation profile is not neutral, possibly caused by selection
Early critic of Darwin: Fleeming Jenkin
o Blending inheritance: a widespread hypothetical model (at the time) which
held that the traits of each parent blend, yielding offspring
with traits intermediate between those of the parents
Problem with the theory: eventually everything would be blended and
there would be no variation among offspring. Natural selection cannot
work under this model, because it needs variation to act on.
o Gemmules: packets of substance in an individual that can blend with other
packets in offspring
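The loss of variation under blending can be illustrated with a small simulation. This is a minimal sketch, not something from the lecture: it assumes random mating, a constant population size, and that each offspring's trait is exactly the average of its two parents' traits. The function name `blend_generations` and all parameter values are hypothetical.

```python
import random
import statistics

def blend_generations(traits, generations, seed=0):
    """Return the trait variance before and after each generation of blending.

    Assumptions (illustrative only): random mating, constant population size,
    and each offspring's trait is the exact average of two randomly chosen parents.
    """
    rng = random.Random(seed)
    variances = [statistics.pvariance(traits)]
    for _ in range(generations):
        traits = [
            (rng.choice(traits) + rng.choice(traits)) / 2
            for _ in range(len(traits))
        ]
        variances.append(statistics.pvariance(traits))
    return variances

# Hypothetical starting population: 1000 individuals with a normally distributed trait.
rng = random.Random(1)
start = [rng.gauss(0.0, 1.0) for _ in range(1000)]
variances = blend_generations(start, generations=10)
print(f"initial variance: {variances[0]:.3f}")
print(f"variance after 10 generations: {variances[-1]:.5f}")
```

Under these assumptions the variance roughly halves every generation, so after a handful of generations there is almost nothing left for selection to act on.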
Mendelian inheritance
o Law of Segregation: traits come in alternative forms (allelic variation);
offspring receive 1 allele from each parent; alleles can be dominant or recessive;
and the parental alleles segregate when gametes are formed
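Probability that a sample of n gene copies contains k alleles and that there
are a_1, a_2, ..., a_n alleles represented 1, 2, ..., n times in the sample (the Ewens sampling formula):
$$P(k;\, a_1, \ldots, a_n) = \frac{n!\,\theta^{k}}{1^{a_1} 2^{a_2} \cdots n^{a_n}\; a_1!\, a_2! \cdots a_n!\; S_n(\theta)}$$
where $\theta = 4N\mu$, $S_n(\theta) = \theta(\theta+1)\cdots(\theta+n-1)$, and a_j is the number of alleles
found in j copies
This is the probability of observing a given profile of polymorphism under neutrality,
given the level of variability $\theta$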
o If variation is neutral, most genetic variation is expected to be singleton
variation (alleles seen only once in the sample)
o Example: the malaria parasite genome is AT-rich (81%)
Categorize mutations as AT→CG or CG→AT
Which category is driven to fixation more quickly: CG→AT or AT→CG?
Selection is pushing back against the amount of AT in the genome
Mutations toward AT (CG→AT) are unlikely to be fixed
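The sampling probability described earlier in this section can be evaluated directly. The sketch below assumes the standard Ewens sampling formula, P = n! θ^k / (∏ j^{a_j} a_j! · S_n(θ)) with S_n(θ) = θ(θ+1)...(θ+n-1), which matches the description in these notes; the function names and the value θ = 2 are illustrative choices, not taken from the lecture.

```python
from math import factorial, prod

def rising_factorial(theta, n):
    """S_n(theta) = theta * (theta + 1) * ... * (theta + n - 1)."""
    return prod(theta + i for i in range(n))

def ewens_probability(counts, theta):
    """Probability of an allele-count profile under the Ewens sampling formula.

    counts maps j -> a_j, the number of distinct alleles seen in exactly j copies.
    theta is the scaled mutation rate (4*N*mu for a diploid population).
    """
    n = sum(j * a for j, a in counts.items())  # sample size in gene copies
    k = sum(counts.values())                   # number of distinct alleles
    denom = prod(j ** a * factorial(a) for j, a in counts.items())
    return factorial(n) * theta ** k / (denom * rising_factorial(theta, n))

# The three possible profiles for a sample of n = 3 gene copies.
profiles = {
    "three singletons": {1: 3},
    "one singleton and one doubleton": {1: 1, 2: 1},
    "one allele in three copies": {3: 1},
}
theta = 2.0  # hypothetical value
for name, counts in profiles.items():
    print(f"{name}: {ewens_probability(counts, theta):.4f}")
```

Because the three profiles are the only ways to partition three gene copies into alleles, their probabilities should sum to 1, which is a quick sanity check on the formula.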
Coalescence
o Alternate, backwards approach to generating expected allele frequency
distributions
o Attempts to trace all alleles of a gene shared by all members of a population
to a single ancestral copy, known as the most recent common ancestor
o Infer the tree structure (genealogy), because tree structure dictates the pattern of
polymorphism in the data. The expected time back to the common ancestor is about 4N
generations (i.e. the time to coalescence of a population is 4N generations)
Studies how far back in time a sample shared a common ancestor
o P(coalescence) = 1/(2N), the probability that two gene copies share a parent copy
in the previous generation
μ = mutation rate
G is a genealogy
The probability of seeing only a 3-SNP difference is really small, because
we expect to see more than 3
Coalescence lets you draw inferences about selection
A pair of lineages can only coalesce once, and this can only happen after k mutations
Bigger population: greater chance that lineages diverge from each other
Small population: not many opportunities for divergence, so ancestors are
close
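The link between the per-generation coalescence probability 1/(2N) and the 4N-generation figure can be worked through numerically. This is a sketch under the standard neutral coalescent approximation (sample size much smaller than N, at most one coalescence per generation); the function names and the value N = 1000 are hypothetical.

```python
def pair_coalescence_prob(N):
    """Probability that two gene copies coalesce in one generation (diploid size N)."""
    return 1.0 / (2 * N)

def expected_tmrca(n, N):
    """Expected generations back to the most recent common ancestor of n gene copies.

    While k lineages remain, any of the k*(k-1)/2 pairs may coalesce, each with
    probability 1/(2N) per generation, so the expected wait is 4N / (k*(k-1)).
    """
    return sum(4 * N / (k * (k - 1)) for k in range(2, n + 1))

N = 1000  # hypothetical diploid population size
print(f"P(pair coalesces in one generation) = {pair_coalescence_prob(N)}")
for n in (2, 10, 100):
    print(f"n = {n}: expected TMRCA = {expected_tmrca(n, N):.1f} generations")
```

The sum telescopes to 4N(1 - 1/n), so two gene copies are expected to coalesce after 2N generations, and a large sample approaches 4N generations. It also shows why a bigger population gives lineages more time to diverge.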
Turning neutral models into tests of neutrality
Three polymorphism summary statistics
S = number of segregating sites in the sample
variables
Frequency-based neutrality tests (Tajima)
Negative D. Example: one malaria parasite becomes drug resistant and gains a huge
fitness advantage; the sweep that follows pushes up the frequency of singletons.
This is an example of positive selection.
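To make the sign of D concrete, the statistic can be computed on toy data. The sketch below assumes the standard definition of Tajima's D, which compares the mean number of pairwise differences (π) with S/a1 and scales the difference by its estimated standard deviation. The function name `tajimas_d` and the 0/1 haplotype strings are hypothetical, invented purely for illustration.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings (standard definition)."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance term is zero below that)")
    length = len(haplotypes[0])
    # S: number of sites where the sample is not monomorphic
    S = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    if S == 0:
        raise ValueError("no segregating sites")
    # pi: mean number of differences over all pairs of sequences
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(x != y for x, y in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    # Normalizing constants from the standard formula
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Toy sample where every variant is a singleton (as after a recent sweep).
singletons = ["1000", "0100", "0010", "0001"]
# Toy sample where every variant is at intermediate frequency.
intermediate = ["1111", "1111", "0000", "0000"]
print(f"all singletons: D = {tajimas_d(singletons):.2f}")
print(f"intermediate frequencies: D = {tajimas_d(intermediate):.2f}")
```

An excess of singletons lowers π relative to S/a1 and gives a negative D, while an excess of intermediate-frequency variants gives a positive D.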
Balancing Selection