Today
- Goal of a genetic association study
- Rationale for genome-wide association studies
- Design and analysis considerations for GWAs
- Application to two clinically similar granulomatous lung diseases
Examples
- Some cancers
- Type 1 diabetes
- Type 2 diabetes
- Alzheimer disease
- Inflammatory bowel disease
- Schizophrenia
- Cleft lip/palate
- Hypertension
- Rheumatoid arthritis
- Asthma
DNA Variation
>99.9% of the sequence is identical between any two chromosomes.
- Compare the maternal and paternal chromosome 1 in a single person
- Compare the Y chromosomes of two unrelated males
Even though most of the sequence is identical between two chromosomes, the genome is so long (~3 billion base pairs) that there are still many variations.
Some DNA variations are responsible for biological changes; others have no known function.
Alleles are the alternative forms of a DNA segment at a given genetic location.
Genetic polymorphism: a DNA segment with two or more common alleles.
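[Figure: the same region on two chromosomes, differing at a single position]
Chromosome 1: A T G A C A G G C
Chromosome 2: A T G A C A T G C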
Alleles at this SNP are G and T.
SNPs are the most common form of variation in the human genome.
SNPs are catalogued in several databases.
Genotype for this individual is GT.
Haplotype: the sequence of alleles along a single chromosome.
Maternal: A T G C C A T G C
Paternal: A T G A C A T G C
- Functional variants, or those with unknown function, in candidate genes
- More general coverage of a region using many markers
Genome-wide
Test for association with hundreds of thousands (or millions) of SNPs spread across the entire genome.
Many design strategies are possible for distributing markers.
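As an illustration of what a single-SNP association test can look like, here is a minimal sketch of an allelic chi-square test on a 2x2 table of allele counts (cases vs. controls). The slides do not say which test statistic is used, so the choice of test, the function name `allelic_chi_square` and the example counts are all assumptions made for illustration. In a genome-wide study, a test of this kind would be repeated for every SNP.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Hypothetical helper for illustration only. Assumes every row and
    column total is non-zero, so no expected count is zero.
    """
    observed = ((case_a, case_b), (ctrl_a, ctrl_b))
    row_totals = (case_a + case_b, ctrl_a + ctrl_b)
    col_totals = (case_a + ctrl_a, case_b + ctrl_b)
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: allele A seen 60 times in cases and 40 times in controls.
chi2, p = allelic_chi_square(60, 40, 40, 60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Because so many SNPs are tested, the p-value from any one test has to be read against a much stricter threshold than in a single-marker study.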
If these were the only two haplotypes in the population, then alleles G and A (and likewise C and T) would be in perfect linkage disequilibrium (LD). If we genotype the first SNP, we know which allele is present at the second SNP.
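The idea of perfect LD can be made concrete with the usual pairwise LD measures D, D' and r². The sketch below computes them from haplotype counts. The slides do not name these measures, so the function name `ld_stats` and the counts are assumptions for illustration. With only the G-A and C-T haplotypes present, r² comes out as 1.

```python
def ld_stats(hap_counts, allele1, allele2):
    """Return (D, D', r^2) for two biallelic SNPs.

    hap_counts maps (allele at SNP 1, allele at SNP 2) -> haplotype count.
    allele1 and allele2 are the reference alleles at SNP 1 and SNP 2.
    Hypothetical helper; assumes both SNPs are polymorphic in the sample.
    """
    total = sum(hap_counts.values())
    p_ab = hap_counts.get((allele1, allele2), 0) / total
    p_a = sum(c for (a, _), c in hap_counts.items() if a == allele1) / total
    p_b = sum(c for (_, b), c in hap_counts.items() if b == allele2) / total

    # D: difference between the observed haplotype frequency and the
    # frequency expected if the two SNPs were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by its largest possible magnitude given allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    # r^2: squared correlation between the alleles at the two SNPs.
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Only two haplotypes exist (made-up counts): G-A and C-T.
perfect = {("G", "A"): 60, ("C", "T"): 40}
print(ld_stats(perfect, "G", "A"))

# All four haplotypes equally common: no LD.
no_ld = {("G", "A"): 25, ("G", "T"): 25, ("C", "A"): 25, ("C", "T"): 25}
print(ld_stats(no_ld, "G", "A"))
```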
- In general, LD between two SNPs decreases with physical distance.
- The extent of LD varies greatly depending on the region of the genome.
- If LD is strong, fewer SNPs are needed to capture the variation in a region.
www.hapmap.org
HapMap
Multi-country effort to identify and catalog common human genetic variants.
Developed to better understand and catalogue LD patterns across the genome in several populations.
Genotyped ~4 million SNPs on samples of African, east Asian, and European ancestry.
All genotype data are in a publicly available database.
- Can download the genotype data
- Able to examine LD patterns across the genome
- Can estimate the approximate coverage of a given SNP chip
Can represent 80-90% of common SNPs with
- ~300,000 tag SNPs for European or Asian samples
- ~500,000 tag SNPs for African samples
Confounding by Ancestry
(a.k.a. Population Stratification)
Control selection is critical, as always.
Confounding by ancestry: distortion of the relationship between the genetic risk factor and the outcome of interest, caused by ancestry that is related both to the frequency of the putative genetic risk factor and to whether a subject is a case or a control.
[Diagram labels: Ancestry; Case/Control Status]
Population Stratification
[Figure: distribution of genotypes (TT, AT, AA) in cases and in controls]
- The distribution of genotypes differs between cases and controls.
- One might conclude that allele A (or genotype AA) is related to disease.
Population Stratification
[Figure: distribution of genotypes (TT, AT, AA) in cases and in controls, shown separately for Pop 1 and Pop 2]
Population Stratification
Unequal distribution of alleles may result from:
- A sample made up of more than one distinct population
- A sample made up of individuals with differing levels of admixture
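The effect of mixing distinct populations can be shown with a small numerical sketch. All counts below are invented for illustration, and `chi_square_2x2` is a hypothetical helper, not a method from the slides. Within each population, allele A is equally frequent in cases and controls, yet pooling the two populations produces a strong apparent association. This happens because Pop 1 has both a higher frequency of A and a larger share of the cases.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table of counts.

    table = ((cases_A, cases_T), (controls_A, controls_T)).
    Hypothetical helper; assumes all row and column totals are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


# Made-up allele counts. Pop 1: A frequency 0.6, mostly cases.
pop1 = ((480, 320), (120, 80))
# Pop 2: A frequency 0.2, mostly controls.
pop2 = ((40, 160), (160, 640))

# Pool the two populations cell by cell, ignoring ancestry.
pooled = tuple(
    tuple(pop1[i][j] + pop2[i][j] for j in range(2)) for i in range(2)
)

print("Pop 1 chi2: ", chi_square_2x2(pop1))    # no association within Pop 1
print("Pop 2 chi2: ", chi_square_2x2(pop2))    # no association within Pop 2
print("Pooled chi2:", chi_square_2x2(pooled))  # large, driven by ancestry alone
```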
Q-Q Plot
If the points deviate (significantly?) from the line of equality, this indicates that the two distributions differ. Some take the point at which the observed p-values begin to depart from the expected values as the threshold for declaring statistical significance.
Important points:
- Deviation from the line can indicate violated assumptions (e.g. the existence of population stratification).
- The tails of the distribution carry less information, so a large divergence from the expected values might be required there.
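A minimal sketch of how the points of such a plot could be computed is shown below. It assumes the common convention of comparing observed p-values with the quantiles expected under a uniform null distribution, both on a -log10 scale. The function name `qq_points` and the choice of rank / (n + 1) for the expected quantiles are illustrative assumptions, not details taken from the slides.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values for a Q-Q plot.

    Hypothetical helper. Under the null hypothesis p-values are uniform on
    (0, 1), so the i-th smallest of n is expected near i / (n + 1).
    Assumes every p-value is strictly greater than 0.
    """
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected = rank / (n + 1)
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Made-up p-values that match the null expectation exactly:
# every point falls on the line of equality.
for x, y in qq_points([0.75, 0.25, 0.5]):
    print(f"expected {x:.3f}  observed {y:.3f}")

# One very small p-value: its point sits far above the line.
print(qq_points([0.75, 1e-8, 0.5])[0])
```

Plotting the observed values against the expected ones, together with the line y = x, gives the Q-Q plot described above.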
Chronic Beryllium Disease (CBD)
- Exposure to beryllium results in the formation of granulomas in the lung in some individuals
Sarcoidosis
- Unknown exposure(s) result in granuloma formation and inflammation in the lung, but other organs are often involved
Hypothesis
Sarcoidosis and CBD share genetic factors important in their similar granulomatous inflammatory pathways
[Diagram labels: Disease Severity; CBD; Disease Risk; Sarcoidosis]
p = 10^-11
Acknowledgements
Wake Forest University
- Carl D. Langefeld, PhD
University of Michigan
- Michael Boehnke, PhD
- Goncalo R. Abecasis, PhD