Trends in Genetics
Opinion
Linkage disequilibrium and the mapping of complex human traits
LD in the human genome
Population genetics theory describes the way mutation, gene conversion, recombination, natural selection and the demographic structure of human populations affect patterns of LD. These forces vary along the genome and act through highly stochastic (probabilistic) processes, so there is currently no a priori way to predict the LD pattern in any particular genomic region; it must be assessed empirically in appropriately chosen samples.
Sample size effects
∣D′∣ is biased upward, and the bias grows as sample size shrinks. In the Reich et al. study, the Yoruban sample was twice as large as the European sample; if only half the Yoruban data are used, their mean LD patterns move closer to those of the Europeans (Fig. 2). The effect is stronger still when only a quarter of the Yoruban data are used (not shown). If such a sample is to be used to design a global mapping panel, it should be large enough that the observed LD structure is not an artifact of small sample size.
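The direction of this bias can be illustrated with a small simulation. The sketch below is not the analysis behind Fig. 2: it assumes two biallelic sites in exact linkage equilibrium (true D′ = 0), with made-up allele frequencies of 0.3 at each site, and uses the standard definition of D′ as D divided by its maximum possible magnitude given the allele frequencies. The function names, frequencies, replicate count and seed are all illustrative choices.

```python
import random

def d_prime(n_AB, n_Ab, n_aB, n_ab):
    """Return |D'| for two biallelic sites from haplotype counts.

    Returns None when either site is monomorphic in the sample,
    because D' is undefined in that case.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        return None
    d = n_AB / n - p_A * p_B
    # D is scaled by the largest magnitude it could take given p_A and p_B.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return abs(d) / d_max

def mean_abs_d_prime(n_chromosomes, freqs, replicates=2000, seed=1):
    """Average |D'| over repeated random samples of n_chromosomes haplotypes."""
    rng = random.Random(seed)
    labels = ["AB", "Ab", "aB", "ab"]
    values = []
    for _ in range(replicates):
        sample = rng.choices(labels, weights=freqs, k=n_chromosomes)
        value = d_prime(*(sample.count(h) for h in labels))
        if value is not None:
            values.append(value)
    return sum(values) / len(values)

# Hypothetical haplotype frequencies at linkage equilibrium (0.3 x 0.3, etc.).
EQUILIBRIUM = [0.09, 0.21, 0.21, 0.49]

# Full sample, half sample, quarter sample.
for n in (200, 100, 50):
    print(n, round(mean_abs_d_prime(n, EQUILIBRIUM), 3))
```

Under these assumptions the true value is zero, so any positive average reflects sampling alone, and the average should rise as the sample is halved and then quartered.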
LD and haplotypes
We have so far used the Reich et al. data to raise sampling issues regarding SNP choice, and we now illustrate the effects of defining haplotypes from a subset of common SNPs in those data. Reich et al. suggest that haplotypes occur in roughly 50-kb ‘blocks’ that can be characterized by only one or two markers. We extracted five-site haplotypes for individuals in the Reich et al. data who have two unambiguous haplotypes (this avoids issues of statistically inferring probable phasing of the two
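One common way to identify individuals whose two haplotypes are unambiguous is to keep only genotypes that are heterozygous at no more than one site, since phase is then determined without statistical inference. The sketch below assumes that criterion; the text does not state how the authors made the selection, and the function name and example genotypes are hypothetical.

```python
def resolve_phase(genotype):
    """Return the two haplotypes if phase is unambiguous, else None.

    genotype: one unordered allele pair per site, e.g. [("A", "A"), ("C", "T")].
    Phase is unambiguous when at most one site is heterozygous.
    """
    het_sites = [i for i, (a, b) in enumerate(genotype) if a != b]
    if len(het_sites) > 1:
        return None
    hap1 = "".join(a for a, _ in genotype)
    hap2 = "".join(b for _, b in genotype)
    return hap1, hap2

# Hypothetical five-site genotypes, for illustration only.
one_het = [("A", "A"), ("C", "T"), ("G", "G"), ("T", "T"), ("C", "C")]
two_het = [("A", "G"), ("C", "T"), ("G", "G"), ("T", "T"), ("C", "C")]

print(resolve_phase(one_het))
print(resolve_phase(two_het))
```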
Missed opportunities?
The finding in the European samples that there is substantial LD (measured by D′) over 60 kb raises hopes that widely separated SNPs might capture the haplotype variation across the genome. But omitting polymorphic sites collapses existing haplotypes into a restricted number of detectable haplotype classes. If we were to type only the two SNPs 40 kb apart, that is, if we had believed LD in the whole region could actually be captured by the two spanning sites, Table 1 shows that we would miss
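The collapsing effect described here can be sketched in a few lines. The haplotypes below are invented for illustration and are not the data in Table 1; the function name and the choice of typing only the first and last of five sites are likewise assumptions.

```python
from collections import defaultdict

def collapse(haplotypes, typed_sites):
    """Group full haplotypes by the alleles they carry at the typed sites only."""
    classes = defaultdict(set)
    for hap in haplotypes:
        key = "".join(hap[i] for i in typed_sites)
        classes[key].add(hap)
    return dict(classes)

# Hypothetical five-site haplotypes (0/1 alleles); NOT taken from Table 1.
HAPLOTYPES = ["00000", "00100", "01010", "11111", "10111", "10001"]

# Type only the two spanning sites (first and last).
two_site = collapse(HAPLOTYPES, typed_sites=(0, 4))
for key, members in sorted(two_site.items()):
    print(key, sorted(members))
```

In this toy example six distinct five-site haplotypes are indistinguishable from two classes once the three interior sites are left untyped, so any variant carried by only one member of a class would go undetected.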
Conclusion: a 93% haplotype map or a 7% solution?
An accurate large-scale survey of human single nucleotide polymorphisms would be an exciting and valuable tool, so it is important to design it carefully. We have raised several purely genetic issues related to resource design. A primary justification for the proposed tool is the CDCV model (Box 1), but this is not obviously correct. If alleles with large effects on disease risk turn out to be rare and geographically localized, rather than common and global, a haplotype map based on common SNPs
Acknowledgements
We acknowledge NIH/NHLBI grant HL58239.
References (34)
Linkage disequilibrium between microsatellite markers extends beyond 1 cM on chromosome 20 in Finns. Genome Res. (2001)
The genome-wide distribution of background linkage disequilibrium in a population isolate. Hum. Mol. Genet. (2001)
Haplotype variation and linkage disequilibrium in 313 human genes. Science (2001)
- et al. Why is there so little intragenic linkage disequilibrium in humans? Genet. Res. (2001)
Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels. Am. J. Hum. Genet. (2001)
High resolution analysis of haplotype diversity and meiotic crossover in the human TAP2 recombination hotspot. Hum. Mol. Genet. (2000)
Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat. Genet. (2001)
Analysis of mutational changes at the HLA locus in single human sperm. Hum. Mutat. (1995)
- et al. A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics (1995)
Inference of population structure using multilocus genotype data. Genetics (2000)