Finding genes underlying risk of complex disease by linkage disequilibrium mapping

doi:10.1016/S0959-437X(03)00056-X

Current Opinion in Genetics & Development

Volume 13, Issue 3, June 2003, Pages 296-302

Abstract

Identification of genes that harbor variation associated with inter-individual differences in risk of complex diseases remains one of the most challenging and important problems in human genetics. For genetic variants that are sufficiently common and have sufficiently large effects, direct tests of association through linkage disequilibrium with anonymous SNPs may prove effective. But the two critical parameters — the frequency of risk-inflating alleles and the magnitudes of their effect on risk — remain largely unknown. In this review we consider the latest information regarding the likely efficacy of the linkage disequilibrium mapping approach.

Introduction

Health problems that appear to aggregate within families but that do not segregate like a simple Mendelian gene pose a special problem for researchers trying either to predict the risk of the disorder or to identify relevant genes for understanding etiology. Traditionally, linkage methods have served extremely well for Mendelian disorders, and the same approach of fitting linkage models to pedigree-structured data in which phenotypes and marker genotypes are scored has been reasonably effective for complex traits as well. The problem is that the resolution of pedigree-based mapping depends on both sample size and marker density, and even the largest studies typically have a rather poor resolution of 5–10 cM. More recently it was discovered that one can apply a similar analysis to affected sib pairs, noting that sharing of marker alleles and of phenotypes is more likely when the marker is closely linked to segregating variation that causes trait variation. Although affected sib pair (ASP) methods have the big advantage that much larger samples can be obtained, they do not provide the information about linkage phase that a multigeneration pedigree provides, and so in the end the resolution of ASP methods is also less than optimal. The final assessment of the efficacy of using whole genome linkage disequilibrium (LD) scans to find genes associated with risk of complex disease will have to wait until the approach is actually tried. In the meantime, I here discuss recent work that has been done seeking to improve the chances for success of the method by characterizing and analyzing the haplotype structure of human genetic variation.

Section snippets

Directly testing disease association

Risch and Merikangas [1] made the observation that an outbreeding population has some properties like a large extended family: namely, there are many meioses in which recombination can break down the association between a marker and a disease-associated allele. But if the marker and disease-associated alleles are found to be in tight statistical association, this may amount to evidence that they are closely linked. There is a large body of theory behind this notion, and the theory describes many factors that
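The intuition that tight association suggests close linkage can be made concrete with the textbook result that, under random mating, the disequilibrium coefficient D between two loci shrinks by a factor (1 − c) each generation, where c is the recombination fraction. The Python sketch below is a minimal illustration of that decay; the function name and the example values are assumptions chosen for illustration, not quantities taken from this review.

```python
def ld_after_generations(d0, c, generations):
    """Expected disequilibrium D after a number of generations of random mating.

    d0          -- initial disequilibrium coefficient D (hypothetical input)
    c           -- recombination fraction between the two loci, 0 <= c <= 0.5
    generations -- non-negative number of generations

    Uses the standard deterministic recursion D_t = D_0 * (1 - c) ** t,
    which ignores drift, selection, mutation and population structure.
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - c) ** generations


if __name__ == "__main__":
    # Illustrative recombination fractions only.
    for c in (0.0001, 0.001, 0.01):
        remaining = ld_after_generations(1.0, c, 1000)
        print(f"c = {c}: fraction of initial D remaining = {remaining:.4f}")
```

Under these simplifying assumptions, only very tightly linked markers retain appreciable association after many generations, which is why a strong marker-disease association is taken as a hint of close linkage.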

Linkage disequilibrium across the genome

To assess the overall efficacy and cost of LD mapping, we first need to determine the distribution of spans of the human genome that exhibit LD. This has been done for several human genes by resequencing to obtain many single nucleotide polymorphisms (SNPs) in the same gene [4], [5], [6], [7]. Figure 2 shows the relationship between physical separation between SNPs and two common metrics for LD. Other metrics for LD are evaluated by Devlin and Risch [8]. One problem with this approach is that from
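The snippet does not name the two metrics plotted in Figure 2. As an illustration only, the sketch below assumes they are the two most widely used pairwise measures, Lewontin's |D′| and the squared correlation r², and computes both from haplotype and allele frequencies. The function name and inputs are hypothetical.

```python
def ld_metrics(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A at locus 1
            and allele B at locus 2
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D is bounded by the allele frequencies; the bound depends on its sign.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # A rare allele found only on B-bearing haplotypes (illustrative values).
    d, d_prime, r2 = ld_metrics(p_ab=0.1, p_a=0.1, p_b=0.5)
    print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
```

The example shows why the two measures can disagree: when one allele occurs only on a single background, |D′| is at its maximum even though r² is low because the allele frequencies differ.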

Population subdivision, demography and linkage disequilibrium

There is a long history of interest in inferring the degree of population subdivision from genetic data, and application of this analysis to human genetic markers reveals that ∼8–10% of the genetic variance is found among population groups [12], [13]. When making inferences of association between genes and complex diseases, the need to understand population subdivision is critically important. If one does a case-control study, and the samples under study are a mix of two somewhat isolated
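A well-known consequence of pooling subpopulations in a case-control sample is that an allele can appear associated with disease even when it has no effect within any subpopulation. The sketch below works through the expected allele counts for a hypothetical mixture; the function name, the mixture weights, the allele frequencies and the prevalences are all invented for illustration.

```python
def pooled_allelic_odds_ratio(subpopulations):
    """Expected allelic odds ratio in a pooled case-control sample.

    subpopulations -- iterable of (weight, allele_freq, prevalence) tuples,
                      one per subpopulation. Within each subpopulation the
                      allele is assumed independent of disease status, so
                      the within-group odds ratio is exactly 1.
    """
    case_a = case_other = control_a = control_other = 0.0
    for weight, allele_freq, prevalence in subpopulations:
        case_a += weight * prevalence * allele_freq
        case_other += weight * prevalence * (1 - allele_freq)
        control_a += weight * (1 - prevalence) * allele_freq
        control_other += weight * (1 - prevalence) * (1 - allele_freq)
    return (case_a * control_other) / (case_other * control_a)


if __name__ == "__main__":
    # Two equally sized groups that differ in both allele frequency
    # and disease prevalence (hypothetical numbers).
    mixed = [(0.5, 0.1, 0.05), (0.5, 0.5, 0.15)]
    print(f"pooled odds ratio = {pooled_allelic_odds_ratio(mixed):.3f}")
```

With these made-up inputs the pooled odds ratio is well above 1 although the allele is causally inert, which is the kind of confounding that makes understanding subdivision essential before interpreting an association.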

Complex disorders are not simple

Even if LD mapping of single genes were simple, mapping complex traits is an enormous challenge for the same reasons that it is so difficult to draw firm conclusions from epidemiological data. Genetic variation is likely to contribute to overall risk of many complex diseases, but the genetic component may be small compared to some environmental insults, and the fact that genes and environment interact, and that health is something that is deeply context dependent (CF Sing, JH Stengård, SLR

Models for the genetics of complex disorders

There is a long history in genetic analysis that points to the power of a good model. If we formulate a scheme whereby genes affect a trait, we are much more able to test and either reject or accept the model, compared to a more open-ended situation. Key parameters in whole-genome association testing are the number of genes that are having a causal effect on risk, the frequency of the variant alleles, and the magnitudes of effect of those alleles on risk. Before we consider the complexities of
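How allele frequency and effect size jointly determine what a case-control comparison can detect is easiest to see with a worked formula. The sketch below assumes a multiplicative model, chosen here for illustration and not attributed to this section, in which each copy of the risk allele multiplies disease risk by a genotype relative risk γ. Under that assumption, and with Hardy-Weinberg proportions in the population, the expected risk-allele frequency in cases is γp / (γp + 1 − p). The function name and example values are hypothetical.

```python
def case_allele_frequency(p, grr):
    """Expected risk-allele frequency among cases.

    p   -- population frequency of the risk allele, 0 < p < 1
    grr -- genotype relative risk per allele copy (gamma), grr > 0

    Assumes a multiplicative model (relative risks 1, gamma, gamma**2)
    and Hardy-Weinberg proportions in the general population.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency must lie strictly between 0 and 1")
    if grr <= 0:
        raise ValueError("genotype relative risk must be positive")
    return grr * p / (grr * p + (1.0 - p))


if __name__ == "__main__":
    # Illustrative grid: the case-population difference shrinks quickly
    # for rare alleles and for modest effect sizes.
    for p in (0.01, 0.1, 0.5):
        for grr in (1.2, 2.0, 4.0):
            shift = case_allele_frequency(p, grr) - p
            print(f"p = {p}, gamma = {grr}: frequency shift in cases = {shift:.4f}")
```

The size of this frequency shift is what an association test must detect, so rare alleles or small relative risks translate directly into larger required samples.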

Selection in the human genome

A factor that inflates LD in the human genome more directly and strongly than any other is natural selection. This is especially evident in cases where a single gene has an influence on the risk from a disease, such as the improved resistance to Plasmodium vivax malaria in people with the Duffy null allele [34] or increased resistance to Plasmodium falciparum malaria in individuals with the low-activity alleles of G6PD [35]. The recent generation of near genome-wide datasets on SNP genotypes has opened the

Why HapMap?

The NIH Haplotype Map (HapMap) project is the largest single project in human population genetics ever attempted, and as a result it has received some harsh criticism. As of writing, the exact scope of the project is unclear, but it will entail a large quantity of SNP genotyping in several human population groups. Given that the project will be completed and the genotype data will be collected, the constructive challenge we face is to formulate the best questions and the best use of the

Conclusions

The potential for a disease to be determined by a vast array of extremely rare alleles in many different genes embedded in a network of highly epistatic genes with strong context-dependent environmental effects makes it possible to imagine that some diseases may have a genetic component but be truly unyielding by the proposed methods. But even in this worst-case scenario, we already know that not all complex diseases are this ill behaved, so the problem can be restated as finding efficient

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

• of special interest
•• of outstanding interest

Acknowledgements

This work was supported by grant HG02352 from the United States National Institutes of Health.

References (57)

A.G Clark et al.
Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase
Am. J. Hum. Genet.
(1998)
S.M Fullerton et al.
Apolipoprotein E variation at the sequence haplotype level: implications for the origin and maintenance of a major human polymorphism
Am. J. Hum. Genet.
(2000)
B Devlin et al.
A comparison of linkage disequilibrium measures for fine-scale mapping
Genomics
(1995)
J.R Kidd et al.
Haplotypes and linkage disequilibrium at the phenylalanine hydroxylase locus, PAH, in a global representation of populations
Am. J. Hum. Genet.
(2000)
G Marth et al.
Sequence variations in the public human genome data reflect a bottlenecked population history
Proc. Natl. Acad. Sci. USA
(2003)
N.H Barton et al.
Understanding quantitative genetic variation
Nat. Rev. Genet.
(2002)
M Nordborg et al.
Linkage disequilibrium: what history has to tell us
Trends Genet.
(2002)
D.E Reich et al.
On the allelic spectrum of human disease
Trends Genet.
(2001)
J.K Pritchard
Are rare variants responsible for susceptibility to complex disease?
Am. J. Hum. Genet.
(2001)
J.K Pritchard et al.
The allelic architecture of human disease genes: common disease — common variant… or not?
Hum. Mol. Genet.
(2002)
L Partridge et al.
Optimality, mutation and the evolution of ageing
Nature
(1993)
M Hamblin et al.
Complex signatures of natural selection at the Duffy blood group locus
Am. J. Hum. Genet.
(2002)
Y.X Fu et al.
Statistical tests of neutrality of mutations
Genetics
(1993)
A.G Clark
Inference of haplotypes from PCR-amplified samples of diploid populations
Mol. Biol. Evol.
(1990)
K.L Mohlke et al.
High-throughput screening for evidence of association by using mass spectrometry genotyping on DNA pools
Proc. Natl. Acad. Sci. USA
(2002)
N Risch et al.
The future of genetic studies of complex human diseases
Science
(1996)
J Hastbäcka et al.
Linkage disequilibrium mapping in isolated founder populations: diastrophic dysplasia in Finland
Nat. Genet.
(1992)
W.G Hill et al.
Maximum-likelihood estimation of gene location by linkage disequilibrium
Am. J. Hum. Genet.
(1994)
K.G Ardlie et al.
Patterns of linkage disequilibrium in the human genome
Nat. Rev. Genet.
(2002)
D.A Nickerson et al.
DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene
Nat. Genet.
(1998)
D.E Reich et al.
Linkage disequilibrium in the human genome
Nature
(2001)
S.B Gabriel et al.
The structure of haplotype blocks in the human genome
Science
(2002)
E Dawson et al.
A first-generation linkage disequilibrium map of human chromosome 22
Nature
(2002)
G Barbujani et al.
An apportionment of human DNA diversity
Proc. Natl. Acad. Sci. USA
(1997)
C Romualdi et al.
Patterns of human diversity, within and among continents, inferred from biallelic DNA polymorphisms
Genome Res.
(2002)
L.B Jorde
Linkage disequilibrium and the search for complex disease genes
Genome Res.
(2000)
J.K Pritchard et al.
Case-control studies of association in structured or admixed populations
Theor. Popul. Biol.
(2001)
J.K Pritchard et al.
Linkage disequilibrium in humans: models and data
Am. J. Hum. Genet.
(2001)
