Haplotype and Linkage Disequilibrium Architecture for Human Cancer-Associated Genes

  1. Penelope E. Bonnen1,
  2. Peggy J. Wang1,
  3. Marek Kimmel2,
  4. Ranajit Chakraborty3, and
  5. David L. Nelson1,4
  1 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
  2 Department of Statistics, Rice University, Houston, Texas 77030, USA
  3 Center for Genome Information, Department of Environmental Health, University of Cincinnati, Cincinnati, Ohio 45267, USA

Abstract

To facilitate association-based linkage studies we have studied the linkage disequilibrium (LD) and haplotype architecture around five genes of interest for cancer risk: ATM, BRCA1, BRCA2, RAD51, and TP53. Single nucleotide polymorphisms (SNPs) were identified and used to construct haplotypes that span 93–200 kb per locus with an average SNP density of one per 12 kb. These markers were genotyped in four ethnically defined populations that contained 48 each of African Americans, Asian Americans, Hispanic Americans, and European Americans. Haplotypes were inferred using an expectation maximization (EM) algorithm, and the data were analyzed using D′, R², Fisher's exact P-values, and the four-gamete test for recombination. LD levels varied widely between loci, from continuously high LD across 200 kb to a virtual absence of LD across a similar length of genome. LD structure also varied at each gene and between the populations studied. This variation indicates that the success of linkage-based studies will require a precise description of LD at each locus and in each population to be studied. One striking consistency between genes was that at each locus a modest number of haplotypes present in each population accounted for a high fraction of the total number of chromosomes. We conclude that each locus has its own genomic profile with regard to LD and that, despite this, there is a widespread trend of relatively low haplotype diversity. As a result, a low marker density should be adequate to identify haplotypes that represent the common variation at a locus, thereby decreasing costs and increasing efficacy of association studies.
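For readers unfamiliar with the pairwise measures named above, the following is a minimal sketch of how |D′|, R², and the four-gamete test can be computed for one pair of biallelic SNPs. It is not the authors' code. It assumes phased haplotypes are already available (the study inferred them with an EM algorithm, which is not reproduced here), that alleles are coded 0 and 1, and the function names `ld_stats` and `four_gamete_test` are hypothetical.

```python
from collections import Counter


def ld_stats(haplotypes):
    """Return (|D'|, r^2) for a list of (allele_A, allele_B) pairs coded 0/1.

    Assumes phased two-locus haplotypes; this is an illustrative sketch,
    not the analysis pipeline used in the study.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)

    # Frequencies of the "1" allele at each locus and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = counts[(1, 1)] / n

    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b

    # D' scales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


def four_gamete_test(haplotypes):
    """True if all four two-locus haplotypes are observed.

    Under an infinite-sites assumption (no recurrent mutation), seeing all
    four gametes implies at least one historical recombination event.
    """
    return len(set(haplotypes)) == 4


# Hypothetical data: 10 chromosomes typed at two SNPs.
complete_ld = [(0, 0)] * 5 + [(1, 1)] * 5
weak_ld = [(0, 0)] * 4 + [(0, 1)] * 2 + [(1, 0)] * 2 + [(1, 1)] * 2
```

With the hypothetical `complete_ld` sample only two haplotypes occur, so both measures reach their maximum and the four-gamete test finds no evidence of recombination. In `weak_ld` all four haplotypes are present and both measures are low.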

[Supplemental material is available online at http://www.genome.org.]

Footnotes

  • 4 Corresponding author.

  • E-MAIL nelson{at}bcm.tmc.edu; FAX (713) 798-5386.

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.483802. Article published online before print in November 2002.

    • Received May 31, 2002.
    • Accepted September 12, 2002.