An empirical comparison of case-control and trio based study designs in high throughput association mapping
P Hintsanen (1), P Sevon (1), P Onkamo (2), L Eronen (1), H Toivonen (1)

(1) Helsinki Institute for Information Technology, Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
(2) Department of Biological and Environmental Sciences, University of Helsinki, Finland

Correspondence to: Dr H Toivonen, Department of Computer Science, Gustaf Hällströmin katu 2b, PO Box 68, FI-00014 University of Helsinki, Finland; hannu.toivonen{at}cs.helsinki.fi

Abstract

Motivated by high throughput genotyping technology, we set out to compare experimentally the power and accuracy of case-control and family trio based approaches for haplotype based, large scale association gene mapping. We compared trio based and case-control study designs under different disease models, and partitioned the performance differences into separate components: those arising from the sample ascertainment, the effective sample size, and the haplotyping approach. For systematic and controlled tests, we simulated a rapidly expanding and relatively young isolated population. The experiments were also replicated with real asthma data. We used computationally efficient methods that scale up to large numbers of both markers and individuals. Mapping was based on a haplotype association test for haplotypes of 1–10 markers. For population based haplotype reconstruction we used HaploRec, and compared it with both a simple trio based inference and the true haplotypes. Firstly, and surprisingly, statistically inferred population based haplotypes can be as powerful as true haplotypes. Secondly, as expected, the effective sample size has a clear effect on both gene detection power and mapping accuracy. Thirdly, the sample ascertainment method has little effect on mapping accuracy. Finally, an interesting side result is that the simple haplotype association test clearly outperformed exhaustive allelic transmission disequilibrium tests. The results suggest that the case-control design is a powerful alternative to the more laborious family based ascertainment approach, especially for large datasets, and wherever population stratification can be controlled.
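
To make the contrast between the two designs concrete, the following minimal sketch shows textbook forms of the two kinds of test statistic involved: a Pearson chi-square on a 2x2 table of haplotype counts (case-control design) and a McNemar-type transmission disequilibrium statistic (trio design). It is an illustration only and not the authors' implementation; the function names and all counts are hypothetical, and the haplotype association test and EATDT used in the study are more elaborate than these single-haplotype versions.

```python
def chi_square_2x2(case_with, case_without, control_with, control_without):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table.

    Rows are case and control chromosomes; columns are chromosomes that
    carry or do not carry the candidate haplotype (hypothetical layout).
    """
    a, b, c, d = case_with, case_without, control_with, control_without
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        # A zero margin means the table carries no association signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])


def tdt_statistic(transmitted, untransmitted):
    """McNemar-type TDT statistic (1 df) from heterozygous parents.

    `transmitted` counts parents who passed the candidate allele or
    haplotype to the affected child; `untransmitted` counts those who
    passed the alternative.
    """
    total = transmitted + untransmitted
    if total == 0:
        return 0.0
    return (transmitted - untransmitted) ** 2 / total


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    print(chi_square_2x2(30, 70, 15, 85))
    print(tdt_statistic(60, 40))
```

Both statistics are compared against a chi-square distribution with one degree of freedom. The case-control version uses every sampled chromosome, whereas the transmission based version draws only on heterozygous parents, which is one way the effective sample size can differ between the designs.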

Abbreviations

  • EATDT, exhaustive allelic transmission disequilibrium tests
  • SNP, single nucleotide polymorphism

Keywords

  • linkage disequilibrium mapping
  • population-based haplotyping
  • haplotype association
  • case-control data
  • high throughput methods

Footnotes

  • Published Online First 28 October 2005

  • This research has been supported by Tekes, the National Technology Agency of Finland.

  • Competing interests: there are no competing interests

  • Data access: HaploRec software for population based reconstruction of haplotypes and the simulated datasets are available at www.cs.helsinki.fi/group/genetics/.
