Original article
Quantitative trait locus analysis for next-generation sequencing with the functional linear models
Li Luo¹, Yun Zhu², Momiao Xiong²

¹ Division of Epidemiology, Biostatistics and Preventive Medicine, University of New Mexico, Albuquerque, New Mexico, USA
² Department of Biostatistics, Human Genetics Center, The University of Texas School of Public Health, Houston, Texas, USA

Correspondence to Dr Momiao Xiong, Department of Biostatistics, Human Genetics Center, The University of Texas Health Science Center at Houston, PO Box 20186, Houston, TX 77225, USA; Momiao.Xiong{at}uth.tmc.edu

Abstract

Background Although the past few years have seen the rapid development of novel statistical methods for association studies of qualitative traits using next-generation sequencing (NGS) data, only a few statistics have been proposed for testing the association of rare variants with quantitative traits. Quantitative trait locus (QTL) analysis of rare variants remains challenging. Moving from low-dimensional data to high-dimensional genomic data demands a shift in statistical methods from multivariate data analysis to functional data analysis.

Methods We propose a functional linear model (FLM) as a general principle for developing novel and powerful QTL analysis methods designed for resequencing data. Using simulations, we calculated the type I error rates and evaluated the power of the FLM and eight other existing statistical methods, including in the presence of effects with both positive and negative signs.

Results Because the FLM retains all of the genetic information in the data, exploits the merits of both variant-by-variant and collective analysis, and overcomes their limitations, it has much higher power than the other existing statistics in all the scenarios considered. To evaluate its performance further, the FLM was applied to association analysis of six quantitative traits in the Dallas Heart Study, and to RNA-seq eQTL analysis with genetic variation in the low-coverage resequencing data of the 1000 Genomes Project. Real data analysis showed that the FLM yielded much smaller p values for the significantly associated variants than the other existing methods.

Conclusions The FLM is expected to open a new route for QTL analysis.
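
The idea of a functional linear model for a genomic region can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the estimation details, so the Fourier basis, the crude averaging used to approximate the integral, the F statistic, the simulated data, and all function and variable names below are assumptions made for illustration. The sketch treats each subject's genotypes across a region as a function of genomic position, summarises that function by a few basis scores, and regresses the quantitative trait on those scores.

```python
import numpy as np


def fourier_basis(t, n_basis):
    """Evaluate the first n_basis Fourier basis functions at positions t in [0, 1]."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2.0) * np.sin(2.0 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2.0) * np.cos(2.0 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)  # shape: (n_variants, n_basis)


def flm_f_statistic(y, genotypes, positions, n_basis=5):
    """F statistic comparing trait ~ basis scores against an intercept-only model.

    positions are assumed to be rescaled to [0, 1] (hypothetical convention).
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    basis = fourier_basis(positions, n_basis)
    # Crude approximation of the integral of x_i(t) * phi_k(t) over the region:
    # average the product over the observed variant positions.
    scores = genotypes @ basis / basis.shape[0]
    design = np.column_stack([np.ones(n), scores])
    beta, _, rank, _ = np.linalg.lstsq(design, y, rcond=None)
    rss_full = np.sum((y - design @ beta) ** 2)
    rss_null = np.sum((y - y.mean()) ** 2)
    df1, df2 = rank - 1, n - rank
    return ((rss_null - rss_full) / df1) / (rss_full / df2)


# Toy data (entirely simulated, not from the study): 300 subjects, 40 rare variants.
rng = np.random.default_rng(0)
n_subjects, n_variants = 300, 40
positions = np.sort(rng.uniform(0.0, 1.0, n_variants))
maf = rng.uniform(0.005, 0.05, n_variants)  # minor allele frequencies
genotypes = rng.binomial(2, maf, size=(n_subjects, n_variants))
trait = genotypes @ np.full(n_variants, 2.0) + rng.normal(size=n_subjects)

f_stat = flm_f_statistic(trait, genotypes, positions)
print(f"F statistic for the region: {f_stat:.2f}")
```

Because the test acts on a handful of basis scores rather than on each variant separately, the number of tested parameters stays small even when the region contains many rare variants. In practice the F statistic would be referred to an F distribution with `df1` and `df2` degrees of freedom to obtain a p value.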

Footnotes

  • Additional data are published online only. To view these files please visit the journal online (http://dx.doi.org/10.1136/jmedgenet-2012-100798)

  • Funding The project described was supported by grants 1R01AR057120-01, 1R01HL106034-01, P01 AR052915-01A1 and P50 AR054144-01 CORT from the National Institutes of Health and NIAMS.

  • Competing interests None.

  • Provenance and peer review Not commissioned; externally peer reviewed.
