Annotation of functional variation in personal genomes using RegulomeDB

  Michael Snyder 1,3
  1 Department of Genetics, Stanford University School of Medicine, Stanford, California 94305, USA;
  2 Department of Computer Science, Stanford University, Stanford, California 94305, USA

    Abstract

    As the sequencing of healthy and disease genomes becomes more commonplace, detailed annotation is needed to interpret the individual variation responsible for normal and disease phenotypes. Current approaches focus on direct changes in protein-coding genes, particularly nonsynonymous mutations that directly affect the gene product. However, most individual variation occurs outside of genes and, indeed, most markers generated from genome-wide association studies (GWAS) identify variants outside of coding segments. Identification of potential regulatory changes that perturb these sites will lead to better localization of truly functional variants and better interpretation of their effects. We have developed a novel approach and database, RegulomeDB, which guides interpretation of regulatory variants in the human genome. RegulomeDB includes high-throughput experimental data sets from ENCODE and other sources, as well as computational predictions and manual annotations, to identify putative regulatory potential and functional variants. These data sources are combined into a powerful tool that scores variants to help separate functional variants from a large pool and provides a small set of putative sites with testable hypotheses as to their function. We demonstrate the applicability of this tool to the annotation of noncoding variants from 69 fully sequenced genomes as well as from a personal genome, where thousands of functionally associated variants were identified. Moreover, we present a GWAS example in which the database quickly identifies the known associated functional variant and provides a hypothesis as to its function. Overall, we expect this approach and resource to be valuable for the annotation of human genome sequences.

    Footnotes

    • 3 Corresponding author

      E-mail mpsnyder@stanford.edu

    • [Supplemental material is available for this article.]

    • Article and supplemental material are at http://www.genome.org/cgi/doi/10.1101/gr.137323.112.

      Freely available online through the Genome Research Open Access option.

    • Received January 5, 2012.
    • Accepted May 2, 2012.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
