DNA sequencing technologies continue to improve in throughput and quality while decreasing in cost. As we transition from whole exome capture sequencing to whole genome sequencing (WGS), our ability to convert machine-generated variant calls, including single nucleotide variants (SNVs) and insertion-deletion variants (indels), into human-interpretable knowledge has lagged far behind our ability to generate enormous numbers of variants. To help narrow this gap, we present WGSA (WGS annotator), a functional annotation pipeline for human genome sequencing studies, which runs out of the box on the Amazon Compute Cloud and is freely downloadable at https://sites.google.com/site/jpopgen/wgsa/.
Functional annotation is a key step in WGS analysis. On the one hand, annotation helps the analyst filter to a subset of elements of particular interest (eg, cell type specific enhancers); on the other hand, it helps investigators increase the power to identify phenotype-associated loci (eg, an association test using a functional prediction score as a weight) and interpret potentially interesting findings. Currently, there are several popular gene model based annotation tools, including ANNOVAR,1 SnpEff2 and the Ensembl Variant Effect Predictor (VEP).3 These can annotate a variety of protein coding and non-coding gene models from a range of species. It is well known among practitioners that different databases (eg, RefSeq4 and Ensembl5) use different models for …
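As a rough illustration of the two uses of annotation mentioned above (filtering to elements of interest, and using a functional prediction score as a weight), the following Python sketch may help. It is not WGSA code and does not reflect WGSA's output format: the `Variant` fields, the region labels, the scores and the simple weighted-sum burden are all hypothetical assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All fields are hypothetical; they do not mirror any real annotation tool's columns.
    variant_id: str   # illustrative identifier
    region: str       # illustrative annotation label, e.g. "enhancer"
    score: float      # illustrative functional prediction score in [0, 1]


def filter_by_region(variants, region):
    """Keep only variants whose annotation label matches the region of interest."""
    return [v for v in variants if v.region == region]


def weighted_burden(variants, genotypes):
    """Sum alternate-allele counts for one sample, weighting each variant by its score.

    genotypes maps variant_id -> alternate allele count (0, 1 or 2);
    variants missing from the mapping are treated as count 0.
    """
    return sum(v.score * genotypes.get(v.variant_id, 0) for v in variants)


# Toy data, invented for this example.
variants = [
    Variant("v1", "enhancer", 0.9),
    Variant("v2", "intron", 0.1),
    Variant("v3", "enhancer", 0.5),
]
sample_genotypes = {"v1": 1, "v2": 2, "v3": 0}

enhancer_variants = filter_by_region(variants, "enhancer")
burden = weighted_burden(variants, sample_genotypes)
```

In this sketch, filtering narrows the variant list before testing, whereas weighting keeps every variant but lets those with higher predicted functional impact contribute more to the per-sample score that would then enter an association test.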
Footnotes
Contributors XL, ADJ, JAB, AHL, AC, PW, ZH, RJK and EB designed the study. XL collected the annotation resources and developed the tool. SW tested the pipeline. BP provided tools for retrieving the RegulomeDB data set. EB and RG supervised the study. XL, SW and EB wrote the draft manuscript and all authors provided critical edits.
Funding This study was supported by the US National Institutes of Health (5RC2HL102419 and U54HG003273).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.