Communications
The performance of deleteriousness prediction scores for rare non-protein-changing single nucleotide variants in human genes
  1. Xiaoming Liu¹,
  2. Chang Li¹,
  3. Eric Boerwinkle¹,²
  1. Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, UTHealth School of Public Health, Houston, Texas, USA
  2. Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, USA
Correspondence to Dr Xiaoming Liu, Human Genetics Center and Department of Epidemiology, Human Genetics and Environmental Sciences, UTHealth School of Public Health, Houston, Texas 77030, USA; Xiaoming.Liu{at}uth.tmc.edu

In recent years, whole genome sequencing has increasingly been used as a replacement for whole exome sequencing in identifying causal variants of human diseases. In response to this trend, several ‘genome-level’ deleteriousness prediction scores have been proposed to complement the scores designed specifically for missense or splicing variants. The aim of this study was to investigate the prediction accuracy of those genome-level scores for rare non-protein-changing single nucleotide variants (npcSNVs) in and near human genes. We compared 15 genome-level deleteriousness prediction scores and eight conservation scores using receiver operating characteristic (ROC) curves and the area under the curve (AUC). We found that the fathmm-MKL coding score¹ was the best score for npcSNVs (AUC=0.875), outperforming the other genome-level deleteriousness prediction scores and conservation scores.
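To illustrate how an AUC-based comparison of prediction scores can work in principle, the following minimal Python sketch computes the AUC as the probability that a randomly chosen deleterious variant receives a higher score than a randomly chosen benign one, with ties counted as one half. The function name `roc_auc`, the score names `score_a` and `score_b`, and all labels and values are hypothetical toy data chosen for illustration; they are not taken from the study, and this is not the authors' pipeline.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a deleterious (positive) variant, 0 for a benign one.
    scores: higher values are assumed to mean "more likely deleterious".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three deleterious and three benign variants.
labels = [1, 1, 1, 0, 0, 0]
candidate_scores: Dict[str, Sequence[float]] = {
    "score_a": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "score_b": [0.6, 0.3, 0.7, 0.5, 0.4, 0.2],
}

# Compute the AUC of each score and pick the one with the largest value.
aucs = {name: roc_auc(labels, values) for name, values in candidate_scores.items()}
best = max(aucs, key=aucs.get)

for name, value in aucs.items():
    print(f"{name}: AUC={value:.3f}")
print(f"best score: {best}")
```

The pairwise loop is quadratic in the number of variants, which is acceptable for a small illustration; for large variant sets a rank-based computation or an established library routine would be the more practical choice.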

As the cost of whole genome and exome sequencing has fallen considerably, clinical use of sequencing data is becoming more common. Although more candidate SNVs are being identified and reported, accurate interpretation of these variants in a clinical context remains a challenge. To facilitate and standardise the interpretation of sequence variants, the American College of Medical Genetics and Genomics (ACMG) has developed a new five-tier, evidence-based guideline.² As a major component of this guideline, in silico prediction of variant deleteriousness has been widely used to screen and prioritise candidate variants from among a large number of background variants.

Multiple algorithms exist to predict variant deleteriousness based on different properties of the variant. Previously, most algorithms focused on SNVs that alter amino acids or splicing. …

Footnotes

  • Contributors XL designed the study and collected the annotation resources. CL performed the comparison. EB and XL supervised the study. XL and EB wrote the draft manuscript.

  • Funding This study was supported by the US National Institutes of Health (5RC2HL102419 and U54HG003273).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.