Cell
Volume 176, Issue 3, 24 January 2019, Pages 535-548.e24

Article
Predicting Splicing from Primary Sequence with Deep Learning

https://doi.org/10.1016/j.cell.2018.12.015
Under an Elsevier user license
open archive

Highlights

  • SpliceAI, a 32-layer deep neural network, predicts splicing from a pre-mRNA sequence

  • 75% of predicted cryptic splice variants validate on RNA-seq

  • Cryptic splicing may yield ∼10% of pathogenic variants in neurodevelopmental disorders

  • Cryptic splice variants frequently give rise to alternative splicing

Summary

The splicing of pre-mRNAs into mature transcripts is remarkable for its precision, but the mechanisms by which the cellular machinery achieves such specificity are incompletely understood. Here, we describe a deep neural network that accurately predicts splice junctions from an arbitrary pre-mRNA transcript sequence, enabling precise prediction of noncoding genetic variants that cause cryptic splicing. Synonymous and intronic mutations with predicted splice-altering consequence validate at a high rate on RNA-seq and are strongly deleterious in the human population. De novo mutations with predicted splice-altering consequence are significantly enriched in patients with autism and intellectual disability compared to healthy controls and validate against RNA-seq in 21 out of 28 of these patients. We estimate that 9%–11% of pathogenic mutations in patients with rare genetic disorders are caused by this previously underappreciated class of disease variation.

Keywords

splicing
genetics
artificial intelligence
deep learning

6 These authors contributed equally

7 Lead Contact