Table 4

Paralogue annotation accurately identifies disease-associated residues in RYR2 and SCN5A

Protein		Pathogenic	Benign	Uncertain	Unannotated	Total
SCN5A	Published observations	368	28	60	1560	2016
SCN5A	Paralogue annotations: observed (expected)	113 (81)	1 (6)	6 (13)	321 (341)	441
RYR2	Published observations	139	20	71	4737	4967
RYR2	Paralogue annotations: observed (expected)	35 (8)	1 (1)	1 (4)	238 (262)	275

Distribution of paralogue annotations across the amino acid residues of the SCN5A and RYR2 proteins. ‘Published observations’ shows the number of amino acid residues with known missense variants, classified as pathogenic, benign or uncertain, and the number of residues at which missense variation has not previously been observed (unannotated). ‘Uncertain’ refers to residues with dbSNP variants lacking clinical information, or with variants for which reports of pathogenicity conflict. ‘Paralogue annotations observed’ shows the number of residues in each class that are annotated by variants in paralogues, and which would therefore be expected to be sites of pathogenic variation. ‘Paralogue annotations expected’ shows the number of residues in each class that would be expected to be annotated if paralogue annotation were random, with no predictive value. Residues annotated in this way are highly enriched for pathogenic variation in both genes (2×2 Fisher's exact test p=0.0009), with a positive predictive value (PPV) of 98.7% (148 of the 150 annotated residues that have a published pathogenic or benign classification are pathogenic). In total, 559 previously unannotated residues (321 in SCN5A, 238 in RYR2) are identified as putative disease-associated residues.
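The expected counts and the PPV quoted in the legend can be checked directly from the table. The sketch below is illustrative only and rests on two assumptions that the legend does not spell out: that "random" annotation means the annotated residues are allocated to each class in proportion to that class's share of all residues in the protein, and that the PPV is computed over annotated residues with a pathogenic or benign classification, pooled across both genes. All variable and function names are hypothetical.

```python
import math

# Counts copied from Table 4, in the order:
# (pathogenic, benign, uncertain, unannotated)
CLASSES = ("pathogenic", "benign", "uncertain", "unannotated")
PUBLISHED = {
    "SCN5A": (368, 28, 60, 1560),
    "RYR2": (139, 20, 71, 4737),
}
ANNOTATED = {
    "SCN5A": (113, 1, 6, 321),
    "RYR2": (35, 1, 1, 238),
}


def round_half_up(x):
    # Python's built-in round() rounds exact halves to the nearest even
    # integer (80.5 -> 80), so round half up explicitly instead.
    return math.floor(x + 0.5)


def expected_counts(published, annotated):
    # Assumption: under random annotation, each class receives annotated
    # residues in proportion to its share of all residues in the protein.
    total_residues = sum(published)
    total_annotated = sum(annotated)
    return [total_annotated * n / total_residues for n in published]


expected = {
    gene: [round_half_up(e)
           for e in expected_counts(PUBLISHED[gene], ANNOTATED[gene])]
    for gene in PUBLISHED
}

# Assumption: PPV = pathogenic / (pathogenic + benign) among annotated
# residues, pooled across both genes; uncertain and unannotated residues
# are excluded because their true status is unknown.
annotated_pathogenic = sum(ANNOTATED[g][0] for g in ANNOTATED)
annotated_benign = sum(ANNOTATED[g][1] for g in ANNOTATED)
ppv = annotated_pathogenic / (annotated_pathogenic + annotated_benign)

# Previously unannotated residues that now carry a paralogue annotation.
novel_residues = sum(ANNOTATED[g][3] for g in ANNOTATED)

for gene in PUBLISHED:
    print(gene, dict(zip(CLASSES, expected[gene])))
print(f"PPV = {ppv:.1%}")
print(f"Newly annotated residues = {novel_residues}")
```

Under these assumptions the rounded expected counts agree with the bracketed values in the table, which suggests that proportional allocation is how they were derived. The sketch does not attempt to reproduce the Fisher's exact test, since the legend does not state exactly which 2×2 table was tested.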