Original article
Identification of pathogenic gene variants in small families with intellectually disabled siblings by exome sequencing
  1. Janneke H M Schuurs-Hoeijmakers1,2,3,
  2. Anneke T Vulto-van Silfhout1,2,4,
  3. Lisenka E L M Vissers1,2,3,
  4. Ilse I G M van de Vondervoort1,
  5. Bregje W M van Bon1,
  6. Joep de Ligt1,2,3,
  7. Christian Gilissen1,2,3,
  8. Jayne Y Hehir-Kwa1,2,3,
  9. Kornelia Neveling1,2,3,
  10. Marisol del Rosario1,
  11. Gausiya Hira1,
  12. Santina Reitano4,
  13. Aurelio Vitello4,
  14. Pinella Failla4,
  15. Donatella Greco4,
  16. Marco Fichera4,5,
  17. Ornella Galesi4,
  18. Tjitske Kleefstra1,2,3,
  19. Marie T Greally6,
  20. Charlotte W Ockeloen1,
  21. Marjolein H Willemsen1,2,3,
  22. Ernie M H F Bongers1,2,3,
  23. Irene M Janssen1,
  24. Rolph Pfundt1,
  25. Joris A Veltman1,2,3,
  26. Corrado Romano4,
  27. Michèl A Willemsen7,8,
  28. Hans van Bokhoven1,3,8,
  29. Han G Brunner1,2,3,
  30. Bert B A de Vries1,2,8,
  31. Arjan P M de Brouwer1,3,8
  1. Department of Human Genetics, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
  2. Institute for Genetic and Metabolic Disease, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
  3. Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
  4. Unit of Pediatrics and Medical Genetics, Unit of Neurology, Laboratory of Medical Genetics IRCCS Associazione Oasi Maria Santissima, Troina, Italy
  5. Department of Medical Genetics, University of Catania, Catania, Italy
  6. National Centre for Medical Genetics, Our Lady's Children's Hospital, Crumlin, Dublin, Ireland
  7. Departments of Pediatric Neurology, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
  8. Donders Institute for Brain, Cognition and Behavior, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
Correspondence to Dr Bert B A de Vries, Department of Human Genetics 836, Radboud University Nijmegen Medical Centre, PO Box 9101, Nijmegen NL-6500 HB, The Netherlands; B.devries{at}gen.umcn.nl

Abstract

Background Intellectual disability (ID) is a common neurodevelopmental disorder affecting 1–3% of the general population. Mutations in more than 10% of all human genes are considered to be involved in this disorder, although the majority of these genes are still unknown.

Objectives We investigated 19 small non-consanguineous families with two to five affected siblings in order to identify pathogenic gene variants in known, novel and potential ID candidate genes. Non-consanguineous families have been largely ignored in gene identification studies as small family size precludes prior mapping of the genetic defect.

Methods and results Using exome sequencing, we identified pathogenic mutations in three genes, DDHD2, SLC6A8, and SLC9A6, of which the latter two have previously been implicated in X-linked ID phenotypes. In addition, we identified potentially pathogenic mutations in BCORL1 on the X-chromosome and in MCM3AP, PTPRT, SYNE1, and ZNF582 on autosomes.

Conclusions We show that potentially pathogenic gene variants can be identified in small, non-consanguineous families with as few as two affected siblings, thus emphasising their value in the identification of syndromic and non-syndromic ID genes.

  • Genetics

Introduction

Intellectual disability (ID), previously known as mental retardation, is a common disorder affecting 1–3% of the population in Western countries.1 It is clinically defined by significant limitations both in intellectual functioning (IQ below 70) and in adaptive behaviours that manifest before the age of 18 years.2 Genetic causes of syndromic and non-syndromic ID are diverse and include chromosomal abnormalities, imprinting disorders, single gene mutations, and mitochondrial DNA mutations. Recent studies have indicated that autosomal de novo mutations that affect single genes account for 16–45% of individuals with severe ID,3,4 while mutations in X-linked genes account for 10–15% of males with ID.5 The contribution of recessive mutations to ID is considered to be as high as 25%,6–9 and only a minority of these autosomal recessive ID (ARID) genes have been identified so far. One study has systematically assessed multiple families with non-specific ARID by using massive parallel sequencing techniques.10 In that study, conventional homozygosity mapping in large consanguineous Iranian families followed by sequencing of all genes in the homozygous linkage regions resulted in the identification of several novel ID genes.

In Western populations, ID families are usually non-consanguineous with often only one or two affected siblings. Families with multiple affected siblings comprise 6% of more than 4000 families in our in-clinic ID cohort. Identification of the genetic defect in this considerable group of small families could contribute significantly to the identification of ARID genes.

Here, we studied one exome per family with multiple affected siblings who had either non-syndromic (n=7) or syndromic ID (n=12) to determine the underlying genetic defect. Our approach not only resulted in the identification of several candidate genes for ID, but also provided a molecular diagnosis for the families in which a definite recessive or X-linked pathogenic mutation was found.

Families and methods

Families

Clinical information and peripheral blood samples were collected from all affected individuals and, wherever possible, from parents and unaffected family members (see online supplementary table S1). All but one family had non-consanguineous parents and all had at least two affected siblings: 13 families consisted of affected brother–sister pairs (2–5 affected siblings per family), three families of affected sister–sister pairs (2–3 affected siblings), and three families of affected brother–brother pairs (2–3 affected siblings). In four families, there were also healthy siblings (for clinical description see online supplementary table S1). Previous clinical and molecular evaluation by conventional karyotyping and array comparative genomic hybridisation (CGH) had not provided an aetiological diagnosis for these families. Pathogenic mutations in SLC6A8, SLC9A6, and DDHD2 were reported back to the respective families and they were counselled accordingly. Written informed consent was obtained for all participating families. Our research project was approved by the local ethics committee (Commissie Mensgebonden Onderzoek Regio Arnhem-Nijmegen) according to the World Medical Association Declaration of Helsinki.

Library preparation for exome sequencing

Exome sequencing was performed on genomic DNA of one affected individual from each of the 19 families. Exome enrichment required 3 μg genomic DNA which was captured with an AB SOLiD optimised SureSelect 50 Mb human exome kit (Agilent, Santa Clara, California, USA) representing exonic sequences for ∼21 000 genes (including >99% genes of the CCDS version September 2009 and >95% of RefSeq genes and transcripts version June 2010—as specified by the company). Manufacturer's instructions (V.1.5) for enrichment were followed with reduction of the number of post-hybridisation ligation mediated PCR cycles from 12 to nine cycles. To allow for multiplexing libraries before sequencing, post-hybridisation sample barcodes were used (Agilent, Santa Clara), compliant with SOLiD sequencing technology.

SOLiD sequencing and mapping

Enriched exome libraries were equimolarly pooled in sets of four, based on a combined library concentration of 0.7 pM. Subsequently, the obtained pool was used for emulsion PCR and bead preparation using the EZbead system, following the manufacturer's instructions (V.05/2010; Life Technologies, Carlsbad, California, USA). For each pool of four exome libraries, a full sequencing slide was used on a SOLiD 4 System (Life Technologies, Carlsbad), thereby anticipating that all four samples would be represented by 25% of the total beads sequenced on the slide. Colour space reads were mapped to the hg19 reference genome with the SOLiD bioscope software V.1.3.

Calling and prioritisation of sequence variants

Sequence variants, including indel variations, were selected using quality settings that required the presence of at least two unique variant reads, in addition to the variation being present in at least 20% of all reads. Non-genic, intronic, and synonymous sequence variants (but not canonical splice sites) were excluded as well as gene variants found in >1% of the population based on dbSNPv134 and our local variant database which consists of data derived from 672 exome experiments. Under the hypothesis of a recessive inheritance model, we selected all rare variants in autosomal genes, which were present either in the presumed homozygous state (>80% variant reads), or in the presumed compound heterozygous state in genes containing at least two rare variants (>20% variant reads) (see online supplementary figure S1). For brother–brother pair families (n=3), possible hemizygous sequence variants (>80% variant reads) on the X chromosome were also considered as candidate mutations. Special attention was given to all rare variants regardless of the percentage variant reads in genes that have ID as a phenotypic feature, as described in the OMIM clinical synopsis or OMIM clinical features (http://www.omim.org). Accurate indel variation detection and calling, by use of massive parallel sequencing techniques, is challenging and small indel variations suffer from a high rate of false positive calls due to sequencing errors, alignment mistakes, and variant calling mistakes. Therefore, we manually checked the raw sequence reads of each variant by use of the Integrative Genomics Viewer.11 Candidate indel variations with a percentage of variant reads below 80% for the homozygous candidates and below 20% for the heterozygous candidates were excluded from further analyses after manual inspection of the sequence reads.
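The selection rules described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the helper functions are hypothetical, and the annotation-based exclusions (non-genic, intronic, synonymous) are assumed to have been applied upstream.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant record; field names are illustrative only."""
    gene: str
    variant_reads: int      # unique reads supporting the variant
    total_reads: int        # all reads covering the position
    population_freq: float  # highest frequency seen in dbSNP or a local database


def variant_fraction(v: Variant) -> float:
    """Fraction of reads at the position that carry the variant."""
    return v.variant_reads / v.total_reads if v.total_reads else 0.0


def passes_quality(v: Variant) -> bool:
    # At least two unique variant reads and at least 20% variant reads.
    return v.variant_reads >= 2 and variant_fraction(v) >= 0.20


def is_rare(v: Variant) -> bool:
    # Variants found in more than 1% of the population are excluded.
    return v.population_freq <= 0.01


def recessive_candidates(variants):
    """Return (homozygous, compound_het) candidates under a recessive model.

    homozygous: rare variants with more than 80% variant reads.
    compound_het: genes carrying at least two rare variants, mapped to them.
    """
    rare = [v for v in variants if passes_quality(v) and is_rare(v)]
    homozygous = [v for v in rare if variant_fraction(v) > 0.80]
    by_gene = defaultdict(list)
    for v in rare:
        by_gene[v.gene].append(v)
    compound_het = {g: vs for g, vs in by_gene.items() if len(vs) >= 2}
    return homozygous, compound_het
```

Under these thresholds, a variant supported by five of 29 reads (about 17%) would not pass the 20% cut-off, which is one reason to handle known ID genes separately, regardless of the percentage of variant reads.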

Sanger sequencing of candidate rare recessive sequence variants

To validate the presence of the candidate variants and to test for segregation with the disease phenotype within the families, primers for all candidate rare recessive sequence variants were designed using Primer 3 (V.0.4.0)12 and obtained from Invitrogen (Invitrogen, Life Technologies, Paisley, UK). The proband and available family members were sequenced for candidate rare recessive sequence variants. PCR products were sequenced using the ABI PRISM BigDye Terminator Cycle Sequencing V2.0 Ready Reaction Kit and analysed with the ABI PRISM 3730 DNA analyser (Applied Biosystems, Foster City, California, USA). (Primer sequences and PCR conditions are available upon request.)

Exome copy number analysis and genomic real-time quantitative PCR analysis for detection of homozygous deletions

Exome copy number analysis was performed by use of cn.MOPS.13 Samples were analysed with a reference set of 30 exomes for comparison. The threshold for detection of homozygous deletions was set at Z-score <−2.5. To validate the presence of candidate homozygous deletions, a SYBR Green-based real-time quantitative PCR (QPCR) analysis was performed on genomic DNA of the proband and available family members. Primers were designed by using Primer 3 (V.0.4.0).12 (Primer sequences available upon request.) Genomic QPCR was performed on a 7900 Fast Real-Time PCR System (Applied Biosystems) by using GoTaq qPCR Master Mix (Promega, Madison, Wisconsin, USA) according to the manufacturer’s instructions. QPCR quantifications were performed in duplicate on 10 ng of genomic DNA and included a water control. Experimental threshold cycle (Ct) values were within the range of DNA dilutions used to validate the primers. The melt curves of all PCR products showed a single PCR product. All water controls were negative. Copy numbers were measured relative to SLC16A2 (NM_006517.3) on the X chromosome. Differences in copy number of a genomic sequence of interest between the individual test sample and a control sample were calculated by the comparative Ct or 2^ΔΔCt method,14,15 after correction for gender.
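As an illustration of the comparative Ct calculation, the following Python sketch estimates the copy number of a target sequence relative to a reference gene. It assumes roughly 100% amplification efficiency and uses the common sign convention ratio = 2^(−ΔΔCt), with ΔΔCt defined as sample minus control. The function name, its parameters, and the exact form of the sex correction (scaling by the number of X-linked reference copies) are assumptions made for this example, not details taken from the study.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_control, ct_ref_control,
                         ref_copies_sample=2, ref_copies_control=2,
                         target_copies_control=2):
    """Estimate the target copy number in a test sample (comparative Ct method).

    Assumes each PCR cycle doubles the product (about 100% efficiency).
    ref_copies_* is the number of reference-gene copies: for an X-linked
    reference, 1 in males and 2 in females (assumed form of the correction).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    # Target-to-reference ratio in the sample relative to the control.
    ratio = 2.0 ** (-delta_delta)
    # Convert the ratio to an absolute copy number estimate.
    return target_copies_control * ratio * ref_copies_sample / ref_copies_control
```

For example, a female sample whose target amplifies one cycle later than in a female control yields an estimate of one copy, consistent with a heterozygous deletion. A homozygous deletion gives no target amplification, so no Ct value is available for this formula.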

Results

Identification of candidate recessive variants by exome sequencing

Nineteen individuals with presumed autosomal recessive or X-linked ID were assessed by exome sequencing. We obtained on average 5.1 Gb of mappable sequence data per exome, resulting in an average 58-fold coverage. More than 85% of the targets were covered more than 10 times (see online supplementary figure S2 and table S2). Comparison of sequence reads to the reference genome (GRCh37/hg19) showed between 23 212 and 32 149 sequence variants per exome (see online supplementary table S2). Exclusion of non-genic, intronic, and synonymous sequence variants (but not canonical splice sites), as well as gene variants found in >1% of the population, resulted in an average of 230 rare variants per exome (range 158–413). Further selection for recessive variants revealed an average of five candidate homozygous variants (range 1–9) and eight candidate compound heterozygous variants (in four genes, range 1–10 genes) per exome. This is about half the number of candidate variants that were identified upon exome sequencing of an affected individual in a consanguineous family using the same approach (Z Iqbal, personal communication, in 2013). For the brother–brother pair families (n=3), we also considered an X-linked mode of inheritance. This resulted in an average of five candidate variants on the X chromosome (range 4–5).

Confirmation of candidate recessive variants and segregation analysis

For all selected homozygous and compound heterozygous variants, we performed Sanger sequencing to confirm the presence of the variant in the proband and to analyse its segregation within the family. Of note, 47 candidate variants had a 2–4x variant coverage of which 14 (30%) were confirmed as homozygous and 22 (47%) as heterozygous by Sanger sequencing, indicating that our cut-off of ≥2 variant reads for variant selection is crucial in order not to miss potential candidate mutations (see online supplementary table S3). In total, we found four genes with homozygous variants, eight genes with compound heterozygous variants, and six genes with hemizygous variants, which segregated with the disease phenotype within the 19 families (table 1 and see online supplementary table S4). Consanguinity was not reported in any of the families, and also the percentage of homozygosity that was detected with 250 k SNP analysis did not suggest consanguinity. Nevertheless, four homozygous gene variants were detected.
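The confirmation rates quoted above follow from simple proportions. The short Python check below recomputes them from the counts given in the text. The combined confirmation rate is derived here for illustration and is not stated in the text; the variable and function names are hypothetical.

```python
# Counts taken from the text: candidate variants with 2-4x variant coverage.
low_coverage_candidates = 47
confirmed_homozygous = 14
confirmed_heterozygous = 22


def percent(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


pct_homozygous = percent(confirmed_homozygous, low_coverage_candidates)
pct_heterozygous = percent(confirmed_heterozygous, low_coverage_candidates)

# Derived figure (not in the text): all low-coverage candidates confirmed by Sanger.
confirmed_total = confirmed_homozygous + confirmed_heterozygous
pct_confirmed = percent(confirmed_total, low_coverage_candidates)

print(pct_homozygous, pct_heterozygous, pct_confirmed)
```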

Table 1

Summary of (candidate) syndromic and non-syndromic intellectual disability genes and their respective mutations that were identified in the investigated families

Detection of homozygous deletions

In addition to variant analysis at the base-pair level, we performed copy number variation analysis on the mapped exome data by using cn.MOPS,13 for detection of homozygous deletions. This analysis showed three potential homozygous deletions, which were followed up by genomic QPCR to validate the presence of the homozygous deletion in the proband and to test segregation with the disease phenotype. All three homozygous deletions were present in the proband but none segregated with the phenotype in the families and were therefore not regarded as pathogenic (see online supplementary figure S3 and table S5).

Variant classification

Variants that segregated with the disease phenotype were classified as pathogenic if they resided either in a known ID gene or in a novel gene in which a second pathogenic mutation was identified in a patient with a similar phenotype, upon further study. This was the case for three of the 19 families (figure 1). If there was no such finding, variants were classified as either potentially pathogenic or likely benign. Variants were labelled potentially pathogenic if they fulfilled four criteria: (1) the gene is already linked to a human neurologic phenotype or is not at all linked to a human phenotype as described in the online OMIM database (http://www.ncbi.nlm.nih.gov/OMIM/); (2) there is mRNA expression of the respective gene in brain/neuronal tissue according to the expressed sequence tags database (http://www.ncbi.nlm.nih.gov/dbEST/); (3) the effect of missense variant(s) is predicted to be ‘disease causing/damaging’ by either SNPs&Go16 and/or PolyPhen-217; (4) the altered amino acid is conserved among vertebrates (see figure 2 for a progressive scheme). Variants that did not fulfil all criteria were classified as likely benign (table 1, see online supplementary table S4). Variants classified as potentially pathogenic (figure 3) were found in five of the 19 families reported in this study.
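The classification scheme described above can be expressed as a small decision function. The Python sketch below is only an illustration: the `SegregatingVariant` record and its boolean fields are hypothetical names, and each flag is assumed to have been determined beforehand from the databases and prediction tools named in the text.

```python
from dataclasses import dataclass


@dataclass
class SegregatingVariant:
    """Hypothetical summary of the evidence for one segregating variant."""
    in_known_id_gene: bool
    second_mutation_in_similar_patient: bool
    neurologic_or_no_omim_phenotype: bool   # criterion 1
    expressed_in_brain: bool                # criterion 2
    predicted_damaging: bool                # criterion 3
    residue_conserved_in_vertebrates: bool  # criterion 4


def classify(v: SegregatingVariant) -> str:
    """Return 'pathogenic', 'potentially pathogenic' or 'likely benign'."""
    if v.in_known_id_gene or v.second_mutation_in_similar_patient:
        return "pathogenic"
    criteria = (
        v.neurologic_or_no_omim_phenotype,
        v.expressed_in_brain,
        v.predicted_damaging,
        v.residue_conserved_in_vertebrates,
    )
    # All four criteria must hold; otherwise the variant is likely benign.
    return "potentially pathogenic" if all(criteria) else "likely benign"
```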

Figure 1

Study design and pathogenic mutations in known and novel genes for intellectual disability. (A) Study design: (1) family selection represented by the pedigree of family W08-0135; (2) exome sequencing and variant selection resulting in the candidate mutation c.1639G>T in SLC9A6, represented by the raw sequence reads of our candidate mutation; (3) validation and segregation testing by Sanger sequencing, depicted by the chromatograms showing the mutant sequence that was present in both affected male individuals and the wild type sequence present in a control individual; (4) variant interpretation (see figure 2 for a progressive variant interpretation scheme and table 1 and online supplementary table S4 for further details of individual identified variants). (B) Pedigree of family W10-1338, showing segregation of the compound heterozygous mutations, c.1804_1805insT and c.2057delA in DDHD2. (C) Pedigree of family W10-2749, showing segregation of the hemizygous amino acid deletion, c.1005_1007delCAA, in SLC6A8. Arrows indicate the probands. M, mutation present; M1 and M2 are the two different alleles of compound heterozygous mutations; -, normal allele present; P, patient; C, control individual.

Figure 2

Flow diagram showing step-by-step variant classification of segregating recessive sequence variants. Rare gene variants (non-synonymous and protein truncating) that segregated with the intellectual disability (ID) phenotype were classified as pathogenic if they resided either in a known syndromic or non-syndromic ID gene, or in a novel gene in which a second pathogenic mutation was identified in a patient with a similar phenotype upon further study. Variants were labelled potentially pathogenic if they fulfilled four criteria: (1) the gene is already linked to a human neurologic phenotype or is not at all linked to a human phenotype as described in the OMIM database (http://www.ncbi.nlm.nih.gov/OMIM/); (2) there is mRNA expression of the respective gene in brain/neuronal tissue according to the expressed sequence tags database (http://www.ncbi.nlm.nih.gov/dbEST/); (3) effect of missense variant(s) is predicted to be ‘disease causing/damaging’ by either SNPs&Go16 and/or PolyPhen-217; (4) the altered amino acid is conserved among vertebrates. Variants that did not fulfil these four criteria were classified as likely benign. CNS, central nervous system.

Figure 3

Segregation of potentially pathogenic missense variants. (A) Pedigree of family W05-385 with Sanger confirmation of homozygous variant c.2743G>A (M) in MCM3AP. (B) Pedigree of family W07-1601 with Sanger confirmation of the hemizygous variant c.2459A>G (M) in BCORL1. (C) Pedigree of family W09-1109 with Sanger confirmation of variant c.4094C>T in PTPRT. Individuals in grey have borderline intellectual functioning. Bottom panel shows the 150 kb deletion on chromosome 20 in the proband, .arr snp 20q12q13.11(SNP_A-2168377->SNP_A-4194425)x1, detected with 250k SNP array testing. (D) Pedigree of family W10-1137 with Sanger confirmation of variants, c.1964A>G (M1), c.9262G>A (M2) and c.11675T>C (M3) in SYNE1. (E) Pedigree of family W11-3472 with heterozygous variants c.193T>G (M1) and c.1034G>A (M2) in ZNF582.

Pathogenic mutations in SLC9A6, SLC6A8, and DDHD2

In two of the three brother–brother pair families, we identified pathogenic mutations in known X-chromosomal ID genes, SLC9A6 and SLC6A8 (figure 1A,C). In SLC9A6 (OMIM 300243),18 a c.1639G>T mutation, resulting in a premature stop codon, p.Glu547*, was identified in two brothers (W08-0135) with severe ID, a friendly personality, microcephaly, epilepsy, and ataxic gait. Sanger validation confirmed the presence of this mutation hemizygously in the two brothers and in the heterozygous state in their carrier mother. The observed Angelman-like phenotype in family W08-0135 is very similar to the phenotype previously described in families with mutations in SLC9A6,18 underlining the pathogenicity of this change. In SLC6A8 (OMIM 300352),19 we detected a single amino acid residue deletion, c.1005_1007delCAA, in family W10-2749. This change results in the deletion of an asparagine at amino acid residue position 336 (p.Asn336del) within the sodium neurotransmitter symporter domain of SLC6A8. This same mutation has been classified as pathogenic in previous studies20 and has been proven to disrupt the transporter function of SLC6A8 in vivo.21

Family W10-2749 of Dutch origin consisted of three affected brothers, two healthy brothers, and three healthy sisters. One of the affected males died at 13 years of age. All three affected brothers presented with severe ID and epilepsy. Facial asymmetry was noted in two of the three affected brothers. Additional measurement of creatine and creatinine in the urine of the two living affected brothers showed an elevated creatine concentration (6459 and 3227 μmol/L for individuals 7 and 8 in figure 1C, respectively) and an elevated creatine/creatinine ratio (2084 and 1796 μmol/mmol creatinine for individuals 7 and 8, respectively), thereby confirming the molecular diagnosis at the metabolite level. Of note, manual inspection of the sequence reads for the g.152958810_152958812delCAA variant showed the deletion in five of the 29 reads, an unexpected finding since males have only one X chromosome. However, SLC6A8 is located in a segmental duplication region on Xq28 that shows >90% sequence identity with two segmental duplication regions on chromosome 16p11.2 (hg19, chr16: 32 872 610–32 899 081 and chr16: 33 776 247–33 802 719). Given this high sequence homology, we suspect that either the wild type or the mutation reads were mapped incorrectly. Sanger sequencing with specific primers for SLC6A8 confirmed the hemizygous presence of this three base pair deletion in the two affected living brothers, and in the heterozygous state in their carrier mother. This example, where sequence homology interferes with correct mapping of sequence reads, highlights one of the challenges of exome data interpretation.

Pathogenic compound heterozygous frameshift mutations (c.1804_1805insT and c.2057delA, resulting in p.Thr602Ilefs*18 and p.Glu686Glyfs*35), were identified in DDHD2 in a brother and sister of a Dutch-Philippine family W10-1338 (figure 1B). The phenotype consisted of ID and spastic paraplegia as well as a thin corpus callosum and subtle periventricular white matter hyperintensities revealed by cerebral imaging. DDHD2 is one of three mammalian intracellular phospholipase A1 enzymes and is involved in organelle biogenesis and intracellular trafficking,22,23 a function that is shared among several genes that have been implicated in clinically similar complex phenotypes with spasticity and ID: SPG11 (SPG11, OMIM 604360), ZFYVE26 (SPG15, OMIM 270700), and AP4B1 (SPG47, OMIM 607245).24–26 Pathogenicity of the DDHD2 mutations was supported by follow-up studies that resulted in the identification of recessive pathogenic mutations in three additional families with a comparable clinical presentation,27 thus highlighting the value of detailed clinical phenotyping for selection of follow-up cohorts for variant interpretation.

Potentially pathogenic mutations in BCORL1, MCM3AP, PTPRT, SYNE1, and ZNF582

In BCORL1, a potentially pathogenic unique hemizygous missense variant, c.2459A>G (p.Asn820Ser, table 1), was identified in a brother–brother pair (W07-1601, figure 3B) with severe ID, coarse face, and hypotonia. The asparagine at position 820 is conserved among vertebrates. Of note, there were no unique autosomal variants that segregated in both boys. BCORL1 functions as a transcriptional corepressor and interacts with class II histone acetyltransferases and deacetylases (HDAC), HDAC4, HDAC5, and HDAC7 and also with CtBP corepressor protein. Transcriptional repressors such as BCOR (OMIM 300166), ARX (OMIM 300419), MECP2 (OMIM 312750), EHMT1 (OMIM 610253), and FOXG1 (OMIM 613454) have previously been implicated in ID syndromes.28–32

A homozygous potentially pathogenic variant, c.2743G>A, was identified in MCM3AP in a brother and sister with borderline to mild ID, progressive polyneuropathy, and ptosis (W05-385, figure 3A). Parental DNA to test for heterozygosity of this variant in the parents was not available. The homozygous missense change, p.Glu915Lys, alters an amino acid that is conserved among vertebrates and is predicted to be damaging to the protein structure by Polyphen-2.17 MCM3AP encodes minichromosome maintenance 3 (MCM3) acetylating protein that binds and acetylates MCM3 and thereby inhibits cell cycle progression.

The combination of a heterozygous missense variant, c.4094C>T (p.Thr1365Met, table 1), and a heterozygous intronic deletion of 150 kb (.arr snp 20q12q13.11(SNP_A-2168377->SNP_A-4194425)x1 mat) was identified within PTPRT in a family with three affected brothers, two affected sisters, one healthy brother, and two healthy sisters (W09-1109, figure 3C). The affected individuals in this family presented with a complex phenotype consisting of severe ID, behavioural problems, microcephaly, congenital cardiac defect, and herniation of the abdominal diaphragm. PTPRT is more highly expressed in the central nervous system than in other tissues, such as bone marrow, kidney, and skin. PTPRT encodes a transmembrane receptor of the protein tyrosine phosphatase family, which are important proteins in signal transduction. The p.Thr1365Met substitution resides in the second of the two protein tyrosine phosphatase catalytic domains of the protein and affects a threonine that is conserved in vertebrates. The deletion removes more than half of intron 1, and has not been reported in healthy controls.33

In SYNE1 (OMIM 608441) we identified three heterozygous missense variants in a Sicilian brother–sister pair (W10-1137, figure 3D). Both siblings presented with mild ID, spastic paraplegia, axon neuropathy, and leukoencephalopathy, while a hypoplastic corpus callosum was observed in the female only. These mutations result in three amino acid substitutions: c.1964A>G and c.9262G>A (p.Gln655Arg and p.Ala3088Thr, table 1), which were inherited from the father; and c.11675T>C (p.Leu3892Ser, table 1), which was inherited from the mother. DNA of Sicilian population control samples was not available to test whether the variants were just ethnic-specific polymorphisms. The paternal p.Gln655Arg and maternal p.Leu3892Ser substitutions are located in two of the multiple spectrin repeats of SYNE1, and the second paternal p.Ala3088Thr substitution is in close proximity (14 amino acids) to such a repeat. SYNE1 is one of the largest human genes consisting of 146 exons, with multiple transcripts, and a notably high expression in the human central nervous system.34 It is a member of the spectrin family of structural proteins that link the plasma membrane to the actin cytoskeleton. Studies in skeletal muscle indicate a role for SYNE1 in motor neuron innervation.35 Nonsense mutations in SYNE1 have been described in autosomal recessive spinocerebellar ataxia type 8 (OMIM 610743),36 and splice site and missense mutations (the latter all located in, or in close proximity to, spectrin repeats) have been linked to autosomal dominant Emery–Dreifuss muscular dystrophy type 4, a condition with childhood onset (OMIM 612998).35 Remarkably, the parents of family W10-1137 are in good health and do not present with signs of muscular dystrophy, despite being carriers of heterozygous missense mutations in, or in close proximity to, the spectrin repeats similar to the variants in Emery–Dreifuss muscular dystrophy type 4. Furthermore, a homozygous splice site mutation has been reported in one family with autosomal recessive arthrogryposis.37 All three SYNE1 variants in family W10-1137 were predicted as being probably damaging by PolyPhen-217 but predicted to be benign by SNPs&Go16 (see online supplementary table S4). Together, conservation of these amino acids within most vertebrates, their localisation within functional domains of the protein, the reported function of SYNE1 in at least peripheral neurons, and the possible involvement in different neurodegenerative and neuromuscular phenotypes, suggest that these gene variants might explain at least part of the phenotype of family W10-1137. This could be due to an additive effect of inheriting three variants in SYNE1 in a compound heterozygous state. We cannot rule out the possibility, however, that mutations in another gene, either alone or together with the SYNE1 variants, are causative factors in the phenotype observed in this family.

Lastly, in another transcriptional repressor, ZNF582, compound heterozygous missense variants c.193T>G and c.1034G>A (p.Trp65Gly and p.Gly345Glu, respectively; table 1) were identified in two sisters with mild ID and an eye movement disorder (W11-3472, figure 3E). ZNF582 is located within a zinc-finger cluster on chromosome 19q13.43 and belongs to the family of Kruppel-associated box (KRAB)-containing zinc finger proteins. ZNF582 harbours nine zinc finger domains that are essential for DNA binding. The p.Gly345Glu substitution is positioned next to the second cysteine of the C2H2–zinc ion binding motif of the sixth zinc finger domain. The p.Trp65Gly substitution is located in a conserved amino acid sequence of the Kruppel-associated box. Several members of the large KRAB-containing zinc finger protein family have been implicated in X-linked ID (ZNF41, OMIM 300848;38 ZNF81, OMIM 300498;31 and ZNF674, OMIM 300851).39 For both transcriptional repressors discussed in this study, BCORL1 and ZNF582, the combination of the mutation characteristics, the gene function, and other gene family members already implicated in ID, support the hypothesis that BCORL1 and ZNF582 may be involved in disease pathology.

Families without candidate mutations

In 11 families, we had either no segregating variants or segregating variants in genes that are unlikely candidates to explain the disease phenotype (see online supplementary table S4). This could be the result of technical limitations—that is, the mutations are located outside the regions captured in our exome experiment, or they are captured but poorly sequenced and/or difficult to map. In some of the families there may also be a different inheritance model than anticipated—for example, one of the parents could be a carrier of a germline mosaicism of a dominantly inherited mutation in an unknown ID gene (these were excluded in our analysis), or a more complex (digenic or multigenic) inheritance pattern could be present in those families.

Discussion

In 19 families with two to five affected siblings, we identified pathogenic mutations in one novel, two known and five candidate syndromic and non-syndromic ID genes. The diagnostic yield of variants that could convincingly be classified as pathogenic is thus three out of 19 families (16%) in this study of a relatively small sample size. Two variants that were classified as pathogenic were identified in X-chromosomal ID genes in two of three affected brother–brother pair families, and mutations in an autosomal gene, DDHD2, were identified in an affected brother–sister pair. This result is not unexpected since X-chromosomal mutations have a considerable contribution to disease in families in which only male siblings are affected.5 Interpretation of variants on the X chromosome is relatively straightforward because, in contrast to ARID genes, the majority of X-chromosomal syndromic and non-syndromic ID genes have already been identified. The diagnostic yield in brother–sister and sister–sister pairs will presumably improve when more autosomal recessive syndromic and non-syndromic ID genes are identified. Potentially pathogenic mutations were found in five genes that have not previously been linked to ID phenotypes, but further clinical and molecular studies are necessary in order to provide conclusive evidence for involvement of these genes in this disorder. Given the heterogeneous nature of ID, we would suggest that there is a need for systematic storage of combined clinical and exome data worldwide in order to facilitate interpretation of all candidate gene mutations for this relatively common disorder.

In this study, variant selection resulted in an average of 13 potentially recessive rare variants per exome that required follow-up by Sanger sequencing, as well as four potentially hemizygous rare variants on the X chromosome in the brother–brother pair families. Segregation testing in combination with variant interpretation and subsequent variant classification reduced the number of potentially causative variants to zero or one per family. These results are comparable to those of studies that use family-based exome sequencing in isolated ID cases (eg, three exomes, two from the unaffected parents and one from the affected child), which identified zero to two de novo candidate mutations per family,3,4 and to those of studies of large consanguineous families that combined homozygosity mapping with targeted (but not genome-wide) massively parallel sequencing to identify homozygous candidate mutations.40 With the unbiased genome-wide approach used in this study, only one exome per family needed to be studied. This indicates that our approach, focusing on rare autosomal recessive and, in the case of brother–brother pairs, X-linked sequence variants in the coding region of the human genome, resulted in a manageable number of candidate mutations for follow-up studies. Such studies include collaborative screening of large cohorts of patients with syndromic and non-syndromic ID for additional mutations, biological assays proving impairment of protein function, and/or animal studies that support a role for these candidate genes in intellectual functioning.
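The segregation step described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in the study: the genotype encoding (number of alternate alleles, 0, 1 or 2), the sample names, the variant labels and the function `segregates_recessive` are all assumptions introduced for the example. It covers only homozygous autosomal recessive variants, not compound heterozygous or X-linked ones.

```python
from typing import Dict, List

# Hypothetical encoding: number of alternate alleles per person (0, 1 or 2).
Genotypes = Dict[str, int]


def segregates_recessive(
    genotypes: Genotypes,
    affected: List[str],
    unaffected_sibs: List[str],
    parents: List[str],
) -> bool:
    """Return True if a homozygous variant fits autosomal recessive inheritance.

    Assumed rules: every affected sibling is homozygous for the variant,
    both parents are heterozygous carriers, and no unaffected sibling
    is homozygous.
    """
    if any(genotypes[person] != 2 for person in affected):
        return False
    if any(genotypes[person] != 1 for person in parents):
        return False
    return all(genotypes[person] < 2 for person in unaffected_sibs)


# Made-up candidate variants for one family (illustrative data only).
candidates = {
    "variant_A": {"sib1": 2, "sib2": 2, "sib3": 1, "father": 1, "mother": 1},
    "variant_B": {"sib1": 2, "sib2": 1, "sib3": 0, "father": 1, "mother": 1},
    "variant_C": {"sib1": 2, "sib2": 2, "sib3": 2, "father": 1, "mother": 1},
}

remaining = [
    name
    for name, gt in candidates.items()
    if segregates_recessive(gt, ["sib1", "sib2"], ["sib3"], ["father", "mother"])
]
print(remaining)  # ['variant_A']
```

In this toy family, variant_B is rejected because one affected sibling is only heterozygous, and variant_C because the unaffected sibling is also homozygous, which leaves a single candidate for interpretation and classification.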

In conclusion, in our study of 19 small non-consanguineous families with between two and five siblings with syndromic or non-syndromic ID, exome sequencing identified definite pathogenic mutations in three families and potentially pathogenic mutations in five. This gives a diagnostic yield of 16% (3/19) for definite diagnoses, which would increase if the candidate mutations are confirmed. These results demonstrate that recessive mutations can be identified in families with as few as two affected siblings, which opens up this previously less accessible group of patients with ID to the identification of novel autosomal recessive and X-linked ID genes. In addition, the identification of unambiguously pathogenic heterozygous mutations in the parents offers these families, which have a high recurrence risk of 25% (AR) or 50% (X-linked; males), the possibility of obtaining molecular diagnosis, genetic counselling and prenatal diagnosis.
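The figures quoted in this paragraph follow from simple arithmetic, sketched below. The function names are illustrative assumptions. The recurrence risks assume full penetrance, two heterozygous carrier parents for the autosomal recessive case, and a heterozygous carrier mother for the X-linked case.

```python
from fractions import Fraction


def diagnostic_yield(solved: int, total: int) -> Fraction:
    """Fraction of families with a conclusive molecular diagnosis."""
    return Fraction(solved, total)


def recurrence_risk_ar() -> Fraction:
    """Risk per child when both parents are heterozygous carriers."""
    # Each parent transmits the mutant allele with probability 1/2.
    return Fraction(1, 2) * Fraction(1, 2)


def recurrence_risk_x_linked_son() -> Fraction:
    """Risk per son when the mother is a heterozygous carrier."""
    # A son inherits his single X chromosome from his mother;
    # it carries the mutation with probability 1/2.
    return Fraction(1, 2)


definite = diagnostic_yield(3, 19)
print(f"{float(definite):.1%}")             # 15.8%
print(recurrence_risk_ar())                 # 1/4
print(recurrence_risk_x_linked_son())       # 1/2
```

Three of 19 families is 15.8%, which rounds to the 16% reported for definite diagnoses.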

Acknowledgments

We are grateful to the patients and their families for their support and cooperation. We thank Jamie M Kramer (Department of Human Genetics, RUNMC, Nijmegen, The Netherlands) for critical review of the manuscript, N de Leeuw (Department of Human Genetics, RUNMC, Nijmegen, The Netherlands) for her assistance with whole genome copy number and genotyping analysis, GS Salomons (Department of Clinical Chemistry, VU University Medical Center, Amsterdam, The Netherlands) for SLC6A8 mutation analysis, and Z Iqbal (Department of Human Genetics, RUNMC, Nijmegen, The Netherlands) for discussion and support.

Supplementary materials

  • Supplementary Data

    This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

Footnotes

  • Contributors The study was designed and the results were interpreted by JHMS-H, BBAdV and APMdB. Subject ascertainment and recruitment were carried out by JHMS-H, ATV-vS, BWMvB, SR, AV, PF, DG, TK, MTG, CWO, MHW, EMHFB, CR, MAW, HGB and BBAdV. Sequencing and genotyping were carried out and interpreted by JHMS-H, LELMV, IIGMvdV, JdL, CG, JYH-K, KN, MdR, GH, MF, OG, IMJ, RP, JAV, HvB and APMdB. The manuscript was drafted by JHMS-H, BBAdV and APMdB. All authors contributed to the final version of the paper.

  • Funding This work was funded in part by grants from the Dutch Organisation for Health Research and Development (917-86-319 to BBAdV, 911-08-0205 to JAV), the EU-funded GENCODYS project (EU-7th-2010-241995 to HvB and BBAdV), the Dutch Brain Foundation (2010(1)-30 to APMdB, 2009(1)-22 to BBAdV), the Italian Ministry of Health (Ricerca Corrente 2012 entitled “Le malattie genetiche con ritardo mentale” to CR, PF, DG, MF, OG) and ‘5 per mille’ funding to CR, PF, DG, MF, OG.

  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval Commissie Mensgebonden Onderzoek Regio Arnhem-Nijmegen (approved by the local ethics committee).

  • Provenance and peer review Not commissioned; externally peer reviewed.
