Original article
Genetic landscape of Rett syndrome-like phenotypes revealed by whole exome sequencing
  1. Kazuhiro Iwama1,2,
  2. Takeshi Mizuguchi1,
  3. Eri Takeshita3,4,
  4. Eiji Nakagawa3,
  5. Tetsuya Okazaki5,
  6. Yoshiko Nomura6,
  7. Yoshitaka Iijima7,
  8. Ichiro Kajiura7,
  9. Kenji Sugai3,
  10. Takashi Saito3,
  11. Masayuki Sasaki3,
  12. Kotaro Yuge8,
  13. Tomoko Saikusa8,
  14. Nobuhiko Okamoto9,
  15. Satoru Takahashi10,
  16. Masano Amamoto11,
  17. Ichiro Tomita11,
  18. Satoko Kumada12,
  19. Yuki Anzai13,
  20. Kyoko Hoshino13,
  21. Aviva Fattal-Valevski14,
  22. Naohide Shiroma15,
  23. Masaharu Ohfu16,
  24. Masaharu Moroto17,18,
  25. Koichi Tanda17,
  26. Tomoko Nakagawa19,
  27. Takafumi Sakakibara19,
  28. Shin Nabatame20,
  29. Muneaki Matsuo21,
  30. Akiko Yamamoto22,
  31. Shoko Yukishita6,
  32. Ken Inoue4,
  33. Chikako Waga4,
  34. Yoko Nakamura4,
  35. Shoko Watanabe23,
  36. Chihiro Ohba1,
  37. Toru Sengoku24,
  38. Atsushi Fujita1,
  39. Satomi Mitsuhashi1,
  40. Satoko Miyatake1,25,
  41. Atsushi Takata1,
  42. Noriko Miyake1,
  43. Kazuhiro Ogata24,
  44. Shuichi Ito2,25,
  45. Hirotomo Saitsu26,
  46. Toyojiro Matsuishi27,
  47. Yu-ichi Goto4,23,
  48. Naomichi Matsumoto1
  1. 1 Department of Human Genetics, Graduate School of Medicine, Yokohama City University, Yokohama, Japan
  2. 2 Department of Pediatrics, Graduate School of Medicine, Yokohama City University, Yokohama, Japan
  3. 3 Department of Child Neurology, National Center Hospital, National Center of Neurology and Psychiatry, Tokyo, Japan
  4. 4 Department of Mental Retardation and Birth Defect Research, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Tokyo, Japan
  5. 5 Department of Pediatrics, Nippon Medical School Tama-Nagayama Hospital, Tokyo, Japan
  6. 6 Yoshiko Nomura Neurological Clinic for Children, Tokyo, Japan
  7. 7 Division of Pediatrics, Osaka Developmental Rehabilitation Center, Osaka, Japan
  8. 8 Department of Pediatrics and Child Health, Kurume University School of Medicine, Fukuoka, Japan
  9. 9 Department of Medical Genetics, Osaka Women’s and Children’s Hospital, Osaka, Japan
  10. 10 Department of Pediatrics, Asahikawa Medical University, Hokkaido, Japan
  11. 11 Department of Pediatrics, Kitakyushu Municipal Yahata Hospital, Fukuoka, Japan
  12. 12 Department of Neuropediatrics, Tokyo Metropolitan Neurological Hospital, Tokyo, Japan
  13. 13 Segawa Memorial Neurological Clinic for Children, Tokyo, Japan
  14. 14 Pediatric Neurology Unit, Dana-Dwek Children’s Hospital, Tel Aviv Medical Center and Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  15. 15 Neurodevelopment Clinic Prop, Okinawa, Japan
  16. 16 Division of Child Neurology, Okinawa Prefectural Nanbu Medical Center and Children’s Medical Center, Okinawa, Japan
  17. 17 Department of Pediatrics, Japanese Red Cross Kyoto Daiichi Hospital, Kyoto, Japan
  18. 18 Kyoto Prefectural Chutan-Nishi Public Health Center, Kyoto, Japan
  19. 19 Department of Pediatrics, Nara Medical University, Nara, Japan
  20. 20 Department of Pediatrics, Graduate School of Medicine, Osaka University, Osaka, Japan
  21. 21 Department of Pediatrics, Saga University, Faculty of Medicine, Saga, Japan
  22. 22 Division of Pediatrics, Tokyo Metropolitan Tobu Medical Center for Persons with Developmental and Multiple Disabilities, Tokyo, Japan
  23. 23 Department of Genome Medicine Development, Medical Genome Center (MGC), National Institute of Neuroscience, National Center of Neurology and Psychiatry, Tokyo, Japan
  24. 24 Department of Biochemistry, Graduate School of Medicine, Yokohama City University, Yokohama, Japan
  25. 25 Clinical Genetics Department, Yokohama City University Hospital, Yokohama, Japan
  26. 26 Department of Biochemistry, Hamamatsu University School of Medicine, Hamamatsu, Japan
  27. 27 Research Center for Children and Research Center for Rett Syndrome, St. Mary’s Hospital, Fukuoka, Japan
  1. Correspondence to Professor Naomichi Matsumoto, Department of Human Genetics, Graduate School of Medicine, Yokohama City University, Yokohama 236-0004, Japan; naomat{at}yokohama-cu.ac.jp

Abstract

Background Rett syndrome (RTT) is a characteristic neurological disease presenting with regressive loss of neurodevelopmental milestones. Typical RTT is generally caused by abnormality of methyl-CpG binding protein 2 (MECP2). Our objective was to investigate the genetic landscape of MECP2-negative typical/atypical RTT and RTT-like phenotypes using whole exome sequencing (WES).

Methods We performed WES on 77 MECP2-negative patients either with typical RTT (n=11), atypical RTT (n=22) or RTT-like phenotypes (n=44) incompatible with the RTT criteria.

Results Pathogenic or likely pathogenic single-nucleotide variants in 28 known genes were found in 39 of 77 (50.6%) patients. WES-based CNV analysis revealed pathogenic deletions involving six known genes (including MECP2) in 8 of 77 (10.4%) patients. Overall, diagnostic yield was 47 of 77 (61.0%). Furthermore, strong candidate variants were found in four novel genes: a de novo variant in each of ATPase H+ transporting V0 subunit A1 (ATP6V0A1), ubiquitin-specific peptidase 8 (USP8) and microtubule-associated serine/threonine kinase 3 (MAST3), as well as biallelic variants in nuclear receptor corepressor 2 (NCOR2).

Conclusions Our study provides a new landscape including additional genetic variants contributing to RTT-like phenotypes, highlighting the importance of comprehensive genetic analysis.

  • rett syndrome
  • whole exome sequencing
  • mast3
  • usp8
  • ncor2

Introduction

Rett syndrome (RTT, MIM 312750) is a neurodevelopmental disorder with regressive loss of neurodevelopmental milestones. The frequency of typical RTT is about 1 in 10 000 live female births and approximately 95%–97% of cases harbour pathogenic variants in the gene encoding methyl-CpG binding protein 2 (MECP2).1 Typical and atypical RTT are clinically diagnosed based on the diagnostic criteria.2 RTT-like phenotypes, which are suspected of being associated with RTT by clinicians but incompatible with typical and atypical RTT, have been investigated by different groups primarily using whole exome sequencing (WES).1 3–5 So far, a total of 64 genes have been found to be associated with RTT-like phenotypes and registered in the Human Gene Mutation Database (HGMD) V.2018.2 (Qiagen, Venlo, The Netherlands), including MECP2, cyclin-dependent kinase-like 5 (CDKL5) and forkhead box G1 (FOXG1). Patients with RTT-like phenotypes often show severe intellectual disability (ID), developmental delay (DD), autism spectrum disorders (ASD) and/or epilepsy.5 RTT-like phenotypes also overlap with other syndromes such as Angelman syndrome, Pitt-Hopkins syndrome and West syndrome.1 6 The genetic diagnostic rates of RTT-like phenotypes in three previous studies were 8/19 (42.1%),5 13/19 (68.4%)1 and 17/34 (50%),3 although their sample sizes were relatively small. Here, we analyse a total of 77 patients with RTT-like phenotypes by WES. We also describe and discuss the genetic results in detail.

Materials and methods

Patients and MECP2 prescreening

In this study, we initially recruited 115 probands who met at least one main or supportive RTT diagnostic criterion.2 They were classified into three categories: (1) typical RTT, (2) atypical RTT or (3) RTT-like phenotypes. Five patients among them had already been reported7–11 (table 1). Typical and atypical RTT were evaluated based on the main criteria for RTT (online supplementary table S1).2 Patients with RTT-like phenotypes were recruited by their clinicians, who considered them to have possible typical or atypical RTT, but the cases turned out to be incompatible with these diagnoses. Patients showing simple developmental delay without main or supportive criteria of RTT were excluded. Based on this classification, 35, 29 and 51 probands were classified into typical RTT, atypical RTT and RTT-like phenotypes, respectively (figure 1). Informed consent was obtained from the patients’ guardians, in accordance with Japanese regulatory requirements. In all patients, Sanger sequencing was used to identify MECP2 single-nucleotide variants (SNVs) in our laboratory. Multiplex ligation-dependent probe amplification (MLPA) was performed in 19 probands.

Figure 1

Flow chart of this study and diagnostic yield in each clinical phenotype group. We initially recruited 115 probands, including 35 with typical Rett syndrome (RTT), 29 with atypical RTT and 51 patients with RTT-like phenotypes. Whole exome sequencing (WES) was performed for 77 of 115 (67.0%) methyl-CpG binding protein 2 (MECP2)-negative probands. In the subgroup of patients without MECP2 prescreening, MECP2-positive rates in typical RTT, atypical RTT and RTT-like phenotypes were 16/20 (80.0%), 7/12 (58.3%) and 5/18 (27.8%), respectively, while in patients who had already received MECP2 prescreening before this study, 10 probands (including 8 with typical RTT and 2 with RTT-like phenotypes) harboured MECP2 abnormalities. Pathogenic or likely pathogenic single-nucleotide variants (SNVs) (according to American College of Medical Genetics/Association of Molecular Pathology criteria) were found in 39/77 (50.6%) probands. WES-based CNV analysis in probands with no causative SNVs identified pathogenic CNVs in 8/77 (10.4%), including three MECP2 deletions and deletions in five other regions. From 30/77 (39.0%) probands, we found four variants of uncertain significance (VUS) in known genes and five candidate variants in four novel genes. MLPA, multiplex ligation-dependent probe amplification.

Table 1

Profiles of 115 patients with RTT, atypical RTT or RTT-like phenotypes

Whole exome sequencing

WES was performed as previously described.12 Genomic DNA was isolated from peripheral blood leucocytes using QuickGene 610 L (Wako, Osaka, Japan). WES was performed in 77 probands (figure 1). DNA of peripheral blood leucocytes was captured using the SureSelect XT Human All Exon 50 Mb, v4 (51 Mb), v5 (50 Mb) or v6 (60 Mb) Kit (Agilent Technologies, Santa Clara, California, USA), and sequenced on an Illumina HiSeq 2000 or HiSeq 2500 (Illumina, San Diego, California, USA) with 101 bp paired-end reads. R002 and R006 were sequenced on a Genome Analyzer IIx sequencer (Illumina) with 108 bp paired-end reads. Image analysis and base calling were performed using CASAVA software (Illumina). Sequence reads were aligned to GRCh37 and PCR duplicates were excluded using Novoalign V.3.02 (http://www.novocraft.com/) and Picard V.1.98 (http://picard.sourceforge.net/), respectively. Indel realignment and recalibration of base-quality scores were performed using the Genome Analysis ToolKit (GATK V.3.7.0) (http://www.broadinstitute.org/gatk/). Variant calling and annotation were performed using GATK UnifiedGenotyper and ANNOVAR 20160201 (http://www.openbioinformatics.org/annovar/).

The mean read depth of protein-coding regions ranged from 51.7× to 179.8× and an average of 92.1% of target bases were sequenced by 20 or more reads (online supplementary table S2). Common SNPs with minor allele frequencies ≥1% in Single Nucleotide Polymorphism database (dbSNP) 137 and variants that were observed in >5 of our 575 in-house Japanese control exomes were filtered out. Among the remaining rare variants, we focused on amino acid-altering or splice-affecting variants. Particular attention was paid to variants in known causative genes associated with ataxia, cerebellar atrophy and other neurodevelopmental disorders (NDDs). Candidate variants were all confirmed by PCR and Sanger sequencing with an ABI PRISM 3500xl autosequencer (Life Technologies, Carlsbad, California, USA), using genomic DNA from the patients and their parents as a PCR template. Regarding R105, a mosaic variant in MECP2 exon 1 (NM_001110792.1:c.31G>T, p.Gly11*) was confirmed by targeted amplicon sequencing using DNA derived from peripheral blood leucocytes, saliva, nails and hair roots of the affected individuals (online supplementary materials).
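The frequency and consequence filters described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` record, its field names and the consequence labels are hypothetical, and only the two cut-offs (minor allele frequency ≥1% in dbSNP, and more than 5 carriers among the 575 in-house control exomes) are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Cut-offs quoted in the text.
MAX_DBSNP_MAF = 0.01       # variants with MAF >= 1% in dbSNP are filtered out
MAX_INHOUSE_CARRIERS = 5   # variants seen in > 5 of 575 control exomes are filtered out

# Hypothetical labels standing in for "amino acid-altering or splice-affecting".
PROTEIN_ALTERING = {"missense", "nonsense", "frameshift", "inframe_indel", "splice_site"}


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    consequence: str
    dbsnp_maf: Optional[float]  # None when the variant is absent from dbSNP
    inhouse_carriers: int       # number of in-house control exomes carrying it


def is_rare(v: Variant) -> bool:
    """True when the variant passes both frequency filters."""
    common_in_dbsnp = v.dbsnp_maf is not None and v.dbsnp_maf >= MAX_DBSNP_MAF
    common_in_house = v.inhouse_carriers > MAX_INHOUSE_CARRIERS
    return not (common_in_dbsnp or common_in_house)


def filter_candidates(variants: List[Variant]) -> List[Variant]:
    """Keep rare variants that alter the protein or affect splicing."""
    return [v for v in variants if is_rare(v) and v.consequence in PROTEIN_ALTERING]
```

In this sketch a variant carried by exactly 5 control exomes is retained, because the text excludes only variants observed in more than 5.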

WES-based CNV analysis

CNVs were investigated as previously described.13 Briefly, two bioinformatics tools were used: eXome Hidden Markov Model (XHMM)14 15 and Nord’s method,16 which uses a program based on the relative depth of coverage. The Nord program targeted 283 genes reported to be causative of RTT, developmental disorders or epilepsy (online supplementary table S3). Chromosomal imbalances were confirmed by qPCR using a QuantiFast SYBR Green PCR kit (Qiagen) on a Rotor-Gene Q real-time PCR cycler (Qiagen). The relative ratio of genomic DNA copy number was calculated using the standard curve method with Rotor-Gene 6000 Series Software 1.7 (Qiagen) by normalising with syntaxin-binding protein 1 (STXBP1) and fibrillin 1 (FBN1) as internal controls.
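As a rough illustration of the standard curve method mentioned above, the following Python sketch converts Ct values into quantities and then into a relative copy-number ratio. It is not the Rotor-Gene software's implementation. The function names are hypothetical, and the comparison with a normal-copy control sample is an assumption that the text does not spell out.

```python
def quantity_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert a standard curve of the form Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def relative_copy_ratio(
    patient_target_q: float,
    patient_reference_q: float,
    control_target_q: float,
    control_reference_q: float,
) -> float:
    """Target quantity normalised to an internal reference gene, relative to a control sample.

    Assumption: the control sample carries two copies of the target region, so a
    ratio near 0.5 would be compatible with a heterozygous deletion and a ratio
    near 1.0 with normal copy number.
    """
    patient_normalised = patient_target_q / patient_reference_q
    control_normalised = control_target_q / control_reference_q
    return patient_normalised / control_normalised
```

In the study the internal references were STXBP1 and FBN1. The sketch takes a single reference quantity per sample, so it would be run once per reference gene.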

Assessment of pathogenicity and nomination of candidate variants

With regard to the autosomal dominant model, all heterozygous candidate variants in known genes were absent from control databases including the Exome Aggregation Consortium (ExAC, http://exac.broadinstitute.org/), Tohoku Medical Megabank Organisation (ToMMo, http://www.megabank.tohoku.ac.jp/tommo) and Genome Aggregation Database (gnomAD, http://gnomad.broadinstitute.org/). For the autosomal recessive and X linked models, the allele frequency of candidate variants was <1% in all of the above databases. We assessed the pathogenicity of SNVs in known genes in accordance with the American College of Medical Genetics/Association of Molecular Pathology (ACMG/AMP) guidelines.17 For novel genes, we selected candidate variants based on prediction scores of web-based tools as follows: SIFT ≤0.05, PolyPhen2 (HVAR) ≥0.909, MutationTaster >0.95 and CADD >20.
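The prediction-score cut-offs listed above can be written as a small Python check. This is only a sketch: the dictionary keys and function names are hypothetical, and the text does not state whether a variant had to pass every tool, so the `require_all` switch is an assumption.

```python
# Cut-offs quoted in the text; a variant "passes" a tool when the predicate is True.
THRESHOLDS = {
    "SIFT": lambda score: score <= 0.05,
    "PolyPhen2_HVAR": lambda score: score >= 0.909,
    "MutationTaster": lambda score: score > 0.95,
    "CADD": lambda score: score > 20,
}


def damaging_calls(scores: dict) -> list:
    """Names of the tools whose score crosses the quoted cut-off.

    Tools missing from `scores` are skipped rather than treated as benign.
    """
    return [
        tool
        for tool, passes in THRESHOLDS.items()
        if tool in scores and passes(scores[tool])
    ]


def is_candidate(scores: dict, require_all: bool = True) -> bool:
    """Assumption: by default every listed tool must call the variant damaging."""
    calls = damaging_calls(scores)
    if require_all:
        return len(calls) == len(THRESHOLDS)
    return len(calls) > 0
```

Note that SIFT runs in the opposite direction to the other three scores: lower values indicate a more damaging prediction.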

Gene ontology and interactive gene network analyses

We focused on 50 genes related to RTT-like phenotypes, including 30 known and 4 novel genes found in this study, and 16 genes for which solid evidence of such an association was reported in previous studies.1 3–5 18 19 These 50 genes were classified into four gene ontology (GO) terms based on GeneCards (https://www.genecards.org). GO enrichment analysis was performed using GO term finder, comparing the 50 genes with 19 663 reference genes regarding 475 079 codes for molecular functions. P values were corrected for multiple testing by the Bonferroni procedure. Interactive gene networks were analysed using three networks (physical interactions, pathways and shared protein domains) of GeneMANIA (https://genemania.org/) with the default parameters.20
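To make the enrichment step concrete, here is a minimal Python sketch of a one-sided hypergeometric test followed by Bonferroni correction. It assumes a hypergeometric model for over-representation and is not a reimplementation of the GO term finder tool. The function names are hypothetical, and the per-term gene counts in the usage note are invented for illustration.

```python
from math import comb
from typing import List, Optional


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K are annotated with the term.

    k: annotated genes in the study set, n: study set size,
    K: annotated genes in the reference set, N: reference set size.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_values: List[float], n_tests: Optional[int] = None) -> List[float]:
    """Multiply each p value by the number of tests, capped at 1.0."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]
```

With the 50 study genes and 19 663 reference genes from the text, a call such as `hypergeom_upper_tail(10, 50, 300, 19663)` would give the raw p value for a hypothetical term annotated to 300 reference genes and 10 study genes. The number of tests passed to `bonferroni` would be the number of GO terms evaluated.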

Results

Overview of variant detection

A total of 38 patients harboured MECP2 abnormalities (35 SNVs and 3 deletions detected by MLPA, even in patients who had previously undergone MECP2 sequencing). MLPA was performed in only 19 patients. In a subgroup of patients with no MECP2 prescreening, MECP2-positive rates in typical RTT, atypical RTT and RTT-like phenotypes were 16/20 (80.0%), 7/12 (58.3%) and 5/18 (27.8%), respectively, suggesting that RTT patients showed similar high rates of MECP2 abnormalities (figure 1). After excluding these 38 patients, 77 (67.0%) patients were analysed by WES. Pathogenic or likely pathogenic SNVs (according to ACMG/AMP criteria17) were found in 39 (50.6%) probands (table 2). WES-based CNV analyses were performed in 38 SNV-negative patients. We found pathogenic CNVs in 8/77 (10.4%) families, involving three MECP2 deletions and deletions in five other genes (table 2). Overall, a diagnostic rate of 61.0% was achieved (SNVs: 50.6% and CNVs: 10.4%). From the other 30 (39.0%) families, we found four variants of uncertain significance (VUS) in four known genes (online supplementary table S4) and five potential pathogenic variants in four candidate genes (table 3).
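The yields quoted in this paragraph follow directly from the reported counts. The short Python sketch below reproduces the arithmetic. The variable names are illustrative, and the counts are those given in the text.

```python
def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


# Counts taken from the text above.
RECRUITED = 115
MECP2_POSITIVE = 38
SNV_SOLVED = 39
CNV_SOLVED = 8

wes_probands = RECRUITED - MECP2_POSITIVE  # probands analysed by WES
solved = SNV_SOLVED + CNV_SOLVED           # probands with a pathogenic SNV or CNV
unsolved = wes_probands - solved           # probands without a genetic diagnosis

yields = {
    "WES fraction of cohort": percent(wes_probands, RECRUITED),
    "SNV yield": percent(SNV_SOLVED, wes_probands),
    "CNV yield": percent(CNV_SOLVED, wes_probands),
    "overall yield": percent(solved, wes_probands),
    "unsolved": percent(unsolved, wes_probands),
}

for label, value in yields.items():
    print(f"{label}: {value}%")
```

The CNV analysis was run only on the 38 SNV-negative probands, but the 10.4% figure uses all 77 WES probands as the denominator, which is why the SNV and CNV yields add up to the overall 61.0%.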

Table 2

Summary of variant detection

Table 3

Five likely pathogenic variants in four candidate genes

Pathogenic SNVs in known genes

We detected 37 heterozygous pathogenic or likely pathogenic variants (including 16 novel and 21 reported variants) in 26 genes (21 autosomal and 5 X linked) (table 2), one somatic mosaic variant in MECP2 and biallelic variants in palmitoyl-protein thioesterase 1 (PPT1). Pathogenic variants in WD repeat domain 45 (WDR45), which have been reported in β-propeller protein-associated neurodegeneration (BPAN), were found in four probands. We previously reported WDR45 variants in RTT-like phenotypes (females) or West syndrome (males) in childhood.8 21 22 We also found four variants in STXBP1, whose variants cause early-onset epileptic encephalopathy6 23; three variants in SH3 and multiple ankyrin repeat domain 3 (SHANK3), which is associated with ASD24; three variants in ubiquitin protein ligase E3A (UBE3A) associated with Angelman syndrome6 and two variants in FOXG1. In addition to variants in these genes, a single variant was also found in each of 21 known genes related to epileptic encephalopathy, DD, ID, ASD, spinocerebellar ataxia or other syndromes with DD. These results indicated the difficulty in excluding these diseases only by clinical evaluation. All of the mutated genes and their related phenotypes are summarised in table 2. R130 harboured a recurrent de novo calcium/calmodulin-dependent protein kinase II beta (CAMK2B) variant (NM_001220.4:c.416C>T, p.Pro139Leu), which was recently reported to be a causative variant of ID and ASD.25 p.Pro139Leu in CAMK2B is a gain-of-function variant that increases Thr287 autophosphorylation. Both gain-of-function and loss-of-function mutations in Camk2b cause neuronal migration deficits in mice. Patients with the same variant (c.416C>T) showed microcephaly.25 R130 showed the atrophy of frontal lobes and bilateral hippocampus. 
R122 was shown to harbour a de novo cut-like homeobox 2 (CUX2) variant (NM_015267.3:c.1768G>A, p.Glu590Lys), which was also recurrently found in three patients with ID or ASD.26 CUX2 encodes a transcription factor that controls proliferation and differentiation in neural specification.26 R110 showed a missense variant (NM_024496.3:c.1402T>C, p.Ser468Pro) in interferon regulatory factor 2 binding protein-like gene (IRF2BPL), which is highly expressed in a broad range of tissues. IRF2BPL increases the ubiquitination of β-catenin, which is a critical regulator for neuronal development and synaptic plasticity. It has very recently been reported that IRF2BPL variants were found in seven patients with hypotonia, seizures and progressive cerebral atrophy.27 In R129, we found a missense variant (NM_000720.3:c.856G>C, p.Ala286Pro) in calcium voltage-gated channel subunit alpha 1D (CACNA1D) encoding L-type voltage-gated Ca2+ channels (Cav1.3). Two missense variants (p.Gly407Arg and p.Ala749Gly) in ASD probands have also been reported as gain-of-function variants, which increase peak current amplitudes and channel activity.28 R002 harboured a missense variant (NM_018896.4:c.2881G>A, p.Ala961Thr) in calcium voltage-gated channel subunit alpha 1G (CACNA1G) encoding T-type low-voltage-activated Ca2+ channels (Cav3.1). A recurrent CACNA1G missense variant (p.Arg1715His) reported in cerebellar ataxia induced delayed onset of spikes and decreased neuronal excitability.29 30

A male patient (R105) harboured a somatic mosaic variant (NM_001110792.1:c.31G>T, p.Gly11*) at exon 1 of the MeCP2 E1 isoform (MeCP2E1), leading to the skipping of exon 2 (NM_001110792.1). In the brain, MeCP2E1 is a major isoform that is expressed at a higher level than the MeCP2 E2 isoform containing exons 1–4 in the cortex (NM_004992.3) (online supplementary figure S1a).31 Retrospectively, this variant was present in the MECP2 variant prescreening data, but was missed because of the low peak of the variant allele in Sanger sequencing. Deep sequencing and PCR-cloning Sanger sequencing indicated variant allele frequencies of 29%, 31%, 40% and 2% in DNA from blood leucocytes, saliva, nails and hair roots, respectively (online supplementary table S1b and c). A small number of variants in the MeCP2E1 isoform have been reported. To the best of our knowledge, a mosaic variant in the MeCP2E1 isoform has never been reported.

Pathogenic CNVs

Using XHMM algorithms, we found four deletions. R082 harboured a 2.9 Mb deletion at 2p23 involving DNA methyltransferase 3 alpha (DNMT3A). The 2p23 deletion is known to be associated with Tatton-Brown-Rahman syndrome, characterised by overgrowth and ID (online supplementary figure S2a).32 In R008, we found a 960 kb deletion at 9q34.11 involving STXBP1 and endoglin (ENG), the latter of which is associated with autosomal dominant haemorrhagic telangiectasia (online supplementary figure S2b). In addition, R112 harboured a 16.2 kb deletion at Xp11.23 involving WDR45 (online supplementary figure S2c). Furthermore, R078 harboured a 7.2 Mb deletion at 22q13 involving SHANK3 (online supplementary figure S2d). The 22q13 deletion was reported in Phelan-McDermid syndrome showing DD, autistic features or RTT.24 Moreover, Nord’s method revealed four smaller deletions: one involving myocyte enhancer factor 2C (MEF2C) in SN12 (online supplementary figure S2e) and three involving MECP2 in three cases (SN05, SN14 and R040) that had not undergone MLPA targeting MECP2. MEF2C is known as a gene associated with RTT-like phenotypes.1 Regarding the three MECP2 deletions, the smallest deletion was about 1 kb in size, and discordant read pairs of WES data allowed us to determine breakpoints in all of them (online supplementary figure S3). No pathogenic duplication was found in this study.

Five pathogenic variants in four candidate genes

In the 30 unsolved families, a VUS in each of UBE3A, protocadherin 19 (PCDH19), transcription factor 4 (TCF4) and glutamate receptor ionotropic, N-methyl D-aspartate 2B (GRIN2B) was found in SN07, SN16, R016 and R048, respectively. The VUS status in these cases is due to familial samples not being available for analysis (online supplementary table S4). In 26 probands (including 15 trio and 3 quad samples), we found 10 de novo protein-altering variants by trio-based WES, which are not registered in ExAC, gnomAD or ToMMo. After excluding benign variants as predicted by web-based tools, four de novo variants remained: in ATPase H+ transporting V0 subunit A1 (ATP6V0A1), ubiquitin-specific peptidase 8 (USP8) and microtubule-associated serine/threonine kinase 3 (MAST3). We also found biallelic variants in nuclear receptor corepressor 2 (NCOR2) with an allele frequency of <0.1% (table 3). Among these, the variants in ATP6V0A1 and NCOR2 have been reported in a few patients with ID, DD or RTT-like phenotypes,1 33–36 but those in MAST3 and USP8 have not previously been reported. We summarised clinical features of the patients harbouring variants in novel genes in online supplementary table S5.

SN04 harboured a missense MAST1 variant (NM_014975.2:c.1549G>A, p.Gly517Ser, figure 2A). The same missense variant was recently reported as a genetic cause of intellectual disability and cerebellar hypoplasia. In a mouse model, a single amino acid deletion in Mast1 (Leu287del) showed hypoplastic cerebellum and reduction in cortical volume.37 In our current study, R132 showed a missense MAST3 variant (NM_015016.1:c.1528G>A, p.Gly510Ser). The two missense variants in MAST1 and MAST3 are both located at the third position of the so-called ‘Asp-Phe-Gly (DFG)’ motif,38 with the same amino acid change (glycine to serine) (figure 2A). The DFG motif is highly conserved among the protein kinase family and plays an important role in the regulation of catalytic activity. An analogous glycine-to-serine change in another member of the kinase family, p.Gly2019Ser in leucine-rich repeat kinase 2 (LRRK2), is a well-known pathogenic gain-of-function variant, which increases kinase activity and causes autosomal dominant parkinsonism39 (figure 2B). Therefore, the two variants in MAST1 and MAST3 must have large biological impacts.

Figure 2

Schematic presentation of MAST1 and MAST3 variants and interactive gene network among 50 mutated genes found in association with Rett syndrome (RTT)-like phenotypes. (A) Schematic presentation of MAST1 and MAST3 proteins and candidate variants. Coloured squares indicate protein kinase domain (blue), AGC kinase C-terminal domain (orange) and PDZ domain (red). p.(Gly517Ser) in MAST1 and p.(Gly510Ser) in MAST3 were found in this study (black arrows). (B) The activation segment from Asp-Phe-Gly (or Asp-Tyr-Gly) to Ala-Pro-Glu in MAST1, MAST3, and LRRK2 is aligned with a purple bar. Multiple sequence alignment was performed by ClustalW (https://www.genome.jp/tools-bin/clustalw). p.(Gly2019Ser) in leucine-rich repeat kinase 2 (LRRK2) is a well-known pathogenic mutation, which increases kinase activity and causes autosomal dominant parkinsonism (highlighted in light green). p.(Gly517Ser), p.(Gly510Ser) and p.(Gly2019Ser) in MAST1, MAST3 and LRRK2 protein, respectively, are located in the same conserved position in protein kinase domains (red arrowheads). The sequence logo established from 4360 protein kinase domains in Prosite (https://prosite.expasy.org) indicates high conservation of this glycine among several proteins with the same domain. (C) Fifty genes related to RTT-like phenotypes were selected, including 30 known and 4 novel genes in this study (with blue circles), and 16 genes reported with solid evidence in previous studies (dark blue circles) 1 3–5 18 19. Interactive gene networks were analysed using three networks (physical interactions, pathways and shared protein domains) of GeneMANIA (https://genemania.org/).20 GeneMANIA indicated the interaction among the 50 genes and 20 other genes, including 11 genes known to be associated with neurodevelopmental disorders (NDDs, with green circles) and non-neurological diseases (with purple circles). The remaining nine genes are potential candidates (albeit unproven) related to NDDs (with orange circles). 
The size of circles indicates the degree of interactive strength. We modified the output of GeneMANIA, manually drawing the connection line between USP8 and SHANK3, based on the report that the USP8 protein is a key deubiquitinating enzyme regulating the SHANK3 protein.41

R095 possessed a missense USP8 variant (NM_005154.4:c.3037C>T, p.Arg1013Cys). p.Arg1013 is conserved in the deubiquitinase catalytic domain of the USP8 protein (online supplementary figure S4a). Somatic mutations in the 14-3-3 protein binding motif of USP8 (highlighted in green) have been reported in tumours from patients with Cushing disease. The 14-3-3 proteins regulate the deubiquitinating activity of USP8. The somatic mutations in the 14-3-3 binding motif increase the proteolytic cleavage of USP8, induce the recycling of endocytosed epidermal growth factor receptor (EGFR) and substantial accumulation of EGFR on the plasma membrane and result in tumourigenesis through adrenocorticotropin synthesis.40 The USP8 protein has also recently been reported to be a key deubiquitinating enzyme regulating the SHANK3 protein, which is important in neurodevelopment.41 The crystal structure of the catalytic domain of USP8 (residues 734–1110) has been determined (online supplementary figure S4d).42 This structure exhibits an inactive state, where two loops, referred to as blocking loops 1 and 2 (BL1 and BL2), adopt a closed conformation to block the binding of the ubiquitinated substrate. These loops have been shown to recognise the substrate and establish a lid over the active sites, forming a cleft including the conserved catalytic cysteine in deubiquitinases. 
In the presence of the ubiquitinated substrates, this cleft is likely to accommodate the C-terminal region of ubiquitin by changing the conformation of these two loops, as has been seen in the structures of the ubiquitin-bound complex of a homologous deubiquitinase, USP2 (online supplementary figure S4e).43 Thus, it is proposed that the flexibility of these loops is important for the biological functions of USP8.42 Arg1013 is located immediately at the N-terminal side of one of these loops, BL1 (residues 1014–1023), and in the crystal structure it forms a network of hydrogen bonds with main-chain carbonyl oxygens of Leu1024, Thr1026 and Ala1101 (online supplementary figure S4f), the pattern of which is structurally conserved in USP2, USP12, USP21, USP30, USP35 and USP46 (PDB IDs 2HD5, 5K16, 3I3T, 5OHK, 5TXK and 5CVM, respectively). Therefore, replacing Arg1013 with a cysteine would disrupt the network of these hydrogen bonds, affecting the local structure and/or the flexibility around BL1, and the catalytic activity of the enzyme.

SN01 was found to have a missense variant (NM_001130020.1:c.2222G>A, p.Arg741Gln, online supplementary figure S4b) in ATP6V0A1 encoding a subunit of vacuolar ATPase (v-ATPase), which mediates pH homeostasis in eukaryotic intracellular organelles including neurons. c.2222G>A has already been found in two patients in the updated Deciphering Developmental Disorders Study.36 Furthermore, four missense variants in ATPase H+ transporting V1 subunit A (ATP6V1A), which is another v-ATPase subunit, have been reported to be causative of developmental encephalopathy with epilepsy.44 These findings support the pathogenicity of c.2222G>A.

R096 had compound heterozygous variants (c.3983A>G, p.Glu1328Gly and c.1399G>A, p.Val467Ile based on NM_006312.5, online supplementary figure S4c) in NCOR2 encoding a nuclear coreceptor with nuclear receptor corepressor 1 (NCoR1) protein. A heterozygous variant (c.1940C>T, p.Ser647Leu) in NCOR2 has been reported to be a candidate variant in RTT-like phenotypes,1 but no definitive conclusions on its pathogenicity can be drawn as it is present in gnomAD with unknown inheritance. A core function of MeCP2 is the recruitment of NCoR/NCoR2 corepressor to DNA; moreover, pathogenic mutations in MECP2 were found to decrease the interaction between MeCP2 and the NCoR1/NCoR2 complex.45 The pathogenicity of NCOR2 variants as determined by web-based prediction tools (PolyPhen2 HVAR, MutationTaster and CADD) and the extreme rarity of compound heterozygous NCOR2 variants (only one biallelic variant in 11 837 in-house WES sets of data, 0.0084%; online supplementary figure S5) strongly support their pathogenicity.

Gene ontology and interactive gene network analyses

The diversity of genes with pathogenic or likely pathogenic variants in this study indicates that RTT-like phenotypes are genetically heterogeneous and involve complex biological pathways. Notably, variants in WDR45, STXBP1 and SHANK3 were more common than those in FOXG1 and CDKL5, two established causative genes of atypical RTT. To delineate the biological pathways and interactive gene networks involved in RTT-like phenotypes, we collected 50 genes mutated in cases with RTT-like phenotypes (30 known and 4 novel genes in this study, and 16 genes reported in previous studies1 3–5 18 19) and assessed their functional properties. GO enrichment analysis of these 50 genes using GO term finder identified significant enrichment of various GO terms, such as ion transport, synaptic signalling and nervous system development, with Bonferroni-corrected p values of 9.17×10⁻¹², 1.27×10⁻¹¹ and 2.16×10⁻⁹, respectively (online supplementary table S6). Based on the GO terms assigned to each gene, we classified these 50 genes into four distinct biological groups: (1) synaptic signalling, including glutamate or gamma-aminobutyric acid (GABA) receptors and other synaptic signalling regulators (eg, GRIN2B and IRF2BPL); (2) ion transport, including sodium, potassium and calcium channels and proton transporters (eg, CACNA1G and ATP6V0A1); (3) nucleic acid binding for the regulation of transcription (eg, NR2F1 and NCOR2) and (4) others, including autophagy pathway, enzymatic function and matrix structure components (eg, WDR45 and PPT1) (figure 2C, online supplementary figure S6). Interactive gene network analysis of the 50 genes with GeneMANIA20 selected 20 additional interaction partner genes and allowed the biological pathways involved in RTT-like phenotypes to be delineated visually. Notably, 8 of these 20 newly selected genes are known to be involved in other NDDs, including eukaryotic translation initiation factor 2B subunit alpha (EIF2B1), eukaryotic translation initiation factor 2B subunit delta (EIF2B4) and structural maintenance of chromosomes 3 (SMC3) (online supplementary table S7).
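The enrichment statistic behind this kind of GO analysis can be sketched as follows. This is a minimal illustration, assuming a one-sided hypergeometric test followed by Bonferroni correction (multiplying each p value by the number of terms tested); it is not the authors' pipeline, and the function names, the background size and the term counts in the usage note are hypothetical placeholders rather than values from this study.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): probability of seeing at least k annotated genes when
    n genes are drawn from a background of N genes, K of which carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def bonferroni_enrichment(term_counts, N, n):
    """term_counts maps a GO term to (K, k):
    K = background genes annotated with the term,
    k = genes in the input list annotated with the term.
    Returns Bonferroni-corrected p values, capped at 1.0."""
    n_tests = len(term_counts)
    corrected = {}
    for term, (K, k) in term_counts.items():
        p = hypergeom_sf(k, N, K, n)
        corrected[term] = min(1.0, p * n_tests)
    return corrected
```

As a usage example with invented numbers, `bonferroni_enrichment({"ion transport": (1200, 18), "synaptic signalling": (700, 14)}, N=20000, n=50)` returns one corrected p value per term for a 50-gene input list; the more terms are tested, the stricter the correction becomes.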

Discussion

We performed WES on 77 probands who had tested negative for MECP2 in prescreening and found pathogenic or likely pathogenic SNVs/CNVs in 47 of 77 (61.0%) probands, including three MECP2 deletions. The three patients with MECP2 deletions had not undergone any previous MLPA analysis. In previous studies, CNVs were detected in 2 of 19 (10.5%)5 and 1 of 19 cases (5.3%),1 using a combination of WES and array comparative genomic hybridisation. In this study, deletions in genes other than MECP2 were found in 5 of 77 cases (6.5%), including small deletions (16 kb and 1 kb in size), suggesting that WES-based CNV analysis has some advantages in the detection of genomic imbalances. No pathogenic duplications were found, although benign inherited duplications were detected (data not shown). Regarding inheritance models, de novo variants were the major component (39/47, 83.0%), while recessive inheritance was rare (1/47, 2.1%). This may be useful information for genetic counselling. Notably, a maternally inherited variant in UBE3A was also found. Variants in imprinted genes may be pathogenic even if they are inherited from a healthy parent. Careful evaluation of inheritance is needed if imprinted genes are involved.
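The proportions quoted in this paragraph can be rechecked with a short calculation. The sketch below uses only the counts stated above; the split into CNV and SNV cases rests on the assumption that the eight deletion cases (three MECP2 and five others) are the only CNV diagnoses and that each diagnosed proband is counted once. The helper `pct` and the variable names are illustrative.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


probands = 77         # probands analysed by WES
diagnosed = 47        # probands with pathogenic or likely pathogenic SNVs/CNVs
mecp2_deletions = 3   # MECP2 deletions found by WES-based CNV analysis
other_deletions = 5   # deletions in genes other than MECP2
de_novo = 39          # diagnosed probands with de novo variants
recessive = 1         # diagnosed probands with recessive inheritance

# Assumption: all CNV diagnoses are deletions, and the remainder are SNV diagnoses.
cnv_cases = mecp2_deletions + other_deletions
snv_cases = diagnosed - cnv_cases

yields = {
    "overall": pct(diagnosed, probands),
    "cnv": pct(cnv_cases, probands),
    "snv": pct(snv_cases, probands),
    "non_mecp2_deletions": pct(other_deletions, probands),
    "de_novo_among_diagnosed": pct(de_novo, diagnosed),
    "recessive_among_diagnosed": pct(recessive, diagnosed),
}

for name, value in yields.items():
    print(f"{name}: {value}%")
```

Under that assumption, the CNV and SNV shares add up to the overall diagnostic yield, which is consistent with the breakdown given in the conclusion.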

In this study, all patients were assessed by experienced paediatric neurologists in accordance with the RTT main criteria.2 In the subgroup of patients without MECP2 prescreening, the MECP2-positive rate was high (80.0%), as expected, indicating that the clinical evaluation of the RTT criteria in our cohort had been reliable. Where additional or characteristic clinical features were known to be associated with the detected variants, clinicians confirmed that the patients' features were consistent with them. For example, inositol 1,4,5-trisphosphate receptor type 1 (ITPR1) and kinesin family member 1A (KIF1A), variants of which were found in R115 and R127, were indeed associated with mild cerebellar atrophy. As a result, our study included metabolic disorders (PPT1) and COL4A1-related brain angiopathy, which should be excluded by clinical evaluation. However, the patients with these variants showed broad and variable (atypical) clinical features. It is therefore difficult to exclude these diseases completely by clinical evaluation alone.

In the GO enrichment and interactive gene network analyses of 50 RTT-related genes, we found ion transport and synaptic signalling to be the major biological pathways involved in RTT-like phenotypes. Pathogenic variants in genes related to signal transduction or ion transport (channels and transporters) are known to cause DD/ID, epilepsy and other NDDs. As imbalances in the GABAergic and glutamatergic systems have previously been observed in the brains of Mecp2-deficient mice,46 it is reasonable to observe variants in these genes in RTT-like phenotypes. Similarly, it was expected that we would find pathogenic variants in genes encoding DNA-binding proteins, such as TCF4 and DNMT3A, whose abnormalities were originally found in different syndromes (eg, Pitt-Hopkins syndrome and Tatton-Brown-Rahman syndrome, respectively) that share symptoms with RTT. Another potentially interesting finding is that interactive gene network analysis picked up 20 other genes that interact with the 50 input genes. These 20 genes include eight genes mutated in NDDs and three mutated in non-neurological diseases (figure 2C and online supplementary figure S6). Given the high enrichment of known NDD genes among the newly identified interaction partner genes, the remaining nine genes are potential candidates for RTT/NDD genes (orange circles in figure 2C). Reanalysis of each of the nine new candidates showed their strong interactions with genes associated with RTT-like phenotypes and other NDDs (online supplementary figure S7). In particular, two paralogs of IRF2BPL (a gene mutated in our cohort), interferon regulatory factor 2 binding protein 2 (IRF2BP2) and interferon regulatory factor 2 binding protein 1 (IRF2BP1), exhibit physical interactions. Growth factor-independent 1 transcriptional repressor (GFI1) shares a common pathway with DNMT3A, and DNA methyltransferase 3-like (DNMT3L) physically interacts with it. ATPase H+ transporting V0 subunit A2 (ATP6V0A2) and ATP6V0A1 share protein domains (V-type ATPase 116 kDa subunit). SMG5, encoding nonsense-mediated mRNA decay factor (SMG5), is a paralog of SMG9, variants of which were found in an autosomal recessive multiple congenital anomaly syndrome.47 These genes, which strongly interact with genes mutated in our cohort or with known NDD genes, could be good candidates for new RTT/NDD genes.

In conclusion, we achieved a genetic diagnosis rate of 61.0% (SNVs 50.6% and CNVs 10.4%) by WES, suggesting WES as the first choice for diagnosing RTT-like phenotypes. Moreover, we provided additional evidence of the involvement of CACNA1D and CUX2 variants in RTT-like phenotypes, as well as novel evidence for the possible involvement of MAST3, ATP6V0A1, USP8 and NCOR2 variants. Furthermore, interactive gene network analysis of 50 genes related to RTT-like phenotypes predicted several potential candidates for NDDs and RTT-like phenotypes. These findings contribute to a more comprehensive understanding of the causative genes of RTT-like phenotypes. Further studies of variants, together with supporting functional evidence, should lead to a better understanding of RTT-like phenotypes.

Acknowledgments

The authors would like to thank the individuals and their families for their participation in this study. The authors would like to thank Nobuko Watanabe and Mai Sato for technical assistance. The authors would also like to thank Edanz Group (www.edanzediting.com/ac) for editing a draft of this manuscript.

Footnotes

  • Contributors KaI and NaM designed the analyses, collected and interpreted the data, and wrote the manuscript. KaI and AT performed the bioinformatic analysis. ToSen and KO performed structural analyses. AF performed targeted amplicon sequencing. ET, SY, KeI, CW, YNa, SW and YG performed the prescreening of patients before their enrolment in this study. ET, EN, TO, YI, YNo, IK, KS, TaSai, MS, KY, ToSai, NO, ST, MA, IT, SK, YA, KH, AFV, NS, MO, MaM, KT, TN, TaSak, SN, MuM, AY and ToM performed clinical evaluation of patients. TaM, CO, SMit, SMiy, AT, NoM, SI and HS interpreted the results and critically reviewed the manuscript. All authors reviewed and approved the manuscript.

  • Funding This work was supported by the Japan Agency for Medical Research and Development (AMED) under grant numbers JP18ek0109280, JP18dm0107090, JP18ek0109301, JP18ek0109348 and JP18kk020500; by JSPS KAKENHI under grant numbers JP17H01539, JP16H05357, JP17K10080 and JP17K15630; and by the Ministry of Health, Labour and Welfare and the Takeda Science Foundation.

  • Competing interests None declared.

  • Ethics approval This study was approved by the Institutional Review Board of Yokohama City University of Medicine.

  • Provenance and peer review Not commissioned; internally peer reviewed.

  • Patient consent for publication Obtained.