Short report
Long-read sequencing identified intronic repeat expansions in SAMD12 from Chinese pedigrees affected with familial cortical myoclonic tremor with epilepsy
  1. Sheng Zeng1,
  2. Mei-yun Zhang2,
  3. Xue-jing Wang3,4,
  4. Zheng-mao Hu5,
  5. Jin-chen Li5,6,
  6. Nan Li1,
  7. Jun-ling Wang1,5,7,
  8. Fan Liang8,
  9. Qi Yang8,
  10. Qian Liu9,10,
  11. Li Fang9,10,
  12. Jun-wei Hao11,
  13. Fu-dong Shi11,
  14. Xue-bing Ding3,4,
  15. Jun-fang Teng3,4,
  16. Xiao-meng Yin1,4,
  17. Hong Jiang1,5,6,7,
  18. Wei-ping Liao12,
  19. Jing-yu Liu13,
  20. Kai Wang9,10,
  21. Kun Xia5,
  22. Bei-sha Tang1,5,6,7,14,15,16
  1. 1 Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
  2. 2 Department of Neurology, Tianjin Union Medical Center, Tianjin, China
  3. 3 Department of Neurology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  4. 4 Institute of Parkinson and Movement Disorder, Zhengzhou University, Zhengzhou, Henan, China
  5. 5 Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, China
  6. 6 National Clinical Research Center for Geriatric Disorders, Central South University, Changsha, Hunan, China
  7. 7 Key Laboratory of Hunan Province in Neurodegenerative Disorders, Central South University, Changsha, Hunan, China
  8. 8 GrandOmics Biosciences, Beijing, China
  9. 9 Raymond G Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
  10. 10 Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
  11. 11 Department of Neurology, General Hospital, Tianjin Medical University, Tianjin, China
  12. 12 Institute of Neuroscience, Department of Neurology of The Second Affiliated Hospital of Guangzhou Medical University, Key Laboratory of Neurogenetics and Channelopathies of Guangdong Province and Ministry of Education of China, Guangzhou Medical University, Guangzhou, China
  13. 13 Key Laboratory of Molecular Biophysics of the Ministry of Education, College of Life Science and Technology and Center for Human Genome Research, Huazhong University of Science and Technology, Wuhan, China
  14. 14 Parkinson’s Disease Center of Beijing Institute for Brain Disorders, Beijing, China
  15. 15 Collaborative Innovation Center for Brain Science and China, Shanghai, China
  16. 16 Collaborative Innovation Center for Genetics and Development, Shanghai, China
  1. Correspondence to Prof. Kai Wang, Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; wangk{at}email.chop.edu, Prof. Kun Xia, Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410078, China; xiakun48{at}163.com and Dr Bei-sha Tang, Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan 410008, China; bstang7398{at}163.com

Abstract

Background The locus for familial cortical myoclonic tremor with epilepsy (FCMTE) has long been mapped to 8q24 in linkage studies, but the causative mutations remain unclear. Recently, expansions of intronic TTTCA and TTTTA repeat motifs within SAMD12 were found to be involved in the pathogenesis of FCMTE in Japanese pedigrees. We aim to identify the causative mutations of FCMTE in Chinese pedigrees.

Methods We performed genetic linkage analysis by microsatellite markers in a five-generation Chinese pedigree with 55 members. We also used array-comparative genomic hybridisation (CGH) and next-generation sequencing (NGS) technologies (whole-exome sequencing, capture region deep sequencing and whole-genome sequencing) to identify the causative mutations in the disease locus. Recently, we used low-coverage (~10×) long-read genome sequencing (LRS) on the PacBio Sequel and Oxford Nanopore platforms to identify the causative mutations, and used repeat-primed PCR for validation of the repeat expansions.

Results Linkage analysis mapped the disease locus to 8q23.3–24.23. Array-CGH and NGS failed to identify causative mutations in this locus. LRS identified the intronic TTTCA and TTTTA repeat expansions in SAMD12 as the causative mutations, thus corroborating the recently published results in Japanese pedigrees.

Conclusions We identified the pentanucleotide repeat expansion in SAMD12 as the causative mutation in Chinese FCMTE pedigrees. Our study also suggested that LRS is an effective tool for molecular diagnosis of genetic disorders, especially for neurological diseases that cannot be positively diagnosed by conventional clinical microarray and NGS technologies.

  • samd12
  • familial cortical myoclonic tremor with epilepsy
  • long read sequencing
  • repeat expansion

Introduction

Familial cortical myoclonic tremor with epilepsy (FCMTE) is an autosomal dominant disease mainly characterised by adult-onset cortical myoclonus and infrequent epilepsy.1 Since the first case was described in 1985, additional pedigrees with slightly different phenotypes have been reported worldwide.2–6 In China, nearly 20 FCMTE pedigrees have been reported. Genetic linkage studies over the past few decades have proposed multiple disease loci: 8q24 (FCMTE1, %601068),7 2q11 (FCMTE2, #607876),2 5p15 (FCMTE3, %613608),4 3q26.32-q28 (FCMTE4, %615127)6 and 1q32 (FCMTE5, #615400).5 In particular, the ADRA2B, CTNND2 and CNTN2 genes were reported to be the disease-causing genes for FCMTE2, FCMTE3 and FCMTE5, respectively.5 8 9 However, no causative mutation had been reported for FCMTE1, although the disease locus had been mapped to 8q24 in genetic linkage studies. Recently, expansions of intronic TTTCA and TTTTA repeat motifs were found to be involved in the pathogenesis of FCMTE1 in Japanese pedigrees.10 A more recent report also identified this causative mutation in FCMTE1 pedigrees of another ethnic background.11

Our study aims to examine the genetic aetiology of FCMTE1 in Chinese pedigrees. Over the past few decades, we have used a number of genomic technologies, including array-comparative genomic hybridisation (CGH) and next-generation whole-exome, region capture and whole-genome sequencing (WGS), but failed to identify likely pathogenic mutations under the linkage peak. More recently, we used long-read genome sequencing (LRS) on the PacBio Sequel platform and the Oxford Nanopore platform, and identified pentanucleotide repeat expansions (TTTCA/TTTTA) within SAMD12 as a major genetic aetiological factor in FCMTE pathogenesis in a different ethnic background.

Materials and methods

Materials

The family was a five-generation Chinese pedigree with 55 members showing autosomal dominant inheritance (figure 1A, a). All available affected individuals underwent a medical history review and a thorough neurological examination by two or more experienced neurologists. Because the clinical histories of the first few generations were recalled by offspring and were potentially biased, clinical anticipation was analysed only between members of the fourth and fifth generations. All affected members took the Wechsler Adult Intelligence Scale (WAIS) test. Blood tests, chromosome karyotype analysis of peripheral lymphocytes, brain MRI, somatosensory evoked potentials (SEPs), surface electromyogram (sEMG), interval EEG and brain electrical activity mapping (BEAM) were performed for the proband and four affected individuals (IV:11, IV:14, IV:17 and V:17). A total of 24 subjects were included in the linkage study. Two individuals (IV:10 and V:17) were used for high-resolution microarray-based CGH. Two individuals (IV:5 and IV:17) were used for whole-exome sequencing (WES). Three individuals (V:12, V:13 and IV:5) were used for target-region capture and deep sequencing (TRS). Six individuals (III:4, IV:14, IV:17, IV:18, IV:19 and V:19) were used for WGS. One individual (IV:17) was subjected to LRS on the PacBio Sequel and Oxford Nanopore platforms. Repeat-primed PCR (RP-PCR) was used to test all family members for the repeat expansion. Another two pedigrees from our collaborative teams were also included in this study, and three patients from these pedigrees were used for mutation analysis (figure 1A, b/c). DNA was extracted from blood stored in EDTA tubes using the salting-out procedure. All participants and their family members were fully informed of the goal of the study. The workflow of our search for causative mutations is shown in figure 1B.

Figure 1

Pedigrees and methodological workflow. (A) Pedigrees of the families included in our analysis. Family ‘a’ represents the original pedigree used for linkage analysis and mutation detection. A red solid triangle represents the sample used for mutation confirmation in another two families (b and c). (B) Methodological workflow of searching causative mutation: chromosome karyotype analysis and array-CGH were used to detect causative structural variations and negative results were obtained; genotyping and linkage analysis mapped causative gene to chromosome 8; WES, WGS and TRS represent NGS-related technologies to detect causative variations. WES was the first tier to screen SNVs and InDels in the coding region, TRS detected exonic rearrangements, SNVs and InDels in the coding region, and WGS screened possible CNVs in addition to SNVs and InDels of genome. Negative results were obtained in all the NGS-related technologies. Finally, LRS was performed to detect complex structural variations focusing on expansion of STR and causative mutation was identified. Additional methods (Sanger sequencing, QPCR, repeat-primed PCR and so on) were used for validation of the repeat expansion. CGH, comparative genomic hybridisation; InDels, insertions and deletions; LRS, long-read genome sequencing; NGS, next-generation sequencing; SNV, single nucleotide variant; STR, short tandem repeat; TRS, target-region capture and deep sequencing; WES, whole-exome sequencing; WGS, whole-genome sequencing.

An overview of methods in our previous study

First, linkage analysis was performed on our family using Linkage Designer (V.1.0) and MLINK of the Linkage Analysis Package (V.5.1). Next, the Agilent SurePrint G3 Human CGH Microarray, with more than 1 million oligonucleotide probes and comprehensive probe coverage (Agilent Technologies, Santa Clara, California), was used to detect CNVs. Several next-generation sequencing (NGS) and analysis strategies were then used to search for potential causal single nucleotide variants (SNVs), InDels (insertions and deletions), exonic rearrangements and structural variations (SVs) in the mapped region. WES, TRS and WGS were performed successively, and sequencing data passing quality control (QC) were processed and analysed with a computational pipeline following the standard workflow. Briefly, Burrows-Wheeler Aligner Maximal Exact Matches (BWA-MEM) was used for read mapping, GATK was used for SNV/InDel calling, DELLY and LUMPY were used for SV calling, ANNOVAR12 was used for variant annotation, and we manually examined all potential disease variants in Integrative Genomics Viewer. Additional detailed analytical methods are described in online supplementary methods.

Long-read genome sequencing

Since no disease-causing variant was found with conventional short-read sequencing (SRS), we hypothesised that the disease may be caused by complex SVs. Therefore, low-coverage LRS was performed for an affected individual (IV:17) on the PacBio Sequel platform. Mapping-based and assembly-based methods were conducted for regular SV calling. For the specific detection of short tandem repeat (STR) expansion, we developed an analytical strategy combining both RepeatHMM (https://github.com/WGLab/RepeatHMM.git) and inScan (https://github.com/Nextomics/inScan) in the target region. Specifically, RepeatHMM is a validated algorithm to detect repeat counts of microsatellites from long-read sequencing data. inScan is a tool for finding genomic insertion variation from long reads, specifically checking if there were insertions for a given region.

To further analyse the detailed repeat structure of the disease-causing variant in our patient, we used the Oxford Nanopore platform to sequence the same individual (IV:17). We collected reads that mapped to sequences flanking the expanded repeats and detected STR expansion using the same combined analytical strategy described above. The detailed analysis process is described in online supplementary methods.

Other methods

Other methods including RP-PCR used to validate candidate variants are described in online supplementary methods.

Results

Clinical characteristics of individuals affected with FCMTE

All affected individuals manifested typical symptoms of FCMTE. Seven affected individuals presented only tremor symptoms, while the remaining 17 patients manifested tremor followed by infrequent epilepsy. Tremor began in adulthood, with an average age of onset of 27.81±7.17 years in all affected members. Except for two affected members (IV:4 and IV:14), all patients had a non-progressive disease course. Epilepsy also began in adulthood, with an average age of onset of 35.07±10.62 years, and a generalised tonic-clonic seizure type was observed in all affected members. Clinical anticipation for tremor occurred in all mother–child pairs and two of six father–child pairs, with an average anticipation of 15 years in the former and 6 years in the latter. A similar phenomenon was noticed in individuals with epilepsy (online supplementary table 1). Thus, obvious anticipation, more pronounced in maternal transmission, was observed. The WAIS test indicated no abnormalities in any affected individual. Drawing tests of Archimedes’ spiral and ladder showed mild to severe irregularity. The results of the chromosome karyotype analysis of the peripheral lymphocytes were normal. Brain MRI showed no obvious abnormality. Recorded SEPs indicated enhanced cortical components on stimulation of the median nerve at both wrists. sEMG of both wrists showed irregular, high-frequency, short bursts. Interval EEG and BEAM showed epileptiform discharges. These results suggested that the tremor was of cortical origin. Together, the clinical features above supported a definitive diagnosis of FCMTE according to the diagnostic criteria proposed by van Rootselaar et al.13 Detailed clinical information and representative results of clinical examinations of the family are summarised in online supplementary table 2 and figure 1.
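As a minimal sketch of how anticipation can be quantified, the Python snippet below takes the mean difference in age at onset between parent and child. The function name and the ages are hypothetical placeholders for illustration only; they are not values from this pedigree, whose data are given in online supplementary table 1.

```python
from statistics import mean


def mean_anticipation(pairs):
    """Mean of (parent onset age - child onset age) over all pairs, in years.

    A positive value means that, on average, onset was earlier in the child.
    """
    differences = [parent_onset - child_onset for parent_onset, child_onset in pairs]
    return mean(differences)


# Hypothetical (parent_onset, child_onset) ages in years, for illustration only.
example_pairs = [(40, 30), (38, 27), (36, 27)]
print(mean_anticipation(example_pairs))
```

Computing this separately for mother–child and father–child pairs is one simple way to compare the degree of anticipation by parental sex.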

Previous genetic analysis failed to detect causal mutation in this family

By linkage analysis, two-point logarithm of the odds (LOD) scores were obtained for all microsatellites on chromosome 8 (online supplementary table 3). Haplotype analysis showed that a haplotype shared by all affected individuals cosegregated with the phenotype (online supplementary figure 2). As a result, we mapped the disease locus of the family to chromosome 8q23.3–24.23, spanning 30.5 cM between D8S555 and D8S1753. This locus contained the causative region reported by Mori et al.7 However, array-CGH detected no structural variants in this region (online supplementary figure 3). Furthermore, NGS technologies including WES, TRS and WGS failed to identify any potential disease-causing mutations with standard variant filtering and analysis pipelines. We also analysed structural variants in WGS data, but again the results were negative in the linkage region.
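For readers unfamiliar with two-point LOD scores, the following sketch shows the textbook calculation for the simplest case of phase-known, fully informative meioses. It is an illustrative simplification, not the MLINK computation used in this study, which evaluates likelihoods over the whole pedigree under a specified disease model. The function name and the example counts are assumptions made for illustration.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n ),
    where r is the number of recombinants among n meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if theta == 0.0:
        # Any recombinant is impossible under complete linkage.
        if recombinants > 0:
            return -math.inf
        return meioses * math.log10(2)
    log_likelihood_linked = (recombinants * math.log10(theta)
                             + non_recombinants * math.log10(1 - theta))
    log_likelihood_unlinked = meioses * math.log10(0.5)
    return log_likelihood_linked - log_likelihood_unlinked


# Hypothetical example: 10 informative meioses with no recombinants.
print(round(lod_score(0, 10, 0.0), 2))
```

A LOD score of 3 or more is the conventional threshold for declaring linkage.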

LRS successfully detected the causal mutation

We initially performed WGS with ~10× coverage on the PacBio Sequel platform. The average read length was >10 kb. Mapping-based and assembly-based methods were used to call SVs on a genome-wide scale. Candidate SVs were obtained for manual examination and further validation. However, no variant was identified or validated as a likely causative mutation. Therefore, we next focused on possible STR expansions within the linkage region. STR variations in the target region were detected by RepeatHMM, and the calls were filtered by our customised tool inScan. RepeatHMM detected 42 STR regions that had at least one read with an STR expansion of 50 bp or more. inScan detected insertions in 34 of these 42 STR regions. After sorting by median insertion size, we noticed that one STR region (TAAAA, chr8: 119379051–119379157, 21 repeat units) had the largest median insertion size (3792 bp), with one supporting read (figure 2A, LR1). We then manually inspected the reads covering this region. The characteristic repeat expansion region was covered by two of the six reads. One read had a long clipped tail containing the expanded repeat units (figure 2A, LR2). Two roughly recognisable forms of repeat expansion were ‘AAAAT’ (corresponding to the TTTTA repeat in intron 4 of the SAMD12 gene) and an upstream ‘inserted’ TGAAA (corresponding to the downstream TTTCA in the direction of transcription of the SAMD12 gene) (online supplementary figure 4). The detailed repeat structure could not be inferred confidently because of the relatively high error rates in sequence data from the PacBio platform (online supplementary figure 4).
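The filtering and ranking logic described above can be sketched as follows. This is a simplified illustration under stated assumptions: the dictionaries stand in for parsed RepeatHMM and inScan results and do not reflect either tool's actual output format, the function name is hypothetical, and all values except the chr8 region coordinates and its 3792 bp insertion are invented for the example.

```python
from statistics import median


def rank_str_candidates(expansions, insertions, min_expansion_bp=50):
    """Keep STR regions with an expanded read and an insertion call, ranked by size.

    expansions: region -> per-read expansion sizes in bp (stand-in for RepeatHMM calls)
    insertions: region -> insertion sizes in bp (stand-in for inScan calls)
    Returns (region, median insertion size, supporting reads), largest median first.
    """
    ranked = []
    for region, read_expansions in expansions.items():
        # Step 1: require at least one read expanded by min_expansion_bp or more.
        if not any(size >= min_expansion_bp for size in read_expansions):
            continue
        # Step 2: require an independent insertion call in the same region.
        insertion_sizes = insertions.get(region, [])
        if not insertion_sizes:
            continue
        ranked.append((region, median(insertion_sizes), len(insertion_sizes)))
    # Step 3: sort by median insertion size, largest first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical inputs; only the chr8 region and its 3792 bp insertion come from the text.
expansions = {
    "chr8:119379051-119379157": [3800, 10],
    "hypothetical_region_A": [60, 55],
    "hypothetical_region_B": [20, 5],    # no read reaches the 50 bp threshold
    "hypothetical_region_C": [120],      # expanded, but no insertion call
}
insertions = {
    "chr8:119379051-119379157": [3792],
    "hypothetical_region_A": [58, 62],
    "hypothetical_region_B": [400],
}

for region, median_size, n_reads in rank_str_candidates(expansions, insertions):
    print(region, median_size, n_reads)
```

Requiring agreement between two independent callers before ranking is what reduces the candidate list, at the cost of missing expansions that only one tool detects.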

Figure 2

Illustration of the sequencing reads and validation of repeat expansions. (A) Reads covering the known short tandem repeat region. Six reads covering the causative region could be seen in Integrative Genomics Viewer. LR1 and LR2 were the reads that carried the expansion: LR1 carried a large insertion (chr8: 119379041–119379052), was identified by our analytical strategy and is shown in red; LR2 has a long clipped tail containing the expanded repeat units and is shown in blue. (B) Representative results of repeat-primed PCR analysis showing the TTTTA repeat expansion (a) and the TTTCA repeat expansion (c) in a patient and the negative results in a control (b and d). (C) Schematic representation of the causal mutation in SAMD12: a small number of TTTTA repeats exist in intron 4 of SAMD12 in healthy individuals, whereas a large expanded TTTTA repeat and an ‘inserted’ expanded TTTCA repeat are present in affected individuals (shown in red).

We also performed WGS using the Oxford Nanopore platform. Two reads were found to span the expanded repeat region through analysing the alignment data. The structure of repeat expansion could be clearly recognised. However, variation of repeat length was observed in our data, which was also described in the Japanese study,10 suggesting the possible existence of additional somatic expansion (online supplementary figure 4). Nevertheless, the pattern of the expanded repeat was clearly recognised as ‘(TTTCA)exp(TTTTA)exp’, which is consistent with the reported configuration in the Japanese study.10
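The correspondence between the motifs seen in reference-strand reads (TAAAA and TGAAA) and the motifs named in the direction of SAMD12 transcription (TTTTA and TTTCA) is a reverse-complement relationship. The sketch below illustrates this and counts consecutive exact copies of a motif. It is a toy example with an invented sequence and hypothetical function names; exact matching would undercount repeats in real long reads, which contain sequencing errors.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an upper-case DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def longest_run(sequence, motif):
    """Largest number of consecutive exact copies of motif found in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Motifs on the reference strand map to the transcribed-strand motifs.
print(reverse_complement("TGAAA"))  # TTTCA
print(reverse_complement("TAAAA"))  # TTTTA

# Invented toy read on the reference strand, for illustration only.
toy_read = "GC" + "TGAAA" * 3 + "TAAAA" * 5 + "GC"
transcribed = reverse_complement(toy_read)
print(longest_run(transcribed, "TTTTA"), longest_run(transcribed, "TTTCA"))
```

Because AAAAT is a rotation of TAAAA, both describe the same tandem repeat read from a different starting base.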

Variant verification

RP-PCR was used for detection of repeat expansion, and expansions of the pentanucleotide repeats ‘TTTCA’ and ‘TTTTA’ presented a typical sawtooth pattern (figure 2B,C). The results indicated that the TTTTA and ‘inserted’ TTTCA repeat expansions in SAMD12 were completely cosegregated with the FCMTE phenotype in our family. Subsequently, this mutation was also identified in three cases of two additional Chinese families from other institutions, further suggesting the causative role of the repeat expansions in FCMTE.

Discussion

FCMTE is a rare genetic condition that is reported worldwide but with a high prevalence in Japan.14 Clinically, it was a disease recognised early and characterised by cortical myoclonic tremor and infrequent epilepsy. The disease usually presents a benign course, although more severe seizures or intractable epilepsy were reported mainly in European patients.2 15 In addition to cortical myoclonic tremor and epilepsy, cognitive decline, mental retardation and migraine were also noticed in some pedigrees.2 3 15 Giant SEPs and long-latency cortical reflex detected in EMG were of diagnostic value in FCMTE. These findings indicated that it was a well-defined epilepsy syndrome with certain clinical heterogeneity. Our study confirmed the previous genetic findings from Japan and identified the pentanucleotide repeat expansion as the causative mutations for the disease.

Diagnostic criteria for the disease were proposed in 2005,13 and the typical clinical features of our families meet these criteria well. Notably, clinical anticipation, which was first noticed by Ikeda et al 16 and described in many other reports thereafter, exists in this disease.17 18 Clinical anticipation is well documented in spinocerebellar ataxias (SCAs) caused by CAG repeat expansion, in which there is an inverse correlation between the number of CAG repeats and the age of onset. In our pedigree, obvious anticipation, more pronounced in maternal transmission, was also observed. This clinical finding suggested that STR expansion should be considered in this disease.

The long search for the causative mutation in the 8q24 locus suggested the complexity of the disease mechanism. In our previous work, we failed to identify any causative mutation in the linkage interval using multiple forms of NGS or microarray technologies. In addition to our findings, Sanger sequencing of all coding genes in the disease locus by Mori et al 7 also failed to find any causative mutation. This indicated that NGS was insufficient to resolve the challenge of identifying the causative gene in FCMTE1. As an alternative strategy, we used whole-genome low-coverage LRS to search for potential causative SVs. However, no structural variant was detected as the likely causative mutation in our initial analysis. The report of a causative intronic repeat expansion in FCMTE10 prompted us to examine STRs in the candidate linkage region in more detail. RepeatHMM is an effective and efficient algorithm for finding repeat expansions, but it may generate false positives in LRS data.19 inScan is custom-developed software that can be more powerful at detecting and filtering insertions. Combining these two complementary algorithms can therefore detect repeat expansions in STRs more effectively. Using this strategy, we found the characteristic repeat expansion, including ‘TTTTA’ and the ‘inserted’ TTTCA, in our LRS data. RP-PCR confirmed the presence of this mutation and showed that it cosegregated with the FCMTE phenotype in our family. Finally, this repeat expansion was also identified in three cases from two additional Chinese families affected with FCMTE, supporting its role in the disease pathogenesis.

The genetic diagnostic rate of Mendelian disorders is still relatively low, even with recent improvement in NGS technologies.20 It is believed that this is partially due to the presence of many long repetitive elements, copy number alterations and SVs that were relevant to the disease yet failed to be detected by conventional sequencing approaches.21 Although differentiating many of these complex elements was beyond the capacity of SRS-related technologies, the complementary strengths make LRS a promising approach. Recently, Liu et al 19 developed a novel algorithm to effectively detect repeat counts of microsatellites from LRS data. Causative SV was also identified in a patient using LRS.22 Furthermore, two causative repeat configurations in SAMD12 were accurately verified by Nanopore LRS.10 In our study, we developed a strategy combining RepeatHMM and inScan to detect this causative repeat expansion through LRS. These pieces of evidence indicated that LRS might be a powerful tool for genetic diagnosis of human diseases when conventional NGS failed to yield a positive diagnosis.

In summary, our study represents another documentation of intronic repeat expansion in SAMD12 in Chinese pedigrees affected with FCMTE, which further confirmed that this mutation causes FCMTE. Our study also suggested that LRS is an effective tool for molecular diagnosis of genetic disorders, especially for neurological diseases that cannot be diagnosed by conventional clinical microarray and NGS technologies.

Acknowledgments

We are indebted to all the patients and family members for their generous participation in this study.

References

  1. 1.
  2. 2.
  3. 3.
  4. 4.
  5. 5.
  6. 6.
  7. 7.
  8. 8.
  9. 9.
  10. 10.
  11. 11.
  12. 12.
  13. 13.
  14. 14.
  15. 15.
  16. 16.
  17. 17.
  18. 18.
  19. 19.
  20. 20.
  21. 21.
  22. 22.

Footnotes

  • KW, KX and B-sT contributed equally.

  • Contributors SZ performed the research, analysed the data and wrote the paper. M-yZ, J-wH, F-dS, X-jW, X-bD, J-fT, NL and X-mY provided the samples and assisted in clinical follow-up. Z-mH, J-cL, J-lW, FL, QY, QL and LF assisted in bioinformatics analysis. HJ, W-pL and J-yL supervised the study. KW, KX and B-sT designed the research, and provided financial and administrative support.

  • Funding This work was supported by the Program of National Natural Science Foundation of China (#81130021, #81430023 and #81300980).

  • Competing interests FL and QY are employees and KW is consultant of GrandOmics Biosciences.

  • Patient consent Obtained.

  • Ethics approval The study was approved by the Ethical Committee of Xiangya Hospital of Central South University in China (equivalent to an institutional review board).

  • Provenance and peer review Not commissioned; externally peer reviewed.
