Background Apparently balanced chromosomal rearrangements (ABCR) are associated with an abnormal phenotype in 6% of cases. This may be due to cryptic genomic imbalances or to the disruption of genes at the breakpoint. However, breakpoint cloning using conventional methods (ie, fluorescent in situ hybridisation (FISH), Southern blot) is often laborious and time consuming. In this work, we used next generation sequencing (NGS) to locate breakpoints at the molecular level in four patients with multiple congenital abnormalities and/or intellectual deficiency (MCA/ID) who were carrying ABCR (one translocation, one complex chromosomal rearrangement and two inversions), which corresponded to nine breakpoints.
Methods Genomic imbalance was previously excluded by array comparative genomic hybridisation (CGH) in all four patients. Whole genome paired-end protocol was used to identify breakpoints. The results were verified by FISH and by PCR with Sanger sequencing.
Results We were able to map all nine breakpoints. NGS revealed an additional breakpoint due to a cryptic inversion at a breakpoint junction in one patient. Nine of 10 breakpoints occurred in repetitive elements and five genes were disrupted in their intronic sequence (TCF4, SHANK2, PPFIA1, RAB19, KCNQ1).
Conclusions NGS is a powerful tool allowing rapid breakpoint cloning of ABCR at the molecular level. We showed that in three out of four patients, gene disruption could account for the phenotype, allowing adapted genetic counselling and stopping unnecessary investigations. We propose that patients carrying ABCR with an abnormal phenotype should be explored systematically by NGS once a genomic imbalance has been excluded by array CGH.
- Next Generation Sequencing
- Chromosome rearrangement
- Intellectual deficiency
Apparently balanced chromosomal rearrangements (ABCR) occur in 1.54‰ of live births1 and usually have no phenotypic consequence for the carrier. However, in some cases they may be associated with an abnormal phenotype, that is, multiple congenital abnormalities and/or intellectual deficiency (MCA/ID). This risk has been estimated at 6% for apparently balanced translocations and inversions2 and increases with the number of breakpoints, as in the case of complex chromosomal rearrangements (CCR, rearrangements involving three or more breakpoints).3 Systematic studies of ABCR with abnormal phenotype by array comparative genomic hybridisation (CGH) showed that the phenotype might occur due to genomic imbalances near or far from breakpoints.4–6 A recent meta-analysis estimated that 37% of two-breakpoint rearrangements and 90% of CCR were unbalanced.7 In other cases, the breakpoints are thought to disturb gene expression either by gene disruption,8,9 position effect10–12 or disturbance of parental imprinting.13 However, these hypotheses are rarely investigated since standard methods for breakpoint cloning such as fluorescent in situ hybridisation (FISH) with bacterial artificial chromosome (BAC) clones, Southern blot, inverse-PCR or long range PCR are laborious, time consuming, and may not be precise enough.14 Without breakpoint cloning, it is not possible to conclude whether the phenotype is due to the ABCR or is coincidental. Genetic counselling is therefore difficult or may be inappropriate.
Recently, next generation sequencing (NGS) technology was applied successfully to characterise genome structural variations.15 It was then applied to characterise translocation and inversion breakpoints in patients with abnormal phenotypes. Several teams, using different DNA preparation protocols and sequencing chemistries, demonstrated that NGS could map breakpoints in such rearrangements at the base pair level. This provided information about their mechanisms and allowed the identification of candidate genes disrupted at the breakpoint.16–21
In this study, we used paired-end whole genome sequencing to characterise the breakpoints in four patients carrying ABCR with abnormal phenotypes, in whom previous array CGH had not shown any genomic imbalance. We mapped 10 breakpoints, five of which disrupted a known gene. Among these, three could be directly related to the patient's phenotype, providing helpful information for genetic counselling.
Patients and methods
Patients: clinical report
The patients carried ABCR associated with MCA/ID. They were part of a previous study that showed no genomic imbalance by 244 K oligonucleotide array CGH (Agilent Technologies, Santa Clara, California, USA).6 Informed consent was obtained before the study. The clinical phenotype and cytogenetic findings are detailed below.
Patient 1 (case 2 in Schluth-Bolard et al6)
The patient was a 15-month-old girl, the second child of unrelated parents with no family history. She was born at 40 weeks gestation by caesarean section after a premature rupture of the membrane. Birth parameters were normal: birth weight 3480 g (median), birth length 51 cm (median), and occipito-frontal circumference (OFC) 33 cm (−1 SD). She developed postnatal microcephaly (OFC −3 SD) with normal growth (weight and length on median), hypotonia and severe developmental delay (sitting at 13 months, babbling at 14 months). Physical examination revealed only a right single palmar crease. Screening for mutation in the MECP2 gene was negative. Standard blood karyotype showed a de novo apparently balanced translocation between chromosomes 1 and 18: 46,XX,t(1;18)(p36;q21)dn.
Patient 2 (case 41 in Schluth-Bolard et al6)
The patient was a 3.5-year-old boy, the only child of unrelated parents with no family history. He was born at term with normal birth parameters. He presented with moderate motor delay (walking at 19 months) and a marked speech delay (first words at 3 years). At the age of 3.5 years, clinical examination revealed facial dysmorphism (large forehead, bilateral epicanthic folds, enophthalmia, anteverted nostrils, micrognathia), clinodactyly of the fifth fingers, normal growth parameters (height 1 m, +1.5 SD; weight 18 kg, +2.5 SD; OFC 53 cm, +2.5 SD). Behavioural problems were noted, including hypersalivation, intolerance to frustration, hyperactivity, and autistic behaviour. Standard blood karyotype revealed a de novo CCR involving chromosomes 1, 7 and 11: 46,XY,t(1;7;11)(p35;q33;q12)dn.
Patient 3 (patient 32 in Schluth-Bolard et al6)
The patient was a 3-year-old girl, the second child of unrelated parents. She was born at term after a normal pregnancy. She was referred for speech delay, learning difficulties concerning handwriting and space orientation, and obstructive behaviour. Clinical examination was normal. An electroencephalogram (EEG) showed subclinical epilepsy and the patient was treated with valproic acid and methylphenidate. FRAXA CGG expansion screening was negative. Standard blood karyotype showed a paracentric inversion of the long arm of chromosome 8: 46,XX,inv(8)(q21q24.2)pat. This rearrangement was inherited from her father, who presented a less severe phenotype with learning difficulties but no epilepsy during childhood. Karyotypes from the first healthy child, paternal grandmother and paternal grandfather were normal. No other family member presented with a similar phenotype.
Patient 4 (patient 30 in Schluth-Bolard et al6)
The proband was a male fetus. It was the first pregnancy of a 26-year-old healthy woman with no family history. The pregnancy was terminated at 15 weeks and 6 days of gestation because of a severe polymalformative syndrome. Pathological examination documented a male fetus with a thick neck, a large fontanelle, low-set ears, hypertelorism, posterior palatal cleft, short humerus and femurs, omphalocele, urethral atresia, macrogonadism, and placentomegaly. Histological examination revealed adrenocortical cytomegaly, Leydig cells hyperplasia, and placental mesenchymal dysplasia. These signs were in favour of a severe form of Beckwith–Wiedemann syndrome (BWS). Methylation studies of both imprinting centres of the 11p15 region (IC1 and IC2) were normal, and no mutation of CDKN1C was detected. The fetal karyotype revealed a pericentric inversion of chromosome 11: 46,XY,inv(11)(p15q13)mat, that was inherited from the mother. During the second pregnancy, fetal sonography revealed the same severe phenotype. A female baby was born preterm at 30 weeks’ gestation and died at 7 days of life. She had also inherited the same inversion from the mother.
Next generation sequencing
Breakpoint detection was based on whole genome sequencing with paired-end protocol and specific bio-informatic analysis (Integragen, Evry, France).
Genomic libraries were prepared following the Illumina TruSeq protocol (Illumina, San Diego, California, USA). Briefly, 3 µg of each genomic DNA was fragmented by sonication and purified to yield fragments of 400–500 bp. Paired-end adaptor oligonucleotides from Illumina were ligated on repaired A-tailed fragments, then purified and enriched by PCR cycles. Each DNA library was then sequenced on an Illumina HiSeq 2000 as paired-end 100 bp reads. Image analysis and base calling were performed using Illumina Real Time Analysis Pipeline V.1.9 with default parameters.
The bioinformatics analysis of sequencing data was based on the Illumina pipeline (CASAVA 1.8). CASAVA performed multiseed and gapped alignments on the reference human genome hg19. Sequences with more than two mismatches were excluded, as were duplicated sequences corresponding to PCR amplification bias. Then, from the alignment, a list of read pairs not mapped at a nominal distance and orientation from each other was retained: read pairs with abnormal orientation for inversions and read pairs mapped on different chromosomes for translocations. Finally, only abnormalities supported by at least five independent read pairs were verified. If this analysis scheme was not sufficient to identify the breakpoints, six mismatches per sequence were tolerated (patients 2 and 4). Analysis focused on the chromosomes involved in the rearrangement.
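The selection logic described above can be illustrated with a minimal Python sketch. It is not the CASAVA pipeline: the `ReadPair` record, the `MAX_INSERT` and `WINDOW` values, and the fixed-bin grouping are assumptions made for illustration only. The threshold of five independent pairs is the one stated in the text.

```python
from collections import defaultdict
from dataclasses import dataclass

MIN_SUPPORT = 5    # minimum independent read pairs, as stated in the text
MAX_INSERT = 1000  # assumed upper bound for a "nominal" pair distance (bp)
WINDOW = 500       # assumed bin size used to group nearby pairs (bp)


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical minimal record for one aligned read pair."""
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str


def classify(pair):
    """Label a pair by the kind of discordance it shows, if any."""
    if pair.chrom1 != pair.chrom2:
        return "translocation"  # mates on different chromosomes
    if pair.strand1 == pair.strand2:
        return "inversion"      # abnormal relative orientation
    if abs(pair.pos2 - pair.pos1) > MAX_INSERT:
        return "distance"       # right orientation, abnormal spacing
    return "concordant"


def supported_events(pairs, min_support=MIN_SUPPORT):
    """Group discordant pairs into bins and keep well-supported bins."""
    clusters = defaultdict(set)
    for p in pairs:
        kind = classify(p)
        if kind == "concordant":
            continue
        key = (kind, p.chrom1, p.pos1 // WINDOW, p.chrom2, p.pos2 // WINDOW)
        clusters[key].add(p)  # a set drops exact duplicate pairs
    return {k: len(v) for k, v in clusters.items() if len(v) >= min_support}
```

Fixed bins can split a cluster that straddles a bin boundary, so a production tool would merge neighbouring pairs by distance instead. The sketch only shows why a breakpoint appears as a small group of independent, consistently discordant pairs.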
PCR amplification and Sanger sequencing of junction fragments
Primer pairs were selected on each side of the breakpoint region delimited by NGS (primer sequences available on request). Junction fragments were amplified using the AmpliTaq Gold kit (Applied Biosystems, Foster City, California, USA) according to the following protocol: 80 ng DNA was mixed with 2.5 mM MgCl2, 0.2 mM dNTP mix, 0.5 µM forward primer, and 0.5 µM reverse primer in a final volume of 50 µl, and incubated with an initial denaturation for 10 min at 96°C, followed by 35 cycles of denaturation for 1 min at 96°C, hybridisation for 1 min at 60°C, and elongation for 1 min at 72°C, with a final elongation phase for 10 min at 72°C. DNA from a control who did not carry a chromosomal rearrangement was amplified at the same time as the patient's DNA as a negative control. DNAs were also amplified with the primer pair for the MLL2 gene (exons 24-25) as a positive control. PCR products were verified on 2% agarose gel (Invitrogen, Life Technologies, Paisley, UK). Specific products corresponding to the junction fragment were then sequenced by the Sanger method (Genoscreen, Lille, France).
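As a convenience, the cycling programme above can be written down as data, which makes the total programmed hold time easy to check. The sketch below is illustrative: the `Step` class and the function name are hypothetical, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One hold in the thermocycler programme (hypothetical helper)."""
    name: str
    minutes: float
    celsius: int


# Values taken from the protocol described in the text.
INITIAL = Step("initial denaturation", 10, 96)
CYCLE = [
    Step("denaturation", 1, 96),
    Step("hybridisation", 1, 60),
    Step("elongation", 1, 72),
]
FINAL = Step("final elongation", 10, 72)
N_CYCLES = 35


def hold_minutes(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, excluding ramping between steps."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


total = hold_minutes(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"Programmed hold time: {total} min")  # 10 + 35 * 3 + 10 = 125 min
```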
BAC clones spanning the breakpoint region as defined by the NGS results were selected through the UCSC (University of California, Santa Cruz) genome browser. They were either commercially available (RP11-154H17, RP11-7L24, RP1-224A6, RP11-102B19, RP11-366J15, RP11-113L16, RP11-7H15) (BlueFish, BlueGnome, Cambridge, UK) or were FITC (fluorescein isothiocyanate) or TRITC (tetramethyl rhodamine isothiocyanate) labelled by nick-translation (RP11-116D18, RP11-48B3) and hybridised on metaphase spread with appropriate control probe as previously described.22
Results
For patient 1, NGS yielded an 11.9X physical coverage (figure 1). The translocation t(1;18) was represented by five read pairs, four corresponding to the derivative (der)(1) and one corresponding to the der(18). This made it possible to delineate the breakpoint on chromosome 1 in a 179 bp region at 1p36.31 (chr1:5 737 217–5 737 396) and the chromosome 18 breakpoint in a 324 bp region at 18q21.2 (chr18:53 056 082–53 056 406). The breakpoints were consistent with the cytogenetic findings. Breakpoints were verified by FISH using RP11-156H17 (1p36.31) and RP11-7L24 (18q21) BAC clones. Each probe showed a split signal on both derivative chromosomes. Sanger sequencing of junction fragments identified the breakpoint on chromosome 1 between positions chr1:5 737 385–5 737 386 and the breakpoint on chromosome 18 between positions chr18:53 056 346–53 056 347. Both breakpoints were located in repetitive elements. The rearrangement generated a C base gain on der(18) and disrupted the TCF4 gene (intron 6) on 18q21. No gene was disrupted on der(1).
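For readers unfamiliar with the term, physical coverage is commonly computed from the full span of each sequenced fragment rather than from the sequenced bases alone. The sketch below uses that common definition, which is an assumption here since the text does not spell out the formula used. The genome size of 3.1 Gb and the 450 bp mean fragment length (the midpoint of the 400–500 bp range in the methods) are also assumptions, and the number of read pairs is derived for illustration, not reported in the study.

```python
GENOME_BP = 3.1e9      # assumed approximate size of the human genome
MEAN_FRAGMENT_BP = 450  # assumed midpoint of the 400-500 bp fragment range
READ_BP = 100           # paired-end read length stated in the methods


def physical_coverage(n_pairs, fragment_bp=MEAN_FRAGMENT_BP, genome_bp=GENOME_BP):
    """Average number of fragments spanning a given genomic position."""
    return n_pairs * fragment_bp / genome_bp


def sequence_coverage(n_pairs, read_bp=READ_BP, genome_bp=GENOME_BP):
    """Average number of sequenced bases covering a given position."""
    return 2 * n_pairs * read_bp / genome_bp


def pairs_for_physical_coverage(target, fragment_bp=MEAN_FRAGMENT_BP, genome_bp=GENOME_BP):
    """Number of read pairs needed to reach a target physical coverage."""
    return target * genome_bp / fragment_bp


n = pairs_for_physical_coverage(11.9)
print(f"pairs needed: {n:.3g}")
print(f"sequence coverage at that depth: {sequence_coverage(n):.2f}X")
```

Under these assumptions, physical coverage exceeds sequence coverage because the unsequenced middle of each fragment still spans the genome, which is what matters for detecting a breakpoint lying between the two mates.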
For patient 2, NGS yielded an 11.8X physical coverage (see online supplementary figure S1). Eighteen aberrant read pairs were found: 12 corresponding to der(1), four corresponding to der(7), and two corresponding to der(11). Moreover, it displayed three additional abnormally oriented read pairs, corresponding to a 180 kb inversion of chromosome 11 occurring at the der(1) breakpoint, not visible on karyotype. This allowed us to delineate the breakpoint in a 276 bp region on chromosome 1 (chr1:22 441 328–22 441 603), in a 184 bp region on chromosome 7 (chr7:140 121 612–140 121 795), in a 302 bp region on chromosome 11 (translocation) (chr11:70 176 760–70 177 061), and in a 90 bp region on chromosome 11 (inversion) (chr11:70 359 938–70 360 027). Breakpoints were verified by FISH using RP1-224A6 (1p36.32), RP11-366J15 (7q34), RP11-102B19 (11q13.3), and RP11-113L16 (11q13.3) BAC clones. They all showed a split signal confirming the NGS result except for RP11-102B19, which was located only on derivative 1. This result was consistent with a small inversion event that could not be resolved by FISH. Sanger sequencing of junction fragments identified the breakpoint on chromosome 1 between positions chr1:22 441 561–22 441 564, the breakpoint on chromosome 7 between positions chr7:140 121 762–140 121 766, the breakpoint on chromosome 11 between positions chr11:70 177 030–70 177 031, and the inversion breakpoint on chromosome 11 between positions chr11:70 360 035–70 360 036. Three of these breakpoints were located in repetitive elements. The rearrangement generated a 2 bp deletion on chromosome 1 (TC), a 3 bp deletion on chromosome 7 (GAT), and a 25 bp gain on der(7), of which 20 bp came from an intergenic sequence of chromosome 18 lying 11 bp from a LINE element (see online supplementary figure S2). This complex rearrangement disrupted three genes: RAB19 (intron 3) in 7q34, PPFIA1 (intron 8) in 11q13.3, and SHANK2 (intron 14) in 11q13.3.
For patient 3, NGS yielded a 12.7X physical coverage and identified 12 abnormal read pairs corresponding to the inversion, five corresponding to the proximal breakpoint, and seven corresponding to the distal breakpoint (see online supplementary figure S1). This made it possible to delineate the breakpoints in a 110 bp region in 8q21.13 (chr8:81 388 743–81 388 852) and in a 119 bp region in 8q24.3 (chr8:142 993 498–142 993 616). Only the proximal breakpoint could be verified by FISH using the RP11-48B3 (8q21.13) BAC clone, since there was no clone available for the distal breakpoint. Sanger sequencing of junction fragments identified the proximal breakpoint between positions chr8:81 388 849–81 388 853 and the distal breakpoint between positions chr8:142 993 610–142 993 617 (see online supplementary figure S2). The proximal breakpoint occurred in a zone of 4 bp microhomology and resulted in a 6 bp deletion, a GAT deletion in 8q24, and a GGG deletion either on 8q21 or 8q24 (see online supplementary figure S2). Both breakpoints lay in repetitive elements. No gene was disrupted by the rearrangement.
For patient 4, NGS yielded an 11.1X physical coverage and identified six abnormally oriented read pairs, corresponding to the proximal breakpoint of the chromosome 11 inversion (see online supplementary figure S1). This allowed us to localise the proximal breakpoint in a 400 bp region in 11p15.4 (chr11:2 850 661–2 851 061) and the distal breakpoint in a 400 bp region in 11q13.3 (chr11:68 755 325–68 755 725). Breakpoints were verified by FISH using RP11-116D18 (11p15.4) and RP11-7H15 (11q13.3) BAC clones. Sanger sequencing of junction fragments defined the proximal breakpoint between positions chr11:2 850 769–2 850 770 and the distal breakpoint between positions chr11:68 755 472–68 755 475 (see online supplementary figure S2). Both breakpoints lay within repetitive elements. The rearrangement generated a 12 bp gain of unknown origin on 11p15.4 and a 2 bp deletion (AG) on 11q13.3 (see online supplementary figure S2). It disrupted the KCNQ1 gene (intron 15) in the BWS region.
In summary, NGS identified all nine breakpoints at the molecular level in intervals ranging from 90 to 400 bp and allowed us to uncover an additional cryptic rearrangement not visible on karyotype. These breakpoints were confirmed by FISH when probes were available and were fully characterised at the molecular level by Sanger sequencing. Nine out of 10 breakpoints occurred in repetitive sequences and five genes were disrupted (table 1).
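The agreement between the NGS windows and the Sanger junctions can be checked mechanically. The sketch below does this for four of the breakpoints, using coordinates copied from the text (hg19). The dictionary layout, the function name, and the `slack` parameter are illustrative assumptions.

```python
# (start, end) of the NGS-delimited window for selected breakpoints.
NGS_WINDOW = {
    "patient 1 chr1": (5_737_217, 5_737_396),
    "patient 1 chr18": (53_056_082, 53_056_406),
    "patient 2 chr11 inversion": (70_359_938, 70_360_027),
    "patient 4 chr11 proximal": (2_850_661, 2_851_061),
}

# (start, end) of the junction positions reported by Sanger sequencing.
SANGER_JUNCTION = {
    "patient 1 chr1": (5_737_385, 5_737_386),
    "patient 1 chr18": (53_056_346, 53_056_347),
    "patient 2 chr11 inversion": (70_360_035, 70_360_036),
    "patient 4 chr11 proximal": (2_850_769, 2_850_770),
}


def junction_in_window(window, junction, slack=0):
    """True if the junction lies inside the window, widened by `slack` bp."""
    return window[0] - slack <= junction[0] and junction[1] <= window[1] + slack


for name, window in NGS_WINDOW.items():
    junction = SANGER_JUNCTION[name]
    strict = junction_in_window(window, junction)
    relaxed = junction_in_window(window, junction, slack=10)
    print(f"{name}: strict={strict}, within 10 bp={relaxed}")
```

With the coordinates as printed, the patient 2 inversion junction falls a few base pairs beyond the reported window and passes only when a small slack is allowed. The other three junctions fall strictly inside their windows.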
Discussion
We used next generation paired-end sequencing to characterise breakpoints of four ABCR at the molecular level. Contrary to FISH, whose resolution may be insufficient to conclude gene disruption,10,23 this technique is a rapid and precise way to map breakpoints of chromosomal rearrangements. In this study, it defined the breakpoints within intervals of a few hundred base pairs in a single step, which was sufficient to conclude gene disruption. Moreover, NGS appears to be a reliable method since we confirmed the results by two independent techniques, FISH and PCR followed by Sanger sequencing. Previous studies used different protocols for DNA library preparation, including chromosome sorting,16 whole genome paired-end sequencing,20 mate-pair library,17,18,20 custom jumping library20 or capture of breakpoints.19,20 They all proved to be efficient. However, the yield of the capture method, which needs information from prior molecular cytogenetic cloning, was inferior. This could be explained by the fact that repeated sequences, where breakpoints are often located, are underrepresented in the capture kit.20 Karyotype remains important for the interpretation of NGS data in ABCR breakpoint cloning. In this study, we focused on the chromosomes involved in the rearrangements and did not analyse other structural variants in the genome.15
The combination of NGS and Sanger sequencing made it possible to define the breakpoints at the base pair level and provided the opportunity to better understand the mechanisms of chromosomal rearrangements. Breakpoints occurred in repetitive sequences in nine out of 10 cases. Two previous larger studies showed that involvement of repetitive elements occurred in 44% (8/18)14 and 39% (55/141)24 of breakpoints and is not uncommon. Lack of homology and the presence of small gains or losses of nucleotides are in favour of a non-homologous end-joining mechanism (patients 1, 2 and 4).25 In patient 3, the presence of a four base pair microhomology could argue in favour of a replication based mechanism such as fork stalling and template switching (FoSTeS)26 or microhomology-mediated break-induced replication.27 Moreover, in patient 2, NGS uncovered additional complexity, revealing a cryptic 180 kb inversion of chromosome 11 at the junction of der(1). A recent study estimated that 19.2% of two-breakpoint ABCR in fact involved three or more breakpoints and that many of them were associated with an inverted segment at the breakpoint junction.24
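The junction features used in this reasoning (microhomology, inserted bases, or a blunt join) can be read off a Sanger junction sequence by comparing it with the two reference flanks. The sketch below is a simplified illustration, not the procedure used in the study: the function names are hypothetical and the sequences are toy examples, not the patients' junctions. It assumes `ref_a` starts at the first base of the junction read and `ref_b` ends at its last base.

```python
def matching_prefix(a, b):
    """Number of leading positions at which two sequences agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def matching_suffix(a, b):
    """Number of trailing positions at which two sequences agree."""
    return matching_prefix(a[::-1], b[::-1])


def junction_signature(junction, ref_a, ref_b):
    """Classify a junction as microhomology, insertion or blunt.

    The junction read is matched to ref_a from its start and to ref_b
    from its end. If the two matches overlap, the shared bases are
    microhomology. If they leave a gap, the gap is inserted sequence.
    """
    p = matching_prefix(junction, ref_a)
    s = matching_suffix(junction, ref_b)
    overlap = p + s - len(junction)
    if overlap > 0:
        return ("microhomology", junction[len(junction) - s:p])
    if overlap < 0:
        return ("insertion", junction[p:len(junction) - s])
    return ("blunt", "")


# Toy sequences only: "GG" is present in both flanks at the join.
print(junction_signature("ACGTACGGTTCA", "ACGTACGGAAAA", "TTTTTTGGTTCA"))
# Toy sequences only: "TT" matches neither flank.
print(junction_signature("ACGTACTTGGTTCA", "ACGTACAAAAAAAA", "CCCCCCCCGGTTCA"))
```

Chance matches of one or two bases can inflate the apparent microhomology, so short overlaps should be interpreted with caution.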
Breakpoint mapping of ABCR is a way to identify new candidate genes in MCA/ID. Studies of large cohorts of ABCR patients by FISH estimated that gene disruption occurred in at least 45–52% of patients with abnormal phenotypes.10,23 This result was supported by the study of 38 patients presenting with ABCR and autism spectrum disorder or neurodevelopmental disorder (ASD/NDD) by NGS, which identified gene disruption in 83% of the cases.21 Among the disrupted genes, 58% could be related to the patients’ phenotype. In the present work, disruption of a known OMIM gene was considered to be responsible for the patient phenotype since the separation of the 5′ and 3′ parts of the gene would prevent its transcription (patients 1 and 2). If the disrupted gene could not account for the phenotype or if no gene was disrupted, position effect on neighbouring genes was studied using literature and genome database information (patients 3 and 4). Validation by reverse transcriptase quantitative PCR (RT-qPCR) to evaluate mRNA levels could not be performed in any of the patients. Gene disruption was observed in three out of four patients, representing 50% of breakpoints and involving five genes: TCF4 (patient 1), SHANK2, PPFIA1, RAB19 (patient 2) and KCNQ1 (patient 4). At least three genes could account for the phenotype.
In patient 1, the TCF4 gene, which encodes a helix loop helix transcription factor (MIM 602272) highly expressed during development in the central nervous system and other tissues, was disrupted. Haploinsufficiency of TCF4 is responsible for the Pitt–Hopkins syndrome (MIM 610954).28 This syndrome is characterised by constant and severe ID, absent language, hypotonia, stereotypic movements, a smiling appearance, hyperventilation, facial dysmorphism (deep set eyes, midface protrusion, large mouth), single palmar crease, strabismus, seizures, and abnormal brain imaging,29 consistent with the phenotype of patient 1. Although microcephaly is not a typical feature of Pitt–Hopkins syndrome, it has been described in 7% of cases.29 This syndrome is secondary to deletions or point mutations of the TCF4 gene in most cases,28 but TCF4 disruption by translocation breakpoints has also been documented.30,31
Among the three genes disrupted in patient 2, SHANK2 (MIM 603290) codes for a scaffolding protein localised at the post-synaptic sites of glutamatergic synapses and belongs to the same family as SHANK3 (MIM 606230) involved in Phelan–MacDermid syndrome (MIM 606232). Recently it has been proposed that SHANK2 might act as a susceptibility gene for ASD. Indeed, SHANK2 de novo deletions have been identified in patients with ID/ASD.32,33 SHANK2 variants, affecting conserved amino acids and associated with synapse density alteration, are also significantly more frequent in patients with ASD.33 Here we describe the first case of SHANK2 disruption by translocation breakpoint associated with ID/ASD. It has also been suggested that alteration of SHANK2 could act in an epistatic manner with other loci to induce ASD.33 Interestingly, among the two other disrupted genes, PPFIA1 (MIM 611054) codes for a protein belonging to the liprin-α family that may play a role in cell–matrix interactions, particularly during synaptic formation and function,34,35 and may participate in the patient phenotype. The role of the RAB19 gene that codes for a small GTP binding protein from the ras oncogene family is unclear.36
Patient 4 was diagnosed with BWS (MIM 130650). It is an overgrowth disorder clinically characterised by macrosomia, facial dysmorphism with macroglossia, ear lobe creases, omphalocele, visceromegaly, adrenocortical cytomegaly, hemihyperplasia, and an increased risk of embryonal tumour, especially Wilms tumour.37 It may also be associated with placental mesenchymal dysplasia.38 This syndrome is due to a variety of genetic and/or epigenetic alterations resulting in expression deregulation of imprinted genes in the 11p15.5 region (IGF2/H19 and KCNQ1OT1/CDKN1C). Maternally transmitted inversions or translocations are described in <1% of cases but the exact mechanism leading to the BWS phenotype is not yet clearly understood. In these cases, a decreased expression level of the maternally expressed CDKN1C gene has been observed, without a methylation anomaly of the imprinting centre IC2.39 Recently an enhancer model for control of the CDKN1C locus by IC2 has been proposed.40 Methylation of IC2 on the maternal allele would prevent insulator formation and allow a distant enhancer, located between exon 3 and exon 15 of the KCNQ1 gene, to activate maternal CDKN1C expression. According to this model, the proximal breakpoint of patient 4, located in intron 15 of KCNQ1, would separate CDKN1C from its distant enhancer and alter its expression, thus contributing to the phenotype.
In patient 3, no gene was disrupted. However, the position effect on neighbouring genes up to 1 Mb has already been described.11 In this case, neighbouring regions contained 34 genes. Among them, ARC, HEY1, JRK, PTK2 and STMN2 may play a role in central nervous system development and function, but none of them has been reported to be responsible for disease in man. Further functional studies would be necessary to verify this hypothesis.
In conclusion, NGS is a powerful tool allowing rapid breakpoint cloning of ABCR at the molecular level. It will not only contribute to the understanding of the mechanism of ABCR and the identification of candidate genes in MCA/ID, but it will also improve the genetic management of patients carrying ABCR with abnormal phenotypes. We showed that in three out of four patients, gene disruption could account for the phenotype, allowing adapted genetic counselling and stopping unnecessary investigations. We also confirmed results from previous studies.21 We therefore propose that patients carrying ABCR with an abnormal phenotype should be explored systematically by NGS once a genomic imbalance has been excluded by array CGH. It will allow the diagnosis to be confirmed both in specific recognisable syndromes and in non-specific MCA/ID phenotypes. A large cohort study is necessary to confirm the clinical efficiency of this approach.
We would like to thank the families for their kind participation and for their continued interest in this study, Andra Postu for her kind proofreading, and Mélanie Letexier for her help in bio-informatic analysis.
Contributors CSB and DS designed the study, analysed all data and are guarantors. CSB performed part of the FISH and PCR experiments and wrote the paper. AL, GL, NBK analysed the data. MPC, MT, GN, HT, PE were involved in the patients’ care and evaluation. DR performed part of the FISH and PCR experiments. ED performed part of the FISH experiments. SR performed molecular analysis of 11p15 region (methylation studies and CDKN1C mutation screening).
Funding This work was supported by the GIS—Institute for Rare Diseases (France) grant number AMA11020CSA.
Competing interests None.
Ethics approval Ethical committee Lyon Est France.
Provenance and peer review Not commissioned; externally peer reviewed.
Internet resources UCSC genome browser: http://genome.ucsc.edu/cgi-bin/hgGateway.