Background: Recent molecular studies of breakpoints of recurrent chromosome rearrangements revealed the role of genomic architecture in their formation. In particular, segmental duplications representing blocks of >1 kb with >90% sequence homology were shown to mediate non-allelic homologous recombination (NAHR). However, the occurrence of the majority of newly detected submicroscopic imbalances cannot be explained by the presence of segmental duplications. Therefore, further studies are needed to investigate whether architectural features other than segmental duplications mediate these rearrangements.
Methods: We analysed a series of patients with breakpoints clustering within chromosome band 5q35. Using high density arrays and subsequent quantitative polymerase chain reaction (qPCR), we characterised the breakpoints of four interstitial deletions (including one associated with an unbalanced paracentric inversion), a duplication and a familial reciprocal t(5;18)(q35;q22) translocation.
Results and conclusion: Five of the breakpoints were located within an interval of ∼265 kb encompassing the RANBP17 and TLX3 genes. This region is also targeted by the recurrent cryptic t(5;14)(q35;q32) translocation, which occurs in ∼20% of childhood T cell acute lymphoblastic leukaemia (T-ALL). In silico analysis indicated the architectural features most likely to contribute to the genomic instability of this region, which was supported by our molecular data. Of further interest, in two patients and the familial translocation, the delineated breakpoint regions encompassed highly homologous LINEs (long interspersed nuclear elements), suggesting that NAHR between these LINEs may have mediated these rearrangements.
Chromosomal rearrangements such as deletions, duplications, translocations and inversions are a major cause of both constitutional and acquired human genetic disease. The finalisation of the human genome project and technological advances have resulted in an increasing rate of molecular characterisation of breakpoints of (predominantly recurrent) chromosomal aberrations. For many of the recurrent constitutional microdeletions and microduplications it was shown that breakpoint regions exhibit particular architectural features (that is, segmental duplications) that predispose to genomic instability. Clinical phenotypes caused by such genomic rearrangements that result in the gain, loss or disruption of a gene(s) for which dosage is critical were hence termed “genomic disorders”.1–3 Molecular characterisation of these genomic disorders has led to the identification of genes responsible for the associated phenotypes, thus contributing to the functional annotation of the human genome.
Despite this progress, the possible role of genome architecture in the formation of chromosomal rearrangement remains to be elucidated. Until now, two major mechanisms have been proposed, namely non-allelic homologous recombination (NAHR) and non-homologous end-joining (NHEJ). For NAHR, highly homologous sequences near the breakpoints such as segmental duplications (also known as low copy repeats or LCRs) serve as substrates. Typically such rearrangements are found in recurrent deletion/duplication syndromes.4 In contrast, for most non-recurrent rearrangements such homologous sequences are lacking and although NHEJ has been proposed as the preferred mechanism of aberration, the possible genomic architectural features mediating these rearrangements remain largely unknown. As genome scanning technologies are becoming more widely used and the number of analysed patients is growing steadily, new microdeletion/duplication syndromes are emerging for which breakpoints are not directly flanked by segmental duplications.5–10 Nevertheless, the clustering of breakpoints in these syndromes indicates that these specific regions are more prone to rearrangements, and therefore also implies the involvement of architectural features predisposing to genomic instability in which both NAHR and NHEJ might play a role.
In this study we investigated the apparent clustering of constitutional and acquired rearrangements at 5q35. To this purpose, we characterised the breakpoints of six constitutional aberrations (three interstitial deletions, one duplication, one unbalanced paracentric inversion with an interstitial deletion, and one reciprocal translocation) by high resolution array comparative genomic hybridisation (CGH) and quantitative polymerase chain reaction (qPCR) analysis.
Analysis of G-banded metaphase chromosomes was performed on short term lymphocyte cultures using standard procedures. Fluorescence in situ hybridisation (FISH) analyses were performed as described using bacterial artificial chromosome (BAC) clones provided by the Wellcome Trust Sanger Institute.11
Array CGH analysis for the translocation family 6 was performed on a 1 Mb BAC array as previously described.12 All other patients were hybridised on the Agilent Human Genome CGH Microarray 244K (patients 1, 2, 4, 5) or 105K (patient 3), according to the manufacturer’s instructions. In brief, 1 μg of genomic DNA was digested with AluI and RsaI for 2 h and subsequently labelled with Cy5 (patient) or Cy3 (control). After clean-up of the labelled fragments using Microcon YM-3 filter units (Millipore, Brussels, Belgium), they were pooled and 50 μl Cot-1 DNA (1 mg/ml), 10X blocking agent and 2X hybridisation buffer were added. This mixture was hybridised on the microarrays for 40 h at 65°C. After washing, the slides were scanned using an Agilent Microarray Scanner, quantified with Feature Extraction software 9.5.1 and analysed in CGH analytics 3.4.27 (Agilent, Diagem, Belgium). Data were stored and are accessible at http://medgen.ugent.be/arrayCGHbase/.
In silico analysis
“BLAST 2 Sequences” (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi)13 was used to investigate the repeat content of the breakpoint region in relation to its surrounding sequences. A flanking region of 8 Mb (4 Mb on each side of the breakpoint region) was analysed. First, the BLAST output was filtered by discarding all self–self alignments. Next, for each single base, the number of alignments in which it occurs was counted. This resulted in a set of 8 million data points. To reduce the amount of data, the region was divided into 100 kb windows and the data points in each window were averaged. This average frequency was then plotted against the corresponding genomic position.
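The counting and windowing steps described above can be illustrated with a minimal sketch. It is not the script used in this study: the tuple layout of the hits, the function names and the toy coordinates are assumptions made for illustration, and only the query coordinates of each alignment are counted.

```python
from typing import List, Tuple

# Hypothetical hit layout: (query_start, query_end, subject_start, subject_end),
# 0-based and end-exclusive. Real BLAST output would need parsing first.
Hit = Tuple[int, int, int, int]


def drop_self_hits(hits: List[Hit]) -> List[Hit]:
    """Discard alignments of a segment against itself (self-self hits)."""
    return [h for h in hits if (h[0], h[1]) != (h[2], h[3])]


def per_base_counts(region_length: int, hits: List[Hit]) -> List[int]:
    """For every base, count the alignments whose query interval covers it."""
    diff = [0] * (region_length + 1)
    for q_start, q_end, _, _ in hits:
        diff[q_start] += 1   # coverage rises at the start of the alignment
        diff[q_end] -= 1     # and falls just past its end
    counts, running = [], 0
    for i in range(region_length):
        running += diff[i]
        counts.append(running)
    return counts


def window_averages(counts: List[int], window: int) -> List[float]:
    """Average the per-base counts in consecutive, non-overlapping windows."""
    averages = []
    for start in range(0, len(counts), window):
        chunk = counts[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Toy region of 20 bases with 5-base windows (the study used 8 Mb and 100 kb).
toy_hits = [(0, 20, 0, 20), (2, 6, 12, 16), (4, 8, 15, 19)]
toy_counts = per_base_counts(20, drop_self_hits(toy_hits))
toy_averages = window_averages(toy_counts, 5)
```

With an 8 Mb region and 100 kb windows the same functions would return 80 window averages, one per plotted position.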
The genomic DNA local alignment similarity search tool YASS (http://bioinfo.lifl.fr/yass/)14 allowed us to search for longer similarities between the breakpoint regions and provided positional information on the alignments through a dot plot.
Fine mapping of breakpoints
Breakpoint regions were fine mapped by iterative walking qPCR with primers closing in on the breakpoint. Detection of copy number changes was performed as described.15 All primers were designed with Primer Express 2.0 software and evaluated in silico with RTPrimerDB (http://medgen.ugent.be/rtprimerdb/),16 17 which includes BLAST analysis, UCSC in silico PCR and secondary structure evaluation (mfold).
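The logic of iterative walking can be pictured as repeatedly halving the interval between an amplicon with normal copy number and one showing the deletion. The sketch below is an idealisation under stated assumptions: `is_deleted` is a hypothetical stand-in for a qPCR assay at a given position, the coordinates are invented, and in practice primer positions are constrained by the repeat content rather than chosen at exact midpoints.

```python
from typing import Callable, Tuple


def narrow_breakpoint(
    normal_pos: int,
    deleted_pos: int,
    is_deleted: Callable[[int], bool],
    min_interval: int,
) -> Tuple[int, int, int]:
    """Halve the interval between a normal and a deleted position.

    Returns (last normal position, first deleted position, assays used).
    """
    normal, deleted = normal_pos, deleted_pos
    assays = 0
    while abs(deleted - normal) > min_interval:
        mid = (normal + deleted) // 2
        assays += 1
        if is_deleted(mid):      # assay shows a copy number loss
            deleted = mid
        else:                    # assay shows normal copy number
            normal = mid
    return normal, deleted, assays


# Invented example: a proximal deletion breakpoint at 170,500,000 bp,
# starting from a 41 kb array-defined interval.
TRUE_BREAKPOINT = 170_500_000
result = narrow_breakpoint(
    normal_pos=170_480_000,
    deleted_pos=170_521_000,
    is_deleted=lambda pos: pos >= TRUE_BREAKPOINT,
    min_interval=8_500,
)
```

In this toy case three assays reduce the 41 kb interval to about 5 kb; each additional assay halves the remaining interval.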
qPCR reaction mixtures contained 2x SYBR Green I buffer (Roche, Brussels, Belgium), 250 nM of both forward and reverse primer and 10 ng of template DNA in a total reaction volume of 7.5 μl. Each assay consisted of a no-template control, reference human genomic DNA, CLB-GA (a neuroblastoma cell line with 5q loss) DNA as positive control for deletion, and patient DNA. To account for possible variation related to DNA input amounts or the presence of PCR inhibitors, three reference amplicons (VHL, GPR15 and an intergenic sequence on 5q31.2) with normal copy number were simultaneously quantified for each patient sample. Amplification reactions were run on a Roche LC480 system with the following cycling conditions: 95°C for 5 min followed by 45 cycles of 95°C for 10 s, 60°C for 30 s, and 72°C for 1 s. After PCR amplification, a melting curve was generated for each amplicon by continuous heating from 60°C to 95°C to check the specificity of the PCR reaction. The obtained qPCR data were normalised and analysed using qBase (http://medgen.ugent.be/qbase/).18
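As an illustration of how normalisation against several reference amplicons yields a copy number estimate, the following sketch applies a simple delta-Cq calculation. It is not the qBase implementation: it assumes an amplification efficiency of 2 for every amplicon, uses the geometric mean of the reference amplicons as normalisation factor, and takes the reference genomic DNA as a two-copy calibrator. All Cq values are invented.

```python
import math
from typing import Sequence


def relative_quantity(cq_sample: float, cq_calibrator: float,
                      efficiency: float = 2.0) -> float:
    """Quantity of an amplicon in the sample relative to the calibrator."""
    return efficiency ** (cq_calibrator - cq_sample)


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def estimated_copy_number(
    target_cq: float,
    reference_cqs: Sequence[float],
    calibrator_target_cq: float,
    calibrator_reference_cqs: Sequence[float],
    efficiency: float = 2.0,
) -> float:
    """Copy number of the target, assuming the calibrator carries two copies."""
    target_rq = relative_quantity(target_cq, calibrator_target_cq, efficiency)
    reference_rqs = [
        relative_quantity(sample, calibrator, efficiency)
        for sample, calibrator in zip(reference_cqs, calibrator_reference_cqs)
    ]
    return 2.0 * target_rq / geometric_mean(reference_rqs)


# Invented Cq values for the three reference amplicons
# (VHL, GPR15, intergenic 5q31.2) in the calibrator and in a patient
# sample with slightly lower DNA input (all Cq values shifted by +0.5).
calibrator_refs = [24.0, 23.5, 25.2]
patient_refs = [24.5, 24.0, 25.7]

# Target amplicon one further cycle later in the patient: heterozygous loss.
deletion_cn = estimated_copy_number(26.5, patient_refs, 25.0, calibrator_refs)
```

Because the input difference shifts target and reference amplicons equally, it cancels out in the ratio, and the remaining one-cycle delay of the target corresponds to a single copy.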
Long range PCR
In an attempt to sequence the breakpoints, long range PCR was performed using the iProof High-Fidelity PCR Kit (Bio-Rad, Nazareth, Belgium) according to the manufacturer’s instructions. In brief, PCR reaction mixtures contained 5x iProof HF buffer, 200 μM each dNTP, 0.5 μM of both forward and reverse primer, 0.02 U/μl iProof DNA polymerase, and 10 ng template DNA. The following cycling conditions were used on a PTC-200 thermal cycler: 98°C for 40 s followed by 35 cycles of 98°C for 10 s, 60°C for 20 s and 72°C for 5 min. Final extension was performed at 72°C for 10 min. PCR products were run on a 1% agarose gel to verify that the reaction was successful before proceeding to the next step.
In total, six constitutional aberrations were investigated at the molecular level, five of which have been previously reported at low resolution.12 19–23 This study was approved by the ethics committee of Ghent University Hospital and appropriate informed consent was obtained from patients. An overview of the main clinical findings and karyotypes is given in table 1.
In order to search for the cause of the apparent clustering of chromosome 5q35 rearrangements in five patients with copy number alterations, molecular characterisation of breakpoints was performed using the Agilent 105K (patient 3) and 244K (patients 1, 2, 4 and 5) oligo-array CGH platform. In addition, we fine mapped the breakpoints in the t(5;18) translocation family to 5q35.2 and 18q22.3 using a 1 Mb BAC array on carriers of the unbalanced translocation. Raw array CGH data are available at http://medgen.ugent.be/arrayCGHbase/ (experiments, project Buysse et al). FISH analysis in a balanced carrier revealed a split signal for full-tiling path clones RP11-417I20 on 5q35.2 and RP11-117L4 on 18q22.3 (result not shown).
Figure 1A shows the breakpoint positions, which could be mapped with a resolution of 8 to 41 kb (25 kb on average). All breakpoints at 5q35.1 were located within a region of ∼265 kb (170.448–170.714 Mb; NCBI build 36.1) which includes the 3′ end of the RANBP17 gene and the entire TLX3 gene (T-cell leukaemia homeobox 3 gene) (fig 1B). The proximal breakpoints in patients 1, 3, 4 and 5 disrupt RANBP17 while the distal breakpoint of the second patient lies 70 kb downstream of this gene and 56 kb downstream of TLX3. The proximal breakpoint of patient 3 overlaps with those of patients 1 and 4. Moreover, the distal breakpoints of patients 1 and 3 overlap (fig 1A,B).
Interestingly, the proximal breakpoint cluster region is also involved in the cryptic t(5;14)(q35;q32) translocation which occurs in ∼20% of patients with childhood T cell acute lymphoblastic leukaemia (T-ALL).24 This recurrent rearrangement juxtaposes the RANBP17-TLX3 region on chromosome 5 to the BCL11B locus on chromosome 14. Comparison of the positions of previously mapped t(5;14) breakpoints25–28 with the constitutional 5q35.1 breakpoints characterised in this study showed breakpoint clustering within the ∼265 kb segment (fig 1B).
The observed co-localisation of breakpoints in both constitutional and acquired rearrangements is suggestive of an underlying genomic cause leading to instability. As a first step to evaluate this hypothesis, the breakpoint regions were investigated for the presence of known segmental duplications (>1 kb in length with >90% similarity29) using the UCSC Genome Browser (http://genome.ucsc.edu/cgi-bin/hgGateway) (fig 1B). Only interchromosomal segmental duplications were found, excluding the involvement of NAHR between LCRs as the recombination mechanism. The next step was to investigate whether the breakpoint region is enriched in other repeats in comparison with its surrounding sequence. A “BLAST 2 Sequences” analysis (see Methods) showed that the breakpoint region contains a higher number of alignments, as demonstrated by the steep increase in the cumulative curve (black line) around 170.5 Mb (fig 2).
To investigate the repeat content of the individual rearrangements in further detail, high resolution mapping by qPCR was performed for proximal and distal breakpoint regions in all five patients and the familial translocation. Figure 3 gives a representative example of the fine mapping of the proximal breakpoint in patients 1 and 3. Table 2 gives an overview of the size of the delineated breakpoint regions together with the percentage and type of repeats as determined by RepeatMasker software (http://www.repeatmasker.org/). Refinement was achieved to the point where the repetitive nature of the breakpoint regions hampered primer design. Interestingly, both proximal and distal breakpoint regions in patients 1 and 3 were identical. Moreover, the distal 8.5 kb breakpoint of these patients coincided with the 5q35 familial translocation breakpoint (fig 1A). YASS analysis of the proximal and the distal breakpoint regions of patients 1 and 3 revealed an alignment of 6 kb with 96.7% sequence similarity (fig 4A). Upon evaluation in RepeatMasker, the respective 6 kb sequences were identified as LINEs (long interspersed nuclear elements) of the L1PA2 subfamily at the proximal breakpoint and of the L1HS subfamily at the distal breakpoint (fig 1C,D). Furthermore, the reciprocal 18q22.3 breakpoint region from the familial translocation encompassed a 6 kb L1HS LINE element with 98.5% identity to the L1HS LINE in the 5q35.2 breakpoint. The YASS dot plot for this alignment is shown in fig 4B. Unfortunately, long range PCR to sequence the breakpoints failed, even in the duplication patient, probably due to the presence of these repeats. This problem has also been encountered in similar breakpoint sequencing projects.
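To make concrete what a similarity percentage over an alignment expresses, the sketch below computes percentage identity for two gapped, equal-length aligned sequences. The definition used here (identical columns divided by all alignment columns, gaps counted as mismatches) is an assumption for illustration; the exact measure reported by YASS may be defined differently, and the toy sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Both inputs must be rows of the same alignment, with '-' marking gaps.
    Gap columns count towards the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Invented 10-column alignment with one mismatch and one gap.
toy_identity = percent_identity("ACGTACGTAC", "ACGTTCGT-C")
```

Under this definition, 96.7% identity over a 6 kb alignment corresponds to roughly 200 differing columns among about 6000.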
In this study we have mapped breakpoints in patients with interstitial deletions, a duplication, a paracentric inversion and a familial translocation involving chromosome band 5q35. Five of these constitutional breakpoints map within a ∼265 kb region on 5q35.1 (170.448–170.714 Mb). This region seems to be particularly prone to rearrangements as it also comprises acquired breakpoints of the recurrent t(5;14)(q35;q32) translocation in T-ALL (fig 1B).25–28 In search of structural elements that may trigger the formation of these rearrangements, in silico analyses were performed with a focus on interspersed repeats. The “BLAST 2 Sequences” analysis indicated an enrichment of alignments in and close to the breakpoint region in comparison with the surrounding sequences. This supports the possible involvement of architectural features that confer genomic instability on the region, leading to a higher percentage of double strand breaks. The fact that eight of the 12 breakpoint positions could not be refined by qPCR because of the high number of interspersed repeats, and that long range PCR was unsuccessful, further strengthens this hypothesis. Taken together, our data suggest that the co-occurrence of constitutional and acquired breakpoints is mediated by underlying architectural features leading to genomic instability. A similar observation was recently made for the chromosomal region on Xp11.23. The SSX1 gene involved in recurrent t(X;18)(p11.2;q11.2) translocations in synovial sarcomas30 was also shown to be implicated in constitutional aberrations31 32 and small copy number variations (CNVs) (http://projects.tcag.ca/variation/). Co-localisation of constitutional and acquired breakpoints also occurs at 16p13.3. Constitutional microdeletions and translocations involving the CREBBP gene cause Rubinstein–Taybi syndrome,33–36 while t(8;16)(p11;p13) and t(11;16)(q23;p13) translocations are associated with an infrequent form of acute myeloid leukaemia.37–39 These data suggest that similar mechanisms mediate the formation of constitutional and certain acquired rearrangements and that genomic architecture plays an essential role in this process.
Following high resolution qPCR analysis, we identified an identical deletion in patients 1 and 3 with proximal and distal breakpoints located within ∼8 kb segments containing 6 kb highly homologous LINEs of two different subfamilies (L1PA2 and L1HS, respectively, for the proximal and distal segment). The distal deletion breakpoint also coincided with the translocation breakpoint in a family with a constitutional t(5;18)(q35.2;q22.3) translocation. Moreover, the reciprocal translocation breakpoint region at 18q22.3 encompassed a similar 6 kb LINE of the L1HS subfamily. We therefore assume that more than 96% sequence identity over a region of 6 kb most likely induced non-allelic homologous recombination in these patients. However, attempts to sequence the breakpoints in order to confirm this hypothesis failed. Complex genomic architecture hampering breakpoint delineation has also been observed in the MECP2 region at Xq28.40 Moreover, Woodward et al mentioned that attempts to find breakpoint junctions by long range PCR failed for unknown reasons in eight patients carrying a PLP1 duplication.41 This illustrates that cloning and sequencing of such repeat-rich breakpoints remains challenging and that technological improvements will be necessary to achieve this goal. Recently, Lee et al proposed a replication based mechanism termed FoSTeS (Fork Stalling and Template Switching) to explain complex duplication rearrangements associated with Pelizaeus–Merzbacher disease.42 We cannot exclude that this or other replication based mechanisms such as break induced replication (BIR) are responsible for the deletions in our patients, with the LINE-1 elements serving as regions of microhomology.43
The majority of newly detected structural rearrangements cannot be explained by non-allelic homologous recombination (NAHR) between segmental duplications.
We observed a clustering of both constitutional and acquired rearrangements in a ∼265 kb interval at 5q35. In order to elucidate the mechanism of rearrangement, we performed in silico and molecular characterisation of the breakpoints of four deletions, one duplication and one translocation.
Our data suggest that architectural features confer genomic instability on this region. In particular, NAHR between highly homologous (>96%) LINE-1 elements seems to have mediated the rearrangement in at least three of our patients.
In summary, we observed clustering of constitutional and acquired breakpoints within a ∼265 kb segment on chromosome 5q35.1. Our in silico and molecular data suggested that architectural features confer genomic instability, leading to a higher incidence of breaks. Moreover, we identified an identical deletion in two patients as well as a familial translocation with breakpoint regions comprising highly homologous 6 kb LINE-1 elements, suggestive of NAHR as the mechanism of formation. To elucidate the particular role of LINEs in the occurrence of rearrangements, molecular characterisation of aberrations not flanked by segmental duplications seems therefore necessary. New research strategies may arise from our findings, such as screening of recurrent tumour specific breakpoints for the presence of constitutional anomalies or targeted search for aberrations in genomic regions in which NAHR between homologous LINEs is possible. It is our contention that such targeted screenings will lead to the discovery of yet undetected recurrent constitutional rearrangements.
We are grateful to the patients and their families for their cooperation. We thank Amélie Dendooven for assistance with qPCR experiments and James Lupski for critically reading the manuscript. We would like to thank the Mapping Core and Map Finishing groups of the Wellcome Trust Sanger Institute for initial clone supply and verification.
Funding: This work was made possible by grants G.0200.03 from the FWO and GOA-grant 12051203 from Ghent University. Karen Buysse and An Crepel are Research Assistants of the Research Foundation - Flanders (FWO - Vlaanderen). Filip Pattyn is supported by the Vlaamse Liga tegen Kanker through a grant of the Stichting Emmanuel van der Schueren. This study is supported, in part, by the Fund for Scientific Research, Flanders, with a mandate fundamental clinical research to Geert R Mortier and Koen Devriendt. This text presents research results of the Belgian program of Interuniversity Poles of attraction initiated by the Belgian State, Prime Minister’s Office, Science Policy Programming (IUAP). Wilhelm Johannsen Centre for Functional Genome Research is established by the Danish National Research Foundation.
Competing interests: None declared.
Patient consent: Obtained.