Background Localisation of the breakpoints of chromosomal translocations has aided the discovery of several disease genes but has traditionally required laborious investigation of chromosomes by fluorescent in situ hybridisation approaches. Here, a strategy that utilises genome-wide paired-end massively parallel DNA sequencing to rapidly map translocation breakpoints is reported. This method was used to fine map a de novo t(5;6)(q21;q21) translocation in a child with bilateral, young-onset Wilms tumour.
Methods and results Genome-wide paired-end sequencing was performed for approximately 6 million randomly generated ∼3 kb fragments from constitutional DNA containing the translocation, and six fragments in which one end mapped to chromosome 5 and the other to chromosome 6 were identified. This mapped the translocation breakpoints to within 1.7 kb. Then, PCR assays that amplified across the rearrangement junction were designed to characterise the breakpoints at sequence-level resolution. The 6q21 breakpoint transects and truncates HACE1, an E3 ubiquitin-protein ligase that has been implicated as a somatically inactivated target in Wilms tumourigenesis. To evaluate the contribution of HACE1 to Wilms tumour predisposition, the gene was mutationally screened in 450 individuals with Wilms tumour. One child with unilateral Wilms tumour and a truncating HACE1 mutation was identified.
Conclusions These data indicate that constitutional disruption of HACE1 likely predisposes to Wilms tumour. However, HACE1 mutations are rare and therefore can only make a small contribution to Wilms tumour incidence. More broadly, this study demonstrates the utility of genome-wide paired-end sequencing in the delineation of apparently balanced chromosomal translocations, for which it is likely to become the method of choice.
- Solexa paired-end sequencing
- chromosomal translocation
- breakpoint mapping
- Wilms tumour
- clinical genetics
- molecular genetics
- paediatric oncology
Approximately 1 in 500 individuals carry karyotypically visible, balanced chromosomal rearrangements with no apparent loss or gain of genetic material.1 One class of balanced chromosomal rearrangement is the reciprocal translocation, whereby parts of two chromosomes switch places with one another. The majority of individuals with reciprocal translocations manifest no deleterious effects of their chromosomal abnormality. However, if one of the translocation breakpoints transects (or otherwise disrupts the function of) a dominantly acting disease predisposition gene, it can result in a clinical phenotype. Such translocations can greatly facilitate the identification of a disease gene by focussing positional cloning analyses to the breakpoint regions. This strategy has proved crucial to the identification of genes for several diseases, including Duchenne muscular dystrophy, polycystic kidney disease, neurofibromatosis type 1 and Sotos syndrome.2–5
We identified a boy, designated FACT0069, with a de novo balanced translocation t(5;6)(q21;q21) who developed bilateral Wilms tumour. Wilms tumour is the most common renal tumour of childhood, occurring with an incidence of 1 in 10 000 and a median age at diagnosis of between 3 and 4 years.6 Wilms tumours are thought to develop from abnormally persistent embryonal cells within nephrogenic rests.7 Histologically, Wilms tumour mirrors the development of the normal kidney and classically consists of three cell types: blastema, epithelia and stroma.8 Predisposition to Wilms tumour is associated with several genetic syndromes (reviewed by Scott et al9). While approximately 5% of non-syndromic Wilms tumour cases are due to mutations in WT1 or epigenetic defects at the 11p15 growth regulatory region that result in increased IGF2, the underlying cause of the majority of cases remains unknown.10 11
Several chromosomal abnormalities associated with increased risk of Wilms tumour are known, including defects at 11p13 and 11p15 that affect WT1 and IGF2, respectively, trisomy 18, trisomy 13 and 2q37 deletions.9 In addition, five, apparently balanced, constitutional chromosomal translocations have been reported in individuals with Wilms tumour (table 1).12–19 The young age of diagnosis and the bilateral nature of the Wilms tumour in the case we report here increase the likelihood that he has an underlying genetic predisposition to Wilms tumour. If so, a gene disrupted by the de novo translocation would be a strong candidate for causing his susceptibility to Wilms tumour. Given that FACT0069 has no other phenotypic abnormalities, such a gene would also be a credible candidate for causing non-syndromic, sporadic Wilms tumour cases. We therefore sought to define the breakpoints of the t(5;6)(q21;q21) translocation to investigate this further.
Traditional methods for mapping breakpoints of cytogenetic rearrangements—for example, using fluorescent in situ hybridisation or array painting—can still be challenging, laborious and provide limited resolution.20–22 As a result, many reciprocal translocations associated with abnormal phenotypes remain undefined at the molecular level. The development of second-generation sequencing technologies offers the potential to accurately delineate translocation breakpoints in an expedient and cost-effective fashion through DNA sequencing. Extraordinary progress has been made in the evolution of massively parallel approaches to generate vastly increased quantities of sequence at greater speed and reduced cost.23 Coupling these technologies to a paired-end strategy, whereby short reads are generated from both ends of longer DNA fragments, has been shown to be highly effective in the characterisation of somatically acquired rearrangements in cancers24 and germline structural variation across the human genome.25 We therefore employed genome-wide paired-end sequencing using the Illumina/Solexa platform to fine map and subsequently identify the translocation breakpoints in FACT0069.
FACT0069 clinical summary
FACT0069 was born at 39 weeks of gestation with normal weight and head circumference and no dysmorphic facial features. He had normal development throughout childhood. There is no significant family history of childhood cancer or other medical conditions. He was diagnosed as having bilateral, synchronous Wilms tumour at the age of 6 months and underwent pre-operative chemotherapy with subsequent bilateral partial nephrectomies. Tumour histology confirmed Wilms tumour of predominantly stromal type with small islands of epithelial cells and sparse, small foci of blastema with no lymphatic spread. He is now 16 years old and has remained well.
t(5;6)(q21;q21) breakpoint mapping
Using Illumina/Solexa paired-end sequencing, we generated 5.87 million paired reads, providing approximately 5.7-fold haploid physical coverage; that is, on average, each base in the genome was covered by 5.7 fragments from which paired-end sequences had been generated. Of these, the reads from six fragments mapped to chromosome 5 at one end and to chromosome 6 at the other end, suggesting that they cross the translocation breakpoint (table 2) (UCSC Build 36.2). Two read pairs were from the derivative chromosome 5 and four read pairs were from the derivative chromosome 6 (see table 2 and figure 1). The minimal intervals containing the breakpoints defined by these overlapping reads were both ∼1.7 kb and lay between nucleotides 99971677 and 99973404 on chromosome 5 and between nucleotides 105369978 and 105371680 on chromosome 6 (UCSC Build 36.2) (see figure 1).
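The coverage and interval figures quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the fragment count, the approximate 3 kb fragment size and the Build 36.2 coordinates are taken from the text, whereas the haploid genome size of 3.1 Gb is an assumed round figure and the function names are hypothetical.

```python
# Worked check of the coverage and interval figures quoted in the text.
N_FRAGMENTS = 5.87e6        # paired reads with both ends mapped (from the text)
INSERT_SIZE_BP = 3_000      # approximate fragment size (from the text)
GENOME_SIZE_BP = 3.1e9      # ASSUMPTION: approximate haploid human genome size


def physical_coverage(n_fragments, insert_size_bp, genome_size_bp):
    """Mean number of fragments spanning any given reference base."""
    return n_fragments * insert_size_bp / genome_size_bp


def interval_width(interval):
    """Width in base pairs of a (start, end) coordinate pair."""
    start, end = interval
    return end - start


coverage = physical_coverage(N_FRAGMENTS, INSERT_SIZE_BP, GENOME_SIZE_BP)

# Minimal breakpoint intervals (Build 36.2 coordinates from the text).
chr5_interval = (99_971_677, 99_973_404)
chr6_interval = (105_369_978, 105_371_680)

print(f"physical coverage ~ {coverage:.1f}-fold")
print(f"chromosome 5 interval: {interval_width(chr5_interval)} bp")
print(f"chromosome 6 interval: {interval_width(chr6_interval)} bp")
```

Under these assumptions the calculation gives approximately 5.7-fold physical coverage and interval widths of 1727 bp and 1702 bp, in keeping with the ∼1.7 kb stated for both breakpoints.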
We used long-range PCR to amplify junctional fragments that cross the translocation and sequenced the resulting products to delineate the breakpoints at the nucleotide level. The chromosomal breakpoint is within a region of overlapping microhomology on the two chromosomes consisting of a tract of four adenosines (figure 2). There is no gain or loss of chromosomal material on the derivative chromosome 5, and the translocation bisects an intergenic region 21.8 kb telomeric of TMEM157 and 198.6 kb centromeric of ST8SIA4 (UCSC Build 36.2). The derivative chromosome 6 breakpoint is associated with a more complex rearrangement. It is not possible to definitively characterise the events that have generated the rearrangement. However, the simplest explanation is that the four adenosines have been deleted, together with erosion of a further seven nucleotides of chromosome 5 and ten nucleotides of chromosome 6. In their stead, there is a 22-base insertion that, taken in its entirety, does not align anywhere in the human genome (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Interestingly, this inserted sequence contains elements of the sequence in the vicinity of the breakpoint. There is a seven-base-pair sequence derived from chromosome 5 and an eleven-base-pair sequence derived from chromosome 6 represented within the inserted sequence. Furthermore, these two segments of recognisable sequence overlap by two base pairs (AG) within the inserted segment (see figure 2).
The chromosome 6 breakpoint transects intron 6 of HACE1 (homologous to E6-associating protein carboxy terminus domain and ankyrin repeat-containing E3 ubiquitin-protein ligase 1) (GenBank: NM_020771.3; http://www.ncbi.nlm.nih.gov). HACE1 is a 24-exon gene that encodes a 909-codon protein. It has previously been implicated as a somatic target in Wilms tumourigenesis, adding further weight to the proposal that the translocation in FACT0069 is directly involved in causing his Wilms tumour.26
HACE1 analysis in Wilms tumour cases
To further evaluate the role of HACE1 in Wilms tumour predisposition, we successfully sequenced the coding exons and intron–exon boundaries of the gene in 421 individuals with non-syndromic Wilms tumour and undertook multiplex ligation-dependent probe amplification (MLPA) to detect exonic deletions and duplications in 94 cases. We identified one nonsense mutation, four missense variants and six synonymous variants by sequencing, and no exonic deletions or duplications by MLPA (table 3). The nonsense mutation and three of the missense variants were novel and each was present in a single individual. The fourth missense variant (D399G) is common and registered on dbSNP (http://www.ncbi.nlm.nih.gov/SNP/), as are five of the six synonymous substitutions. The non-truncating variants were not predicted to be deleterious, or to induce aberrant splicing, by in silico prediction programmes. Their status is currently unclear but they are likely to be non-pathogenic variants.
The nonsense mutation, 1092 G>A, W364X is in exon 12 and is predicted to lead to premature truncation of the protein. The individual carrying the mutation, FACT1027, was diagnosed as having unilateral Wilms tumour at the age of 10.6 years. She was treated with pre-operative chemotherapy and nephrectomy. Histology confirmed Wilms tumour; approximately two thirds of the tumour was necrotic and the remainder was largely stromal with well-differentiated elements, including fat and skeletal muscle cells. Rare foci of blastema and perilobar nephrogenic rests were noted. There is no family history of childhood cancer and no other phenotypic abnormality in the proband. The truncation was maternally inherited; the mother has not had Wilms tumour.
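The relationship between the nucleotide change and the protein change can be made explicit. The sketch below assumes that coding positions are numbered from the A of the ATG start codon (position 1), which is the usual convention but is not stated in the text; the helper name is hypothetical. Tryptophan is encoded only by TGG, so the reference codon follows from the reported amino acid.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_location(coding_pos):
    """Return (codon number, position within codon), both 1-based.

    ASSUMPTION: coding_pos counts from the A of the ATG start codon as 1.
    """
    codon_number = (coding_pos - 1) // 3 + 1
    pos_in_codon = (coding_pos - 1) % 3 + 1
    return codon_number, pos_in_codon


codon_number, pos_in_codon = codon_location(1092)

reference_codon = "TGG"  # the only codon for tryptophan (W)
mutant_codon = (
    reference_codon[: pos_in_codon - 1] + "A" + reference_codon[pos_in_codon:]
)

print(codon_number, pos_in_codon)           # codon 364, third base
print(reference_codon, "->", mutant_codon)  # TGG -> TGA
print(mutant_codon in STOP_CODONS)          # True: premature stop codon
```

Position 1092 is the third base of codon 364, and a G>A change at that base converts TGG to the stop codon TGA, which matches the W364X designation.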
Using genome-wide paired-end sequencing, we have established that the translocation t(5;6)(q21;q21), present in a child with bilateral, young-onset Wilms tumour, interrupts the HACE1 gene on chromosome 6q21. This region has been implicated in Wilms tumour previously as two somatic translocations involving 6q21, t(2;6)(q35;q21) and t(6;15)(q21;q21) have been reported in two separate Wilms tumours.27 28 The 6q21 breakpoint of the latter translocation was found to be 50 kb upstream of HACE1, and HACE1 expression was markedly reduced in the tumour tissue compared to normal kidney.26 Furthermore, a Wilms tumour with a somatic heterozygous interstitial 6q21 deletion, in which 7.3 Mb including HACE1 is deleted, has recently been published.29
HACE1 encodes a 103 kDa protein with six N-terminal ankyrin protein–protein interaction repeats with sequence similarity to those of INKA and a carboxy terminus homologous to E6-associating protein carboxy terminus ubiquitin-protein ligase domain.26 The functions of HACE1 have not been fully elucidated but it appears to play a role in the regulation of cell cycle progression during cellular stress by influencing cyclin D1 degradation.30 HACE1 is ubiquitously expressed in normal tissues, including the kidneys, but has reduced expression in Wilms tumours and multiple other cancers.26 30 This reduced expression in tumours appears to be primarily related to hypermethylation of a CpG island upstream of HACE1, although relatively little mutational analysis of HACE1 in cancers has been performed to date.30
In addition to the constitutional translocation inactivating HACE1 that we delineated, we identified an individual with non-syndromic Wilms tumour and a truncating HACE1 mutation. The mutation was inherited from the child's mother, who has not had Wilms tumour. This could simply reflect reduced penetrance, which is common in inherited cancer syndromes, or it could indicate that the mutation is not pathogenic and is coincidental to the Wilms tumour in this child.
Taken together, the combined constitutional and somatic data strongly suggest that abrogation of HACE1 activity can promote Wilms tumourigenesis and that constitutional disruption of the gene can predispose to Wilms tumour. However, our extensive analysis of over 400 Wilms tumour cases demonstrates that HACE1 mutations are rare and therefore can only make a small contribution to the disease incidence overall.
Traditionally, cloning translocation breakpoints has been challenging and time consuming. This is exemplified by FACT0069, which was initially reported in 1997.18 Efforts to map the breakpoints by fluorescent in situ hybridisation resulted in localisation to within a few million bases by 2003; however, the exact breakpoints were not identified.17 By contrast, using massively parallel sequencing technology, we were able to map the breakpoints to within 2000 bases in a few weeks. To our knowledge, second-generation sequencing methodology has been reported in the investigation of constitutional chromosomal translocations only once previously, by Chen and colleagues. They used a different strategy, undertaking single-end sequencing of one, flow-sorted, derivative chromosome.31 We believe the method we present here is more generally applicable as isolating rearranged chromosomes is not always possible and requires specialised expertise in flow cytometry and/or microdissection. In addition, utilising a paired-end strategy in a sample with a known translocation permits more straightforward analysis as the nature of the discordant read pairs is already known. Furthermore, the sequence data from the rest of the genome may be helpful in some instances, as apparently balanced translocations can be more complex than visible on standard karyotyping and can involve gain, loss or translocation of genetic material some distance from the breakpoints, either in cis or from other chromosomes.20 21 The genome-wide paired-end method described here can provide data to elucidate such events.
In this study, we selected 3 kb inserts and undertook sequencing to give approximately sixfold physical coverage. This was sufficient to map the translocation to a region easily tractable by long-range PCR. However, modification of the insert size and/or sequence coverage can readily be undertaken to increase resolution, if desired. As costs for new sequencing technologies continue to fall, throughput continues to rise and read lengths continue to extend, genome-wide paired-end sequencing will likely become the investigation of choice for delineating chromosomal translocations and other chromosomal structural rearrangements.
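To give a rough sense of how coverage affects the chance of capturing a junction, the sketch below uses a simple Poisson model. This model is an assumption introduced for illustration, not part of the analysis reported here: it supposes that fragments are sampled uniformly, that each derivative junction is present in one of the two chromosome copies, and it ignores unmappable ends and PCR duplicates. The function names are hypothetical.

```python
import math


def expected_spanning_fragments(haploid_coverage, copies_with_junction=1, ploidy=2):
    """Expected number of fragments spanning a junction carried by some copies.

    ASSUMPTION: physical coverage is shared equally between chromosome copies.
    """
    return haploid_coverage * copies_with_junction / ploidy


def prob_at_least(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    p_fewer = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - p_fewer


# Coverage reported in this study (approximately 5.7-fold).
lam = expected_spanning_fragments(5.7)
p_detect = prob_at_least(1, lam)

# Hypothetical scenario: doubling the physical coverage.
lam_doubled = expected_spanning_fragments(11.4)
p_detect_doubled = prob_at_least(1, lam_doubled)

print(f"expected spanning fragments per junction: {lam:.2f}")
print(f"P(at least one spanning fragment): {p_detect:.3f}")
print(f"with doubled coverage: {p_detect_doubled:.3f}")
```

Under this model about 2.85 spanning fragments are expected per derivative junction, which is broadly in line with the two and four fragments observed for the derivative chromosomes 5 and 6. Because physical coverage is the product of fragment number and insert size, either quantity can be increased to raise the expected number of spanning fragments.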
Subjects and methods
We obtained peripheral blood samples from FACT0069, who has a de novo constitutional translocation t(5;6)(q21;q21), with informed consent. This child has been previously reported.17 18 We utilised DNA from 450 individuals affected with Wilms tumour who were recruited from Paediatric Oncology centres in the UK through two studies, the Investigation of Wilms Tumour Genes study and the Factors Associated with Childhood Tumours (FACT) study. This included samples from 25 cases with a family history of Wilms tumour and 34 individuals with bilateral Wilms tumour. All cases have been analysed for WT1 mutations and 11p15 defects. The methods for these analyses have been previously described.32 Only samples in which WT1 and 11p15 defects had been excluded were included in the current study. The research was approved by the London Multicentre Research Ethics Committee (05/MRE02/17).
Genome-wide paired-end sequencing
We randomly sheared 10 μg of total genomic DNA using the nebuliser supplied with the Genome Analyzer instrument according to the manufacturer's instructions. We selected fragments of ∼3 kb by gel electrophoresis and tagged the ends of each by incorporation of biotinylated nucleotide before circularising and randomly fragmenting the DNA. We recovered the biotinylated junction fragments and ligated Illumina paired-end adaptor oligonucleotides to both ends before amplifying the ligated products using two oligonucleotide primers. We prepared the Genome Analyzer paired-end flow cell on the supplied cluster station according to the manufacturer's protocol. We denatured the amplified DNA fragments and annealed them to complementary oligonucleotides on the flow cell surface. We performed three lanes of paired-end sequencing, generating high-density, single-molecule arrays of genomic DNA fragments. We then sequenced the clusters of PCR colonies using the Genome Analyzer (GA1) platform and processed the images using the manufacturer's software. We used the alignment algorithm MAQ (Mapping and Assembly with Qualities) to align the sequence reads to the human genome.33 We generated a total of 5.87 million paired reads where both ends mapped to Build 36.2 of the human reference genome (http://genome.ucsc.edu/cgi-bin/hgBlat, Build 36.2). Of the total number of reads, 30.7 million were excluded as they were identified as PCR duplicates, 6.6 million were excluded as only one end mapped and 7.4 million were excluded as neither end mapped. We subsequently identified all reads that included one end that mapped to chromosome 5 and the other end to chromosome 6 (see table 2). We also applied a circular binary segmentation algorithm to detect copy number changes, as previously described.24 All of the copy number variants identified in FACT0069 have also been seen in constitutional DNA from multiple other individuals and therefore represent non-pathogenic CNVs.
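The selection of read pairs with one end on chromosome 5 and the other on chromosome 6 can be sketched as a simple filter over aligned pairs. The example below is a minimal illustration and not the pipeline used in this study: the `ReadPair` record, the function name and all of the read names and coordinates are hypothetical toy values, and real data would be parsed from the aligner output.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    """Hypothetical record for one aligned read pair."""
    name: str
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def spans_translocation(pair, chrom_a="5", chrom_b="6"):
    """True if one end maps to chrom_a and the other end to chrom_b."""
    return {pair.chrom1, pair.chrom2} == {chrom_a, chrom_b}


# Toy data: coordinates are invented for illustration only.
pairs = [
    ReadPair("pair_a", "5", 99_970_500, "5", 99_973_450),     # both ends on 5
    ReadPair("pair_b", "5", 99_971_000, "6", 105_372_000),    # 5 and 6
    ReadPair("pair_c", "6", 105_369_500, "5", 99_974_000),    # 6 and 5
    ReadPair("pair_d", "6", 105_300_000, "6", 105_303_100),   # both ends on 6
    ReadPair("pair_e", "5", 1_000_000, "12", 2_000_000),      # other chromosome
]

candidates = [p.name for p in pairs if spans_translocation(p)]
print(candidates)  # ['pair_b', 'pair_c']
```

Comparing the two chromosome names as a set makes the test independent of which end of the pair was listed first.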
We designed primers (http://primer3.sourceforge.net/) for both derivative chromosomes to encompass the region in which the breakpoints were predicted to occur from the Solexa sequencing results. We used different combinations of primer sets in long-range PCR experiments using BIO-X-ACT long DNA polymerase (Bioline, London, UK) with 2.5 mM MgCl2 and a touch-down 60–50°C programme. We assessed products from each primer set using gel electrophoresis for both FACT0069 and a control and identified the smallest FACT0069-specific PCR product. We sequenced this fragment by capillary sequencing using the BigDye Terminator Cycle Sequencing Kit and a 3730 automated sequencer (ABI Perkin Elmer). For the derivative 5 breakpoint, we used primer pair 5F2 (5′-GCACAAAAATGTAAAACAACTGC-3′) and 6R1 (5′-GGCCCACGTAGTCCTTCTCT-3′); for the derivative 6 breakpoint, we used primer pair 5R5 (5′-TCCATGCTGCCTTTGATACA-3′) and 6F8 (5′-AGCCCCCTAAGCTCTACTCC-3′). We compared the sequence to the reference sequence (UCSC Build 36.2) using Mutation Surveyor software v3.20 to define the breakpoints at base-pair resolution.
HACE1 mutation and copy number analysis
HACE1 cDNA Genbank accession number NM_020771.3; HACE1 protein Genbank accession number NP_065822 (http://www.ncbi.nlm.nih.gov).
HACE1 contains 24 exons. We designed PCR primers to amplify all exons and intron–exon boundaries (supplementary table 1). We used Mutation Surveyor to analyse the exonic sequence, the intron–exon boundaries and 10 bases of flanking intron for all 24 exons. We confirmed identified variants in native genomic DNA. We only included the 421 samples in which 85% or more of the gene was successfully sequenced in subsequent analyses. We evaluated the likely pathogenicity of variants using Polyphen (http://genetics.bwh.harvard.edu/pph/), SIFT (http://blocks.fhcrc.org/sift/SIFT.html) and NNSplice software (http://www.fruitfly.org/seq_tools/splice.html).
Copy number analysis
We used MLPA to identify large deletions or duplications of HACE1. We designed 11 synthetic probe pairs to evenly cover the genomic footprint of HACE1, targeting exons 3, 5, 6, 7, 10, 12, 16, 19, 20, 21 and 24 (supplementary table 2). We added the synthetic probe oligonucleotides to the SALSA MLPA buffer/MLPA control probe mix P200 (MRC-Holland, Amsterdam, The Netherlands) and performed the MLPA reactions using 150 ng genomic DNA, as previously described.34 We ran the resulting MLPA PCR products on an ABI 3130 automated sequencer and analysed the results using Genemarker software v1.51.
We thank the individuals and families involved in the research, and the physicians, nurses and pathologists who referred families and provided samples. The individuals that collected samples are listed in the Appendix. We thank Nikki Huxter, Margaret Warren-Perry, Darshna Dudakia, Polly Gibbs and Jessie Bull for coordination of recruitment of cases. We thank Katrina Spanova and Bernadette Ebbs for running the ABI sequencers. We thank Ann Strydom for assisting with the preparation of the manuscript. The research was carried out as part of the investigation of the Factors Associated with Childhood Tumours (FACT) study, which is a UK Children's Cancer and Leukaemia Group study. The Childhood Cancer Research Group receives funding from the Department of Health and the Scottish Ministers. The views expressed in this publication are those of the authors and not necessarily those of the Department of Health and the Scottish Ministers. IS is supported by the Michael and Betty Kadoorie Cancer Genetics Research Programme. We acknowledge National Health Service funding to the National Institute for Health Research Biomedical Research Centre. This work was supported by Cancer Research UK (grants C8620_A9024 and C8620_A8857) and the Institute of Cancer Research (UK).
The following members of the FACT collaborations provided familial Wilms samples that were utilised in this study:
L Arbour, T Cole, E Sheridan, H Price, V Schumacher, A Weirich, B Royer-Pokora, J Kingston, A O'Meara, A Foot, B Pizer, C Dhooge, M Gerrard, W Dupuis, G Levitt, A Chompret, C Bonaitie-Pellie, P Tonin, J Skeen, J Kohler and A Gnekow.
The sporadic Wilms tumour case series were recruited from the following centres. The lead co-ordinators for each centre are listed but we gratefully acknowledge all the clinical professionals involved in case recruitment at each centre:
Aberdeen (M Connon), Birmingham (J Cooper and B Morland), Bristol (S Peters, R Elson and M Stevens), Cambridge (J Tunnacliffe and A Burke), Cardiff (J Powell and H Traunecker), Dublin (C Rooney, A O'Meara, M Capra and J Pears), Glasgow (W Taylor and E Simpson), Great Ormond Street Hospital (K Howe and G Levitt), Leeds (U Reid and A Glaser), Liverpool (S Hemsworth and H McDowell), Manchester (L Auld, C Beane and J Birch), The Royal Marsden Hospital (R Browning and K Pritchard-Jones), Newcastle (L Price and J Hale), Nottingham (J Evans, L Whiles and D Walker), Oxford (J Coaker, K Ashton and C Mitchell), Southampton (J Grout and M Radford) and Sheffield (M Gerrard).
Funding Michael and Betty Kadoorie Cancer Genetics Research Programme; The Kadoorie Charitable Foundation, Room 102, St George's Building, 2 Ice House Street, Hong Kong, China. Cancer Research UK, PO Box 123, London WC2A 3PX, UK.
Competing interests None.
Ethics approval This study was conducted with the approval of the FACT Study MREC Approval no:05/MRE02/17.
Provenance and peer review Not commissioned; externally peer reviewed.