Abstract
Background In 30–50% of patients with colorectal adenomatous polyposis, no germline mutation in the known genes APC, causing familial adenomatous polyposis, MUTYH, causing MUTYH-associated polyposis, or POLE or POLD1, causing polymerase-proofreading-associated polyposis can be identified, although a hereditary aetiology is likely. This study aimed to explore the impact of APC mutational mosaicism in unexplained polyposis.
Methods To comprehensively screen for somatic low-level APC mosaicism, high-coverage next-generation sequencing of the APC gene was performed using DNA from leucocytes and a total of 53 colorectal tumours from 20 unrelated patients with unexplained sporadic adenomatous polyposis. APC mosaicism was assumed if the same loss-of-function APC mutation was present in ≥2 anatomically separated colorectal adenomas/carcinomas per patient. All mutations were validated using diverse methods.
Results In 25% (5/20) of patients, somatic mosaicism of a pathogenic APC mutation was identified as the underlying cause of the disease. In 2/5 cases, the mosaic level in leucocyte DNA was slightly below the sensitivity threshold of Sanger sequencing, while in 3/5 cases, the allelic fraction was either very low (0.1–1%) or no mutations were detectable. The majority of mosaic mutations were located outside the somatic mutation cluster region of the gene.
Conclusions The present data indicate a high prevalence of pathogenic mosaic APC mutations below the detection thresholds of routine diagnostics in adenomatous polyposis, even if high-coverage sequencing of leucocyte DNA alone is taken into account. This has important implications for both routine work-up and strategies to identify new causative genes in this patient group.
- Cancer: colon
Introduction
Adenomatous polyposis syndromes of the colorectum are precancerous conditions characterised by the presence of dozens to thousands of adenomatous polyps, which, unless detected early and resected, invariably result in colorectal cancer. The phenotypic spectrum ranges from an early-onset manifestation with high numbers of adenomas and a positive family history to isolated late-onset disease with low polyp burden.
To date, three inherited monogenic forms can be delineated by molecular genetic analyses: (1) autosomal-dominant familial adenomatous polyposis (FAP), caused by heterozygous germline mutations (http://www.lovd.nl/APC; http://www.umd.be/APC; http://www.hgmd.cf.ac.uk) in the tumour suppressor gene APC;1 (2) autosomal-recessive MUTYH-associated polyposis, caused by biallelic germline mutations (http://www.lovd.nl/MUTYH; http://www.umd.be/MUTYH; http://www.hgmd.cf.ac.uk) of the base excision repair gene MUTYH;2 and (3) autosomal-dominant polymerase-proofreading-associated polyposis (PPAP), caused by specific germline missense mutations in the polymerase genes POLE and POLD1.3 Mutation detection rates are strongly dependent on the colorectal phenotype. In classical FAP, a pathogenic APC germline mutation is identified in up to 90% of index patients,4 ,5 whereas APC or MUTYH germline mutations are detected in only 20–30% of index cases with a mild disease course (attenuated FAP).6–8 A PPAP has been reported in up to 7% of families with unexplained multiple colorectal adenomas and carcinomas.9
In up to 50% of polyposis patients, no underlying germline mutation is identified, although a hereditary basis is likely. A small fraction of cases might be explained by deep intronic APC mutations,10 rare missense mutations of the APC gene11 or mutations in other known cancer predisposition genes.12–15 A recent genome-wide analysis of germline copy number variants in 221 patients with unexplained adenomatous polyposis identified a group of genes that are likely to predispose to colorectal adenoma formation.16
It is well known that the impact of low-level mosaicism in hereditary (tumour) syndromes with a high de novo mutation rate such as APC-related FAP is underestimated.17–21 Several casuistic reports of somatic and gonadal APC mosaicism have been published,22–25 and one patient was identified by next-generation sequencing (NGS).26 However, only two studies have addressed the issue of mosaicism in a comprehensive manner.27 ,28 These studies detected low-level mutational APC mosaicism in leucocyte DNA in 10–20% of a selected group of unrelated patients with suspected or confirmed APC de novo mutation. Prescreening methods such as denaturing high-performance liquid chromatography (DHPLC) or protein truncation test (PTT) were shown to be more sensitive in terms of uncovering mosaicism than Sanger sequencing. This suggests that a number of mosaic cases are likely to be overlooked during current routine diagnostics.22 ,27
To evaluate the impact of low-level APC mutational mosaicism below the detection threshold of Sanger sequencing (10–15% mutated alleles), including mosaicism that is not visible at all in tissues originating from the mesodermal germ layer such as leucocytes, we chose a comprehensive approach that took advantage of the more sensitive NGS method29 ,30 and performed systematic high-coverage sequencing of the APC gene in DNA from leucocytes and multiple colorectal tumours (adenomas or carcinomas) in each of 20 unrelated patients with unexplained sporadic adenomatous polyposis.
Materials and methods
Patients/data collection
All patients in the present study had unexplained colorectal adenomatous polyposis, that is, no germline mutation in the APC or MUTYH genes was identified by Sanger sequencing of the coding regions, deletion/duplication analysis using multiplex ligation-dependent probe amplification (MLPA), and screening for pathogenic deep intronic APC mutations.10 ,31 ,32 Furthermore, neither of the two hotspot mutations in POLE and POLD1 was present.9
The inclusion criteria for all patients enrolled in this study were the presence of at least 20 synchronous or 40 metachronous, histologically confirmed colorectal adenomas. All patients were of central European origin according to family name and self-reporting.
For high-throughput sequencing of leucocyte-derived and adenoma-derived DNA, 20 unrelated patients with an inconspicuous family history were chosen for whom at least two colorectal tumour samples (adenomas or carcinomas) were available, which had to be separated anatomically by a distance of at least 10 cm. For screening of APC mosaicism in leucocyte-derived NGS data, 80 additional unselected index patients with sporadic disease were used (same inclusion criteria). The included patients were recruited in the period from 1992 to 2011.
DNA extraction
Genomic DNA was extracted from peripheral EDTA-anticoagulated blood samples using the standard salting-out procedure. DNA of colorectal formalin-fixed and paraffin-embedded (FFPE) tumour tissue was extracted from punches after careful histological re-evaluation of representative sections, as described elsewhere.33 Alternatively, 10-µm-thick sections were cut from FFPE tissue blocks. After deparaffinisation, tumour tissue was macrodissected from unstained slides. A previously marked H&E-stained slide served as a reference. DNA extraction from FFPE tissue was carried out using the BioRobot M48 Robotic Workstation and the corresponding MagAttract DNA Mini M48 Kit (Qiagen, Hilden, Germany), in accordance with the manufacturer's protocol. The tissues were lysed overnight with proteinase K, and the DNA was eluted in 150 µL Tris-HCl (pH 7.6).
High-throughput sequencing and bioinformatics workflow
In 9 of the 20 polyp-screened patients, exome sequencing of leucocyte and adenoma DNA was performed at the Max Planck Institute for Molecular Genetics, Berlin, Germany. Library preparation and whole-exome target enrichment were performed according to Agilent's SureSelect protocol (Human All Exon 50Mb v2, 2011) and as previously described.9 ,34 ,35 Multiplexed paired-end sequencing was performed on the Illumina HiSeq 2000 platform in accordance with the manufacturer's protocol. Base calling and demultiplexing were performed using Illumina's CASAVA pipeline v1.7. Raw reads were mapped to GRCh37/hg19 using BWA v0.5.836 and default parameters. Enrichment statistics were calculated using Agilent's SureSelect target regions. Local realignment, quality value recalibration and variant calling were performed using GATK v2.1–8.37 In-house tools and ANNOVAR38 were used to annotate and filter the variants. Picard (http://broadinstitute.github.io/picard/) was used to collect metrics. For leucocyte exome sequencing, the mean on-target coverage of mapped reads was 71×, and 85% of bases were covered at ≥10× (see online supplementary table S1). On average, 22 148 genetic variants per patient were identified in the coding regions or the canonical splice sites. In the remaining 11 patients, sequencing of colorectal tumour DNA samples was initially performed using the TruSight Cancer Panel (Illumina).
Whole-exome sequencing of the leucocyte screening cohort (n=80) was performed at the Yale Center for Genome Analysis, New Haven, USA, via capture using NimbleGen 2.1M human exome array followed by paired-end sequencing on a HiSeq 2000 instrument, as previously described.9 ,34 Targeted bases were covered by a mean of 67 independent reads, with an average of 94% of all bases covered eight or more times (see online supplementary table S1). Reads were aligned to the hg19 human reference genome using ELAND (Illumina). Single-nucleotide variants along with short insertions and deletions were identified using SAMtools software.
For validation of suspected APC mosaic mutations, targeted sequencing using the FAP MASTR Kit (Multiplicom) with high coverage (read depths >1000) for leucocyte DNA was performed on a MiSeq platform (Illumina). In addition, the identified variants were validated by Sanger sequencing of the corresponding region using standard protocols (primer sequences available upon request).
NGS data were filtered for truncating APC variants (nonsense mutations, frameshift deletions/insertions and mutations at highly conserved splice sites) present in at least 5% of reads using Cartagenia Bench Lab NGS v4.0 (Leuven, Belgium) or the SeqPilot software (JSI Medical Systems). Detailed visual inspection of variants was done in a read browser (Integrative Genomics Viewer).
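The filtering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Cartagenia or SeqPilot workflow used in the study: the `Variant` record, the effect labels and the function name are hypothetical, and the example read counts are taken from figures quoted in the Results (6/34 and 22/17 460), plus one invented missense variant to show the effect filter.

```python
from dataclasses import dataclass

# Effect classes treated as truncating in the text; the label strings are illustrative.
TRUNCATING_EFFECTS = {"nonsense", "frameshift", "canonical_splice"}


@dataclass
class Variant:
    """Hypothetical minimal variant record (not a real tool's data model)."""
    gene: str
    hgvs_c: str
    effect: str
    alt_reads: int
    depth: int

    @property
    def allelic_fraction(self) -> float:
        # Fraction of reads carrying the variant; 0.0 if the site is uncovered.
        return self.alt_reads / self.depth if self.depth else 0.0


def candidate_mosaic_variants(variants, min_fraction=0.05):
    """Keep truncating APC variants present in at least `min_fraction` of reads."""
    return [
        v for v in variants
        if v.gene == "APC"
        and v.effect in TRUNCATING_EFFECTS
        and v.allelic_fraction >= min_fraction
    ]


calls = [
    Variant("APC", "c.4348C>T", "nonsense", 6, 34),       # 18% of reads, kept
    Variant("APC", "c.1495C>T", "nonsense", 22, 17460),   # about 0.1%, below the 5% filter
    Variant("APC", "hypothetical_missense", "missense", 50, 100),  # invented, not truncating
]
kept = candidate_mosaic_variants(calls)
print([v.hgvs_c for v in kept])
```

Variants that pass such a filter would still need visual inspection and independent validation, as described in the text.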
Results
The majority of the patients presented with an attenuated colorectal phenotype (late-onset disease and/or <100 colorectal adenomas). The mean age at diagnosis was 45 years (range 14–73 years). None of the patients had a conspicuous family history (sporadic or isolated cases). The family history was classified as inconspicuous if there was no evidence of polyposis or any other early-onset FAP-related tumour in the first-degree or second-degree relatives of the proband. The basic clinical features are summarised in online supplementary table S2.
To screen for APC mosaicism, a systematic search was conducted for the presence of loss-of-function APC mutations in multiple colorectal tumours from individual patients. For this approach, 53 colorectal tumours (51 adenomas, 2 carcinomas) from 20 unrelated patients were analysed. Mutational APC mosaicism was assumed if the same pathogenic loss-of-function mutation was present and validated in more than one tumour (workflow shown in figure 1).
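The decision rule (the same loss-of-function mutation in more than one tumour of a patient) can be written as a short sketch. The function name and the tumour labels are hypothetical; the example mirrors patient F5018 as reported in the Results, and the assignment of the second hit to one particular adenoma is arbitrary.

```python
from collections import Counter


def shared_lof_mutations(tumour_calls, min_tumours=2):
    """Return loss-of-function mutations seen in at least `min_tumours` tumours.

    `tumour_calls` maps a tumour label to the mutations called in that tumour.
    Each mutation is counted once per tumour.
    """
    counts = Counter(
        mutation
        for calls in tumour_calls.values()
        for mutation in set(calls)
    )
    return {m: n for m, n in counts.items() if n >= min_tumours}


# Patient F5018: both adenomas carry c.1660C>T; one also carries c.4348C>T.
f5018 = {
    "adenoma_1": ["c.1660C>T", "c.4348C>T"],
    "adenoma_2": ["c.1660C>T"],
}
print(shared_lof_mutations(f5018))
```

A mutation seen in only one tumour, such as the presumed second hit here, is not returned, which matches its interpretation as a tumour-specific somatic event rather than mosaicism.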
In 5 of the 20 polyp-screened patients, APC mosaicism could be detected (table 1). In patient F5018, the same nonsense APC mutation (c.1660C>T;p.Arg554*) was recognised in leucocyte DNA (11% of the reads=allelic fraction; read depth 286×) and two adenomas (allelic fraction 25% (4/16) and 60% (3/5), respectively) (figure 2A and table 1). To confirm the degree of mosaicism in leucocyte-derived DNA, targeted high-throughput sequencing of the APC gene with high coverage was performed using a commercial assay (Multiplicom). Again, a level of around 10% mutated reads was found (figure 2B). Sanger sequencing revealed only a very discrete peak (figure 2C). In one of the adenomas, a further APC nonsense mutation (c.4348C>T;p.Arg1450*) was detected in 33% (4/12) of reads, which indicates a possible somatic mutation of the wildtype allele (“second hit”) (see online supplementary figure S1).
In patient F1243, the same nonsense APC mutation (c.3283C>T;p.Gln1095*) was found in both adenomas and an allelic fraction of 1% in leucocyte DNA (table 1 and online supplementary figure S2). In patient F1676, an identical frameshift APC mutation (c.2840_2841delGT;p.Cys947Phefs*15) was identified in both adenomas and with an allelic fraction of 9% in leucocyte DNA. Sanger sequencing revealed only very discrete peaks representing the frameshift (table 1 and online supplementary figure S3). In one of the adenomas, a potential second hit (c.4348C>T;p.Arg1450*) was detected in 18% (6/34) of reads. In patient F5000, an identical frameshift APC mutation (c.4127_4128delAT;p.Tyr1376Cysfs*9) was found in all four polyps but not in leucocyte DNA (table 1 and figure 3). In all patients, the suspected mosaic mutations could be revealed in additional polyp samples by Sanger sequencing (figure 2D and online supplementary table S3, figure S2C and 3D). Using targeted high-throughput sequencing of the APC gene (see above), the levels of mutated reads were confirmed in leucocyte-derived DNA with high coverage in patients F1243 and F1676 (table 1 and online supplementary figure S2B and 3C). In patient F5000, the APC mutation, found in seven adenomas, was not detectable in leucocyte-derived DNA in spite of a coverage of 2650×.
Furthermore, a nonsense APC mutation (c.1495C>T;p.Arg499*), detected in one adenoma by targeted NGS, could be confirmed in two of three additional polyps by Sanger sequencing (patient F1543, table 1 and online supplementary figure S4). In leucocyte DNA, the mutation was only detectable at a very low-level with high-coverage NGS (allelic fraction 0.1%, 22/17 460 reads, table 1).
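To make the leucocyte findings in the five polyp-screened mosaic patients easier to compare, the sketch below bins the reported allelic fractions against two cut-offs mentioned in the text. The category names and the function are hypothetical, and using 15% (the upper end of the quoted 10–15% Sanger range) as the Sanger cut-off is an assumption made for illustration.

```python
from collections import Counter

SANGER_THRESHOLD = 0.15      # assumption: upper end of the quoted 10-15% range
NGS_SCREEN_THRESHOLD = 0.05  # read fraction used for NGS filtering in the Methods


def classify_leucocyte_fraction(fraction):
    """Bin a leucocyte allelic fraction into an illustrative detection category."""
    if fraction >= SANGER_THRESHOLD:
        return "above_sanger"
    if fraction >= NGS_SCREEN_THRESHOLD:
        return "ngs_screenable"
    if fraction > 0:
        return "very_low"
    return "not_detected"


# Leucocyte allelic fractions as reported in the Results.
leucocyte_fractions = {
    "F5018": 0.11,
    "F1676": 0.09,
    "F1243": 0.01,
    "F1543": 22 / 17460,  # about 0.1%
    "F5000": 0.0,         # not detectable at 2650x coverage
}
categories = {p: classify_leucocyte_fraction(f) for p, f in leucocyte_fractions.items()}
summary = Counter(categories.values())
print(categories)
print(summary)
```

Under these assumed cut-offs, two patients fall just below the Sanger range and three have very low or undetectable levels in leucocyte DNA.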
To explore the sensitivity of high-coverage leucocyte-derived NGS alone to uncover low-level APC mosaicism, we performed a systematic screen of exome sequencing data in 80 additional patients (figure 1). In 12 patients, APC mosaicism was suspected due to the detection of a truncating APC mutation in at least 5% of reads. From 10 of those patients, we were able to obtain at least two colorectal tumours for validation by Sanger sequencing (29 adenomas, 1 carcinoma). In two of these 10 patients, the mutation (detected in 8–9% of reads in leucocyte DNA) could be confirmed in 2/3 and 4/4 polyps, respectively (F1252 and F1727, table 1, online supplementary figures S5 and 6). At the leucocyte level, the mutations could be confirmed by targeted sequencing with high coverage.
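The numbers in this screening step can be restated as a simple funnel. The variable names are illustrative; the counts are those given in the paragraph above.

```python
# Counts from the leucocyte screening cohort as reported in the text.
screened = 80       # patients with leucocyte exome data
suspected = 12      # truncating APC variant in at least 5% of reads
with_tumours = 10   # suspected patients with at least two tumours available
confirmed = 2       # mutation confirmed in tumour DNA (F1252 and F1727)

confirmation_rate = confirmed / with_tumours  # share of testable suspects confirmed
cohort_yield = confirmed / screened           # confirmed mosaics in the whole cohort

print(f"confirmed among testable suspects: {confirmation_rate:.0%}")
print(f"yield in screening cohort: {cohort_yield:.1%}")
```

Because two suspected patients could not be tested in tumour tissue, the cohort yield computed this way is a lower bound for this approach.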
The positions of all seven mosaic mutations in relation to the described somatic mutations and the mutation cluster region (MCR) of the APC gene are shown in online supplementary figure S7.
All patients with APC mosaicism had an attenuated polyposis phenotype; clinical details are provided in table 2. At the time of the last contact (telephone interview), only one child had undergone a colonoscopy, with normal findings at 38 years of age (the ages of all 12 children of the mosaic index cases range between 6 and 40 years, median 26 years); no symptoms such as gastrointestinal bleeding were reported.
Discussion
In a number of patients with colorectal adenomatous polyposis, no germline mutation in the known causal genes can be identified. Based on previous data from our group and others, it can be hypothesised that a substantial fraction of mosaic cases are overlooked in routine diagnostics using Sanger sequencing of leucocyte DNA.22 ,27 To elucidate the frequency of undiscovered APC mosaicism in sporadic polyposis patients, we systematically screened multiple adenomas from a number of patients with unexplained polyposis using NGS.
In the previous studies performed by our group27 and Hes et al,28 APC mutational mosaicism was detected in 11% and 20% in a selected group of unrelated patients with suspected or confirmed de novo APC mutation, respectively. In both studies, a combination of sensitive methods, that is, PTT, DHPLC, denaturing gradient gel electrophoresis (DGGE) and Sanger sequencing was applied.
However, related to the number of APC and MUTYH mutation-negative index patients included in these study cohorts (450 in our previous study and 295 in the study of Hes et al), the prevalence of uncovered mosaicism is much lower. In our current, well-characterised overall cohort of 261 unrelated patients with unexplained adenomatous polyposis, approximately 80% of patients presented with apparent sporadic disease. When applying this percentage to the mutation-negative cohorts of the above-mentioned studies, the frequency of somatic APC mosaicism in sporadic unexplained adenomatous polyposis can be estimated to be around 2–4%. These numbers are comparable with our present finding in 80 unselected, sporadic polyposis patients: based on a systematic screening in NGS data of leucocyte-derived DNA and subsequent sequencing of the respective APC exons in adenoma DNA, low-level APC mosaicism was detected in two cases (2.5%).
In contrast, the multiple adenoma approach identified somatic mosaicism for a pathogenic APC mutation in 25% (5/20) of patients with sporadic disease in whom no APC and MUTYH mutations had been identified in routine diagnostics. According to this result, the prevalence of pathogenic somatic APC mosaicism is considerably higher than previous data might have suggested.
All mosaic mutations were validated by two different methods, excluding allele drop-out and technical artefacts. Each tumour-derived DNA sample originated from a single independent adenoma. At least two adenomas with the same APC mutation were separated anatomically by a distance of >10 cm; the majority even grew in different colonic segments (table 1). None of the mosaic patients reported a conspicuous family history regarding siblings, parents or second-degree ancestors, which is consistent with the assumption of a de novo event.
Interestingly, six of the seven mosaic mutations (five from polyp-screened and two from leucocyte-screened patients) were located outside the somatic MCR (codons 1286–1513) of the APC gene.39 In contrast, the two presumed second hits, both of which were identified in a single adenoma (F1676 and F5018), affect the same codon (c.4348C>T;p.Arg1450*), which represents a mutational hotspot within the MCR (see online supplementary figure S7).
In 2/5 cases, the mutation level in leucocyte-derived DNA was slightly below the detection threshold of Sanger sequencing (10–15% mutated alleles), which explains why the mutation was not found during routine diagnostics. Apart from NGS, these cases might have been identified with other sensitive methods such as PTT, DHPLC, DGGE, high-resolution melting or pyrosequencing.27 ,28 ,40 However, the use of these techniques is decreasing in clinical practice and even in those cases, additional non-mesoderm tissue is usually required to confirm mosaicism.
In 3/5 cases, the mutation level (allelic fraction) in leucocyte-derived DNA was very low (0.1–1%) or the mutation was not detectable at all despite deep sequencing. It can be presumed that in cases like patient F5000 an early postzygotic mutation must have occurred after mesoderm and endoderm specification.17 In all three patients, the mutation was present in at least three colorectal adenomas, a finding that is strongly suggestive of mosaicism. These mosaic mutations are undetectable even in high-coverage NGS data from leucocyte-derived DNA alone, since screening for mosaic mutations below a level of 5% mutated reads yields too many false-positive results and is thus not feasible in clinical practice. This is also demonstrated in our screening cohort of 80 patients, where 8/10 potential low-level mosaic mutations could not be confirmed by examination of tumour-derived DNA. As a consequence, such mutations, which account for around 60% (3/5) of mosaic cases in our cohort of 20 patients, can only be uncovered by screening additional tissues; adenomas are usually the easiest tissue to obtain and the most promising for identifying mosaic mutations.
Hence, our data confirm theoretical considerations of a high frequency of somatic mosaicism not detectable in leucocyte DNA. Although a small number of such cases have been reported in recent years by Hes et al,28 ,40 to date, no systematic investigation of the clinical relevance of APC mosaicism has been conducted. Given the number of cases in this study, the results are likely to be representative. The inclusion of multiple adenomas into the routine workup is a promising way to identify those cases and to increase the diagnostic sensitivity considerably.
While somatic mosaicism can contribute to deviations from the predicted phenotype, the level of mosaicism in leucocytes shows no consistent correlation with disease severity. In the present seven patients with somatic mosaicism, the polyps were not restricted to a particular colonic segment; in fact, in two cases even the upper gastrointestinal tract was affected. Nonetheless, all of our mosaic cases presented with an attenuated colorectal disease according to age at onset and/or polyp burden, despite the fact that the position of the mutation in the APC gene would have been expected to result in classical (typical) FAP. As a consequence, the offspring of mosaic patients may develop a more severe phenotype than the affected parent. This is relevant for genetic counselling and the decision as to when surveillance should commence.
The diagnosis of mosaicism is also important in terms of estimating the recurrence risk in relatives. While siblings and parents will not be affected, the risk for children is up to 50%, depending on the distribution of the mutation in different tissues. Patients in whom mosaicism is restricted to the endoderm are thought to have a low risk of transmitting the mutation since the primordial germ cells should not be affected.17 As yet undiagnosed polyposis in children of our index cases cannot be excluded with certainty, although there is no evidence for this so far.
To the best of our knowledge, this is the first study that systematically evaluates the impact of APC mosaicism using multiple tumour samples in a sufficient number of patients with unexplained polyposis. In conclusion, this work suggests that low-level APC mosaic mutations below the threshold of routine diagnostics contribute significantly to the aetiology of adenomatous polyposis. While the frequency of detectable somatic APC mosaicism in leucocyte-derived DNA alone may not exceed 3–4%, even when more sensitive methods are applied, a multiple adenoma approach can increase the diagnostic yield to 20–30%. Our findings may have important implications for both routine diagnostics and research strategies. The inclusion of APC mutation screening in multiple (≥2) colorectal adenomas can clarify the diagnosis in up to one-quarter of mutation-negative patients, and thus should be considered in all cases with proven (>20 adenomas) and sporadic disease (inconspicuous family history) in whom no pathogenic APC, MUTYH, POLE or POLD1 mutation is detected. It also might be reasonable to carefully exclude APC mutational mosaicism before including this patient group in studies that aim to identify new causative highly penetrant genes.
Databases/URLs
APC locus-specific mutation databases: http://www.lovd.nl/APC; http://www.umd.be/APC; http://www.hgmd.cf.ac.uk
IGV (Integrative Genomics Viewer): http://www.broadinstitute.org/igv/
MUTYH locus-specific mutation databases: http://www.lovd.nl/MUTYH; http://www.umd.be/MUTYH; http://www.hgmd.cf.ac.uk
Primer3 v.0.4.0: http://frodo.wi.mit.edu/primer3/input.htm
Acknowledgments
The authors thank the patients and their families for participating in the study. They are grateful to Siegfried Uhlhaas and Dietlinde Stienen for their excellent technical support.
References
Supplementary materials
Supplementary Data
This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.
- Data supplement 1 - Online supplement
Footnotes
Contributors IS, DD and MK contributed equally. Study design: SA, IS and MRS. NGS experiments: DD, TB, MK, BT, MRS, BZ and RPL. NGS analysis: DD, MK, IS, SHor and SPet. Sanger sequencing: IS. Bioinformatics input: DD, TB, MK and BZ. Patient recruitment, collection of clinical and familial data: IS, SHol, RA, AL and EH-F. Analysis of clinical data: IS, SHol and RA. DNA extraction out of FFPE tumour tissue: JK, SPer, GK and MRS. Literature search and figures: IS and SA. Preparing manuscript: IS and SA. Critical review of the paper: PH, SHol, SHor, RA, SPet, EH-F, DD, MK, MRS and MMN. Supervision and expert advice: SA, PH, MRS and MMN. IS had full access to all study data and had final responsibility for the decision to submit the manuscript for publication.
Funding This work was supported by the German Cancer Aid (Deutsche Krebshilfe e.V. Bonn, Grant number 108421); the Gerok-Stipendium of the University Hospital Bonn (grant no. O-149.0098); the NIH Centers for Mendelian Genomics (5U54HG006504); the Federal Ministry of Education and Research (0316190A); and the Volkswagenstiftung (Lichtenberg Program to MRS).
Competing interests None declared.
Patient consent Obtained.
Ethics approval Medical Faculty of the University of Bonn, ethics review board no. 224/07.
Provenance and peer review Not commissioned; externally peer reviewed.