Background Inherited mutations in DNA mismatch repair genes predispose to different cancer syndromes depending on whether they are mono-allelic or bi-allelic. This supports a causal relationship between expression level in the germline and phenotype variation. As a model to study this relationship, our study aimed to define the pathogenic characteristics of a recurrent homozygous coding variant in PMS2 displaying an attenuated phenotype identified by clinical genetic testing in seven Inuit families from Northern Quebec.
Methods Pathogenic characteristics of the PMS2 mutation NM_000535.5:c.2002A>G were studied using genotype–phenotype correlation, single-molecule expression detection and single genome microsatellite instability analysis.
Results This PMS2 mutation generates a de novo splice site that competes with the authentic site. In homozygotes, expression of the full-length protein is reduced to a level barely detectable by conventional diagnostics. Median age at primary cancer diagnosis is 22 years among 13 NM_000535.5:c.2002A>G homozygotes, versus 8 years in individuals carrying bi-allelic truncating mutations. Residual expression of full-length PMS2 transcript was detected in normal tissues from homozygotes with cancers in their 20s.
Conclusions Our genotype–phenotype study of c.2002A>G illustrates that an extremely low level of PMS2 expression likely delays cancer onset, a feature that could be exploited in cancer preventive intervention.
- constitutional mismatch repair deficiency (CMMRD)
- tumor suppression
- gene expression
- splice site
Germline mutations in the DNA mismatch repair (MMR) genes MLH1, MSH2, PMS2 and MSH6 predispose to inherited cancer syndromes. Mono-allelic mutations lead to Lynch syndrome, also known as hereditary non-polyposis colorectal cancer (HNPCC, MIM #120435),1 while bi-allelic mutations predispose to constitutional mismatch repair deficiency (CMMRD, MIM #276300).2 Typical clinical manifestations of Lynch syndrome include adult-onset colorectal and endometrial cancers as well as cancers occurring in the small intestine, urothelial tract, brain and ovary.3 In contrast, CMMRD displays a more severe phenotype, with childhood onset of leukaemia/lymphoma, brain tumours, colorectal/gastrointestinal cancers and other rare malignancies, such as rhabdomyosarcoma.4 MMR genes are tumour suppressors; the majority of inherited pathogenic mutations introduce premature stop codons resulting in the loss of protein function.5,6 Lack of expression results in MMR deficiency, of which microsatellite instability (MSI) is a hallmark feature. MSI is present at very low levels in lymphocytes and other normal tissues from individuals with mono-allelic MMR mutations,7 whereas MSI levels are higher and readily detectable in individuals with bi-allelic mutations.8,9
The PMS2 founder mutation reported in this study appears to cause a cancer phenotype atypical of either Lynch syndrome or CMMRD. NM_000535.5:c.2002A>G, referred to as c.2002A>G for simplicity, was first identified in an Inuit family from Puvirnituq, Nunavik (Quebec) with cancers diagnosed in four siblings and where the pedigree structure was suggestive of a recessive inheritance pattern. Patients fulfilled the clinical criteria for CMMRD (see online supplementary table S1).2 Immunohistochemistry of the proband and affected relatives corroborated this assessment by demonstrating normal expression of MLH1, MSH2 and MSH6, but complete loss of PMS2, both in tumour cells and in adjacent normal tissue (see online supplementary figure S1A). Genomic DNA sequencing guided by protein truncation tests (PTTs) identified a missense variant in PMS2, c.2002A>G, as the causative mutation. This coding variant causes the substitution of isoleucine by valine at codon 668 (NP_000526, PMS2 p.I668V), which is predicted to be functionally neutral according to multiple prediction algorithms10 (see online supplementary methods). However, lymphocyte cDNA sequencing from the index patient revealed a 5 bp deletion at the exon 11–12 junction (see online supplementary figure S1B), generating a premature stop codon, p.I668*, as a result of aberrant RNA splicing that is predicted to lead to nonsense-mediated decay (see online supplementary figure S1C).
Subsequently, we identified nine additional individuals homozygous for c.2002A>G from six unrelated families, all of Inuit origin (see online supplementary figure S2A). Details about patient recruitment are provided in online supplementary methods. Thirty-eight heterozygotes have been identified, making this the single most common PMS2 mutation reported worldwide to date. To track the origin of the mutation, we genotyped 17 short tandem repeat markers (primer sequences listed in online supplementary table S2) in families where DNA was available from both heterozygous and homozygous members, and the results suggest that the mutation was inherited from a common ancestor (see online supplementary figure S2B).
Among 13 individuals homozygous for c.2002A>G, two developed colorectal polyps and the rest were diagnosed with cancer before the age of 40 (clinical manifestations summarised in online supplementary table S3). We observed that the age at cancer onset among individuals homozygous for c.2002A>G was noticeably later than for those carrying homozygous nonsense mutations in gDNA. We investigated this using a phenotype comparison of PMS2 mutations with positions matched to c.2002A>G. We catalogued and compared the phenotype for patients with germline PMS2 mutations exclusively in exon 11 by classifying the genotypes into three groups according to expressivity. Group I carry bi-allelic truncating mutations without the expression of full-length PMS2 protein; Group II are homozygous for c.2002A>G and Group III carry mono-allelic truncating mutations with the expression of one wild-type allele (figure 1). The genotype–phenotype visualisation revealed a clear trend in age at primary cancer onset across the three groups: childhood for Group I (median=9 years, range=1–16 years), early adulthood for Group II (median=22 years, range=3–39 years) and middle age for Group III (median=49 years, range=36–77 years). The difference between groups is statistically significant (p<0.001, Kruskal–Wallis test for three-group comparisons and Mann–Whitney U test for two-group comparisons), supporting our hypothesis that the 13 individuals homozygous for c.2002A>G display a phenotype atypical of CMMRD or Lynch syndrome. This observation holds true if we extend the expressivity-guided genotype–phenotype analysis to mutations scattered across the entire PMS2 locus (see online supplementary figure S3). Of note, the tumour spectrum of c.2002A>G homozygotes appears shifted when compared with CMMRD patients with PMS2 mutations. Specifically, brain tumours were less prevalent in c.2002A>G homozygotes than in carriers of bi-allelic truncating mutations (15% vs 67%, p=0.001).
The c.2002A>G mutation creates a de novo 5′ splice site (5′ss) for intron 11. Utilisation of this novel 5′ss results in a frameshift in the mRNA. The majority of 5′ss are recognised via base pairing with the 5′ end of the U1 small nuclear RNA at the initial stage of pre-mRNA splicing.11 5′ss in humans conform to the consensus sequence ‘MAG|GTRAGT’, where M and R are degenerate positions with A/C most frequent at M and A/G at R.12 The DNA sequence at the boundary between exon 11 and intron 11 of PMS2 is unusual in that c.2002A>G results in two partially overlapping 5′ss: the mutant (de novo) site ‘GAG|GTAAGG’ and the authentic site ‘AAG|GTAAAG’. According to prediction algorithms, the splicing score of the de novo site is slightly higher than that of the authentic site, though neither site matches the consensus perfectly (see online supplementary table S4). This raised the possibility that both 5′ss are used during pre-mRNA splicing.
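The comparison of the two sites against the consensus can be sketched in a few lines. This is only an unweighted position count, not the prediction algorithms used in the study; the function name `consensus_matches` and the IUPAC lookup table are illustrative assumptions. Under this crude measure both sites agree with the consensus at 7 of 9 positions, which illustrates why weighted scoring methods are needed to separate them.

```python
# IUPAC codes that appear in the 5'ss consensus quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}

CONSENSUS = "MAGGTRAGT"  # 'MAG|GTRAGT' with the exon/intron bar removed


def consensus_matches(site: str, consensus: str = CONSENSUS) -> int:
    """Count positions at which `site` is compatible with the consensus.

    Illustrative, unweighted count only; real splice-site predictors
    weight positions and model dependencies between them.
    """
    site = site.replace("|", "").upper()
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base in IUPAC[code] for base, code in zip(site, consensus))


de_novo = consensus_matches("GAG|GTAAGG")    # mutant site from the text
authentic = consensus_matches("AAG|GTAAAG")  # authentic site from the text
print(de_novo, authentic)
```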
Only one transcript population, the aberrant transcript with a 5 bp deletion, was detected by Sanger sequencing of patient cDNA. However, Sanger sequencing is based on population PCR in which templates of low abundance can be missed because of low amplification efficiency. The Polymerase Colony (Polony) assay is a single molecule-based approach suitable for detecting and quantifying rare transcripts.13,14 We performed this assay on a 960 bp amplicon encompassing the exon 11–12 junction using cDNA from peripheral lymphocytes of individual III-2 from the proband's family (homozygous for c.2002A>G) (see online supplementary methods and figure 2A) and observed three transcript populations: aberrant transcripts with 5 bp deleted at the exon 11–12 junction, transcripts from a pseudogene locus (PMS2CL) and a minor amount of full-length transcripts from the functional PMS2 gene (figure 2B). Thus, results from the sensitive Polony assay indicated that both juxtaposed 5′ss are used during pre-mRNA splicing.
Based on these results, we designed a molecule-specific PCR to validate the dual utilisation of 5′ss in c.2002A>G homozygotes (see online supplementary methods, figure 2C and online supplementary figure S4A, B). The existence of a pseudo-transcript PMS2CL (>1 kb, containing PMS2 exons 9, 11–15) that highly resembles the PMS2 transcript at the sequence level15 made real-time PCR technically unsuitable for quantifying the intact/aberrant exon 11–12 junctions. However, at least 10 more PCR cycles were needed to amplify the intact transcript to detectable levels than were needed for the aberrant transcript using constant settings in semiquantitative fragment analysis, suggesting that the abundance of the two populations differs by a factor on the order of 2^10 (about 1000-fold). Using this molecule-specific PCR, we assessed the expression of the intact exon 11–12 junction in additional c.2002A>G homozygotes diagnosed with cancers in their 20s. The intact transcripts were detected in all biospecimens available for laboratory investigation: peripheral lymphocytes (four patients), primary fibroblasts (two patients) and a normal colon mucosa (one patient; see online supplementary figure S4C).
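The step from a 10-cycle gap to a fold difference in abundance can be made explicit. The sketch below assumes ideal exponential amplification, with product doubling every cycle; the function name `fold_difference` and the `efficiency` parameter are illustrative and not part of the study's methods. Real reactions run below perfect efficiency, so the result is an order-of-magnitude estimate only.

```python
def fold_difference(extra_cycles: float, efficiency: float = 1.0) -> float:
    """Fold difference in starting template implied by a cycle gap.

    Assumes each cycle multiplies the product by (1 + efficiency);
    efficiency = 1.0 is perfect doubling (an idealisation).
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return (1.0 + efficiency) ** extra_cycles


# Ten extra cycles for the intact transcript, as reported in the text.
print(fold_difference(10))  # 2**10 under perfect doubling
```

Because the text reports "at least" 10 additional cycles, a value computed this way is best read as a lower bound on the difference under the doubling assumption.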
The intact transcript was successfully translated into a peptide by in vitro protein translation (PTT) (figure 2D). Full-length PMS2 protein was also detected in lymphoblastoid cells (LCLs) and fibroblasts derived from two patients from unrelated families who were homozygous for c.2002A>G and who had cancers diagnosed at ages 21 and 26, respectively (see online supplementary figure S5). This is functionally relevant because the intact PMS2 protein, albeit at extremely low abundance, was found in association with its functional partner MLH1 (figure 2E). The MLH1–PMS2 heterodimer is an essential component of the large protein complex that assembles at DNA mismatch sites, removes the mismatched base and then repairs the damage.16 The PMS2 protein encoded by the aberrant transcript, if produced, would be missing the carboxyl terminus, causing the loss of heterodimerisation with MLH1. Attempts to detect this truncated PMS2 peptide with antibodies against its N-terminus in homozygous c.2002A>G LCLs were unsuccessful, possibly due to nonsense-mediated decay of the transcript or instability of the peptide.
Combined cDNA analysis, in vitro peptide translation and protein detection in specimens derived from patients all pointed to a mechanism where residual expressivity underlies the attenuated CMMRD phenotype associated with homozygous status of c.2002A>G. To test this interpretation from a different angle, we measured MSI levels using the tetranucleotide marker D17S1307 in normal tissues to investigate the correlation between residual PMS2 expressivity and hypermutability, a hallmark molecular phenotype of CMMRD. Peripheral lymphocytes and colon mucosa were available from two CMMRD patients, one homozygous for c.2002A>G and the second a compound heterozygote for the truncating mutations c.1221delG and c.2361delCTTC.17 Prime (major) alleles of D17S1307 in each tissue were determined by conventional genotyping with 0.1 ng DNA. Variant (rare) alleles that arose in phenotypically normal cells were subsequently detected in genotyping reactions using only 10 pg DNA (equivalent to 3 alleles, 1.5 diploid genomes) per reaction to prevent skewed amplification towards abundant templates. The major alleles observed in all tissues tested were 150 and 154 bp fragments (see online supplementary figure S6A); expansion alleles arising from locus instability sized at 158 and 162 bp were detected in some cells (see online supplementary figure S6B). We observed a difference in D17S1307 instability between the two individuals, with greater instability observed in the compound heterozygote bearing fully truncating mutations (see online supplementary figure S6C and table S5). This result supports the notion that subtle PMS2 expression from c.2002A>G contributes to the maintenance of genome stability at the nucleotide level.
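The DNA input quoted above can be converted to template copies with a short calculation. The sketch assumes a haploid human genome mass of roughly 3.3 pg, a commonly used approximation that is not stated in the text; the constant and function names are illustrative.

```python
# Assumption: one haploid human genome weighs about 3.3 pg (approximate).
HAPLOID_GENOME_PG = 3.3


def template_copies(dna_pg: float) -> tuple[float, float]:
    """Return (allele copies, diploid genome equivalents) for a DNA mass in pg."""
    alleles = dna_pg / HAPLOID_GENOME_PG
    return alleles, alleles / 2.0


# Small-pool genotyping reaction: 10 pg DNA.
alleles, diploid = template_copies(10.0)
print(round(alleles), round(diploid, 1))

# Conventional genotyping reaction: 0.1 ng = 100 pg DNA.
alleles_bulk, _ = template_copies(100.0)
print(round(alleles_bulk))
```

With so few template molecules per reaction, a variant allele, when present, makes up a large share of the input and is not swamped by the major alleles during amplification.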
Cancer development is virtually inevitable in the CMMRD syndrome, and the median age of cancer in all reported cases caused by bi-allelic truncating PMS2 mutations is 8 years (see online supplementary table S3). Here, we describe the identification and characterisation of a single bp change in PMS2 (c.2002A>G) that, when present in the homozygous state, results in a delayed onset of cancer compared with that seen in patients with bi-allelic PMS2 truncating mutations. We also showed that the very small amount of full-length PMS2 protein produced functionally associates with its partner, MLH1, and cells possessing this residual expression displayed a milder hypermutable phenotype than cells carrying bi-allelic truncating mutations. Male mice lacking both copies of Pms2 are infertile,18 but the proband and two other male c.2002A>G homozygotes have confirmed biological children, consistent with a functional PMS2–MLH1 interaction being present in vivo.
NM_000535.5:c.2002A>G appears limited to Nunavik and the western coastline of Hudson Bay. The 2011 census reported a population of only 12 090 with 90% being Inuit. Sixty-four per cent of people are under age 30, compared with 36% in the rest of Quebec.19 Assuming random mating and that our cancer clinics identified all homozygotes, and using the fact that 11 of the homozygous persons are from Nunavik, then under Hardy–Weinberg equilibrium there should be approximately 670 Inuit persons in Nunavik who are heterozygous for the c.2002A>G variant (one in 16 in the population). This variant is among the most common cancer-associated alleles reported in any population and, given the current age structure of the Nunavik Inuit population, there are important public health implications from these findings.
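The Hardy–Weinberg estimate can be reproduced as follows. The sketch assumes that the reference population is the Inuit share of the census count (90% of 12 090) and that the 11 Nunavik homozygotes represent the full q² class; these inputs were chosen because they reproduce the figures quoted in the text, and the function name is illustrative.

```python
import math


def expected_heterozygotes(homozygotes: int, population: float) -> tuple[float, float]:
    """Estimate carrier count and carrier frequency under Hardy-Weinberg.

    q**2 is taken as the observed homozygote fraction, so
    q = sqrt(homozygotes / population) and carrier frequency = 2 * p * q.
    """
    q = math.sqrt(homozygotes / population)
    p = 1.0 - q
    carrier_freq = 2.0 * p * q
    return carrier_freq * population, carrier_freq


inuit_population = 12090 * 0.90  # census population x Inuit fraction (assumed base)
n_het, freq = expected_heterozygotes(11, inuit_population)
print(round(n_het), round(1 / freq))
```

The estimate is sensitive to ascertainment: any homozygotes not yet identified by the cancer clinics would raise q and therefore the expected number of carriers.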
NM_000535.5:c.2002A>G is a founder mutation in the Inuit people and population-specific gene–gene and gene–environment interactions are possible mechanisms underlying the attenuated CMMRD phenotype we observed. However, the impact of these modifying factors tends to be subtle and therefore the associated phenotype variation would be evident only in a large patient cohort. With a significant effect detected in only 13 homozygotes, a protective role by the residual expressivity from the c.2002A>G mutant allele is the most likely explanation for the significantly delayed median age of cancer onset. Our observations suggest that restoring gene expression, even partially, as a cancer prevention strategy could be a viable and effective novel avenue for managing inherited cancer risk.
We thank Lidia Kasprzak MSc, François Plourde MSc and the late Jeremy Jass MD for their contributions to this study. We thank Dr François Rousseau for genotyping c.2002A>G in controls from Quebec City. Without the support of the population members from the villages of Puvirnituq, Inukjuak and Kuujjuarapik, this project would not have been possible. We particularly thank Mr A Kenuajak, Mayor of Puvirnituq, for his assistance.
Contributors LL: experimental design, data acquisition and analysis, manuscript preparation; MC, AG, VAM, MJM, BC, AC, CS, AD, MDR, SBG, RAH, B-JF, DEG, JZ, KW, BY: data acquisition, critical review of manuscript; NH: data acquisition and analysis, manuscript preparation; GC: data acquisition and analysis, critical revision of manuscript; MDT: study conception and design, acquisition of data, drafting of manuscript; WDF: study conception and design, data interpretation, drafting of manuscript.
Funding This work was supported by grants from the Canadian Gene Cure Foundation and the Canadian Cancer Society Research Institute (grant # 700252). LL received fellowship funding from the Systems Biology Training Program by the Canadian Institute of Health Research.
Competing interests None.
Ethics approval McGill University research ethics committee.
Provenance and peer review Not commissioned; externally peer reviewed.