Hereditary non-polyposis colorectal cancer (MIM 114500) is the most common inherited colorectal cancer syndrome, affecting 1 in 1000 people. Patients with hereditary non-polyposis colorectal cancer are predisposed to early onset synchronous and metachronous colorectal cancers in association with a variety of extra-intestinal malignancies. The disease is caused by germline mutations in one of five mismatch repair genes (hMLH1, hMSH2, hMSH6, hPMS2, hMLH3).1–6 A common mechanism of mutation in hereditary non-polyposis colorectal cancer is the disruption of hMSH2 and hMLH1 splicing by exon skipping.7,8 In addition, recent reports have shown that aberrant splicing occurs even in normal individuals without hereditary cancer predisposition.9–11 However, apart from mutations in the splice donor and acceptor sites, the effects on splicing of other sequence variations found in a patient are difficult to predict. Exonic sequences have also been shown to affect splicing efficiency; in fact, any single base change in these sequences may have pathogenic consequences by leading to aberrant splicing or exon skipping.12–14 Different mechanisms have been proposed to explain how exonic sequences mediate splicing regulation. Recent reports show that aberrant splicing may occur as a consequence of mutations that disrupt exonic splicing enhancers (ESEs) or create exonic splicing silencers.15–17 Exonic enhancers have been shown to interact specifically with serine/arginine rich (SR) proteins that regulate the splicing process, promoting exon definition by directly recruiting spliceosome and regulatory proteins or by antagonising the action of nearby silencer elements.18,19 Different classes of ESE consensus motifs have been described, but they are not always easily identified.
Recently two web based resources, ESEfinder (http://exon.cshl.org/ESE)20 and RESCUE-ESE (http://genes.mit.edu/burgelab/rescue-ese),21 have been developed to facilitate rapid analysis of exon sequences, to identify putative ESEs responsive to the human SR proteins, and to predict whether exonic mutations disrupt such elements. These algorithms can identify putative ESEs in most human exons. A consequence of these findings is that a significant fraction of exonic mutations might cause disease because they represent unrecognised splicing alterations. So far, most exonic mutations have been assumed to cause disease only by affecting the coding potential: silent mutations have been ignored as causes of disease, missense mutations have been inferred to alter protein function, and nonsense mutations have been assumed to lead to synthesis of truncated nonfunctional proteins or to loss of function through nonsense mediated decay. In contrast with this view, the perception is emerging that a significant number of point mutations or polymorphisms associated with disease lead, or may lead, to aberrant splicing.12 To test whether a disease causing mutation affects splicing, direct analysis of the mRNA linear structure and either in vivo or in vitro splicing assays need to be performed. We have previously reported a nonsense mutation, K461X, in the human mismatch repair gene hMLH1 segregating with hereditary non-polyposis colorectal cancer in three unrelated families.7 We demonstrated that this mutation, resulting from a T→A transversion at nt 1422 in hMLH1 exon 12, leads to exon 12 skipping in an in vivo system.7 Since the mutation is located in a purine rich region 29 base pairs upstream of the splice donor site of exon 12, we hypothesised that it disrupts a purine rich exonic splicing enhancer. In the present study we report that mutation K461X effectively abolishes a putative exonic splicing enhancer for the SR protein SF2/ASF.
In addition, we performed site directed mutagenesis to disrupt all the potential SF2/ASF ESE sites identified by ESEfinder in hMLH1 exon 12.
Abnormalities of pre-mRNA splicing are progressively becoming recognised as an important mechanism by which gene mutations cause disease.
Recently web based resources have been developed to facilitate the identification of genomic changes not obviously involved in the splicing process.
Using an in vivo splicing assay, we analysed the effects on splicing of 15 different mutant constructs and the naturally occurring mutation K461X. All of these mutations abolish the consensus motifs for the splicing factor SF2/ASF, identified from the computer program ESEfinder in hMLH1 exon 12.
Our results suggest that even if exonic splicing enhancer (ESE) prediction programs can be a useful tool in identifying real enhancers, they can give rise to erroneous predictions. Therefore functional in vivo splicing assays should be mandatory for proper genetic diagnosis.
The minigene constructs were assembled in the pSPL3 vector (for details see fig 1). Wild-type sequences of exons 11, 12, and 13 and their corresponding flanking intronic sequences were amplified from human genomic DNA using primers tagged with EcoRI and XhoI, XhoI and BamHI, and BamHI and NdeI restriction sites, respectively (primer sequences are available on request). Single nucleotide substitutions were introduced in exon 12 by overlap extension PCR with primers tagged with XhoI and BamHI restriction sites.
Analysis of minigene expression
The different constructs were transiently transfected into Cos-7 cells with Metafectene (Biontex) in a 6 well plate. After 48 hours the cells were collected and total RNA was extracted with TRIzol (Life Technologies). The cDNA was synthesised using M-MLV RNase H− point mutant reverse transcriptase (Promega Inc.) according to the instructions provided. The RT-PCR was performed with EXPAND™ Long Template (Roche) using the vector specific primers SD6 (5′-TCT GAG TCA CCT GGA CAA CC-3′) and SA2 (5′-ATC TCA GTG GTA TTT GTG AGC-3′). The thermocycling conditions were as follows: 95°C for 2 min; 10 cycles of 95°C for 20 s, 58°C for 30 s, and 68°C for 1 min 30 s; then 25 cycles of 95°C for 20 s, 58°C for 30 s, and 68°C for 1 min 30 s, with the extension time incremented by 5 s per cycle.
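The two stage cycling program above can be tabulated to check the per-cycle extension times and the total programmed hold time. The following is a minimal Python sketch, not part of the original protocol. It assumes that the 5 s increment applies from the first cycle of the second stage onward (the text does not say whether that cycle runs at 90 s or 95 s) and it ignores ramp times between temperatures. The function names are illustrative.

```python
def extension_times(base_s=90, first_stage=10, second_stage=25, increment_s=5):
    """Return the 68°C extension time in seconds for every cycle."""
    # Stage one: constant extension time.
    times = [base_s] * first_stage
    # Stage two. Assumption: the increment is already applied to its first cycle.
    times += [base_s + increment_s * (i + 1) for i in range(second_stage)]
    return times


def total_hold_seconds(denature_s=20, anneal_s=30, initial_denature_s=120):
    """Sum all programmed holds, excluding ramping between temperatures."""
    per_cycle = [denature_s + anneal_s + ext for ext in extension_times()]
    return initial_denature_s + sum(per_cycle)
```

Under these assumptions, `extension_times()` lists 35 values and `total_hold_seconds() / 60` gives the programmed run time in minutes.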
We used ESEfinder20 to ascertain whether the K461X mutation lies within, and possibly abrogates, a high score ESE motif. Using this program we established that the T→A substitution at nt 1422 is located within three overlapping motifs for the SR protein SF2/ASF, and that the mutation reduces the score of one of these three motifs to a value well below its threshold (fig 2A). In some cases the presence of a nonsense mutation has been associated with exon skipping through a mechanism of nuclear reading frame scanning that would prevent exons containing a nonsense codon from being included in the mature mRNA (nonsense altered splicing). To uncouple the effect of disrupting the SF2/ASF ESE from possible nonsense altered splicing, we introduced two further mutations: the missense mutation K461N and the insertion 1421-1422insT, which creates a TAA nonsense codon at codon 461. Both of these mutations abolish the second of the three overlapping SF2/ASF motifs, as K461X does, but K461N also creates a new enhancer motif for the splicing factor SC35 (fig 2A). The in vivo splicing assay showed that 1421-1422insT leads to aberrant splicing of hMLH1 exon 12 in the same way as K461X, whereas the missense mutation K461N does not alter exon 12 splicing (fig 2B). At first sight these results suggest a correlation between the coding effect of substitutions introduced in this splicing enhancer (that is, whether or not they create a stop codon) and their ability to affect exon recognition. Alternatively, the lack of effect of K461N on splicing could be explained by the generation of a novel SC35 consensus sequence that masks the disruption of the SF2/ASF motif.
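The threshold based motif calls described above can be illustrated with a position weight matrix scan: each window of the exon is scored by summing per-position base scores, a window is called a motif when its score reaches the threshold, and a substitution abolishes a motif when it pushes that window below the threshold. The Python sketch below is an illustration only. The matrix, the threshold, and the sequences are invented for the example and are not the ESEfinder SF2/ASF values or hMLH1 sequence.

```python
# Toy score matrix: one dict of base scores per motif position.
# All numbers are invented for illustration; they are NOT ESEfinder values.
TOY_MATRIX = [
    {"A": -1.0, "C": 1.0, "G": 0.5, "T": -1.5},
    {"A": 1.0, "C": -1.0, "G": 0.5, "T": -1.0},
    {"A": -0.5, "C": 0.5, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -0.5, "G": 0.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": 0.5, "T": -0.5},
    {"A": 0.5, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -0.5, "G": 0.5, "T": -1.5},
]
TOY_THRESHOLD = 5.5  # hypothetical cut-off


def window_score(window, matrix=TOY_MATRIX):
    """Sum the matrix score of each base at its position in the window."""
    return sum(column[base] for column, base in zip(matrix, window))


def find_motifs(seq, matrix=TOY_MATRIX, threshold=TOY_THRESHOLD):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        score = window_score(seq[start:start + width], matrix)
        if score >= threshold:
            hits.append((start, score))
    return hits


def abolished_motifs(wild_type, mutant):
    """Starts of wild-type motifs that are no longer called in the mutant.

    Only valid for substitutions, where positions stay aligned.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    mutant_starts = {start for start, _ in find_motifs(mutant)}
    return [start for start, _ in find_motifs(wild_type)
            if start not in mutant_starts]
```

For example, `abolished_motifs("TTCAGACGATT", "TTCATACGATT")` reports the motif starting at index 2 as lost after the G to T change in this toy sequence. A scan of this kind says nothing about whether the motif is functional in its exonic context, which is the question the splicing assay addresses.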
To determine whether any nucleotide substitution that abrogates a predicted SR protein motif score actually causes aberrant splicing, we performed extensive site directed mutagenesis. First we used ESEfinder to identify putative ESE motifs for the human SR protein SF2/ASF. With this approach 16 consensus motifs for SF2/ASF were predicted throughout hMLH1 exon 12 (fig 3A). The mutations K461X, K461N, and 1421-1422insT all abolish SF2/ASF motif 15. We then designed 13 further ESE disrupting mutations, one in each of the remaining consensus motifs, with the exclusion of motifs 6 and 8, since no nucleotide substitution could abrogate these two motif scores without creating a new consensus motif for other SR proteins. All 13 mutations decreased the SF2/ASF motif scores below the computed threshold (fig 3B) without creating novel consensus motifs for the other splicing factors; the only exception was the mutation in SF2/ASF motif 3, which generates a consensus sequence for the SRp40 protein. To test whether the mutations introduced actually abrogated the correct splicing of exon 12, we transfected Cos-7 cells with the 13 mutated constructs. Following transfection, mRNA isolated from the Cos-7 cells was analysed for splicing pattern by RT-PCR. As shown in fig 4, a normal ≈930 bp product was present in all the samples except those transfected with mutations decreasing or abolishing ESE motifs 14, 15 (disrupted by both the naturally occurring mutation K461X and the induced 1421-1422insT), and 16. The complex pattern of splicing observed in these mutants, showing skipping of exon 11/13 (SF16) or exon 11/12 (SF15), can be explained by the fact that alternative splicing involving exon 11 has been described even in normal individuals,9 which suggests that some splice site leakiness is present in this region.
Therefore the in vivo splicing assay demonstrated that the majority of the mutants tested could include and splice hMLH1 exon 12 correctly even though the changes introduced were predicted to decrease the SF2/ASF matrix score below its threshold. These results suggest that, at least in this specific exonic context, the currently available matrices cannot accurately predict the effects of individual changes on splicing efficiency. While this study was in progress a second computational method, RESCUE-ESE, was developed to predict sequences with putative ESE activity.21 According to this method, specific hexanucleotide sequences are identified as candidate ESEs on the basis that their frequency is significantly higher in exons than in introns and also significantly higher in exons with weak splice sites than in exons with strong splice sites.
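The two enrichment criteria just described can be sketched in a few lines of Python. This is a simplified illustration, not the published RESCUE-ESE procedure: a plain difference in relative frequency stands in for the significance test, and the `min_diff` cut-off, the function names, and any input sequences are assumptions made for the example.

```python
from collections import Counter


def hexamer_frequencies(sequences, k=6):
    """Relative frequency of each k-mer across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def candidate_eses(exons, introns, weak_exons, strong_exons, min_diff=0.0):
    """Hexamers more frequent in exons than introns AND in weak than strong exons.

    Assumption: a raw frequency difference above min_diff replaces the
    statistical test used by the real method.
    """
    f_exon = hexamer_frequencies(exons)
    f_intron = hexamer_frequencies(introns)
    f_weak = hexamer_frequencies(weak_exons)
    f_strong = hexamer_frequencies(strong_exons)
    candidates = []
    for kmer, freq in f_exon.items():
        exon_vs_intron = freq - f_intron.get(kmer, 0.0)
        weak_vs_strong = f_weak.get(kmer, 0.0) - f_strong.get(kmer, 0.0)
        if exon_vs_intron > min_diff and weak_vs_strong > min_diff:
            candidates.append(kmer)
    return sorted(candidates)
```

Because the candidates come from frequency statistics over many exons, a hexamer flagged in this way is a prediction about sequence composition, not a measurement of enhancer activity in a particular exon.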
Using this second computational method, 41 sequences were identified as putative ESEs. Six ESEs identified by ESEfinder were not recognised as ESEs by the RESCUE-ESE algorithm. In four cases (motifs SF5, SF7, SF10, and SF15) the site directed mutagenesis abolished candidate SF2/ASF ESEs identified by both the ESEfinder and RESCUE-ESE programs (fig 5). Even in these cases, with the exception of the SF15 motif (the one abolished by the naturally occurring K461X mutation), the in vivo splicing assay failed to reveal any abnormal splicing product (fig 4).
The precision and correctness of intron removal during pre-mRNA splicing depend largely on the recognition of several discrete elements, some of which, such as the splice donor and acceptor sites, are almost invariant. However, many other loosely defined cis acting elements, such as the polypyrimidine tract, the branch site, and several other exonic and intronic sequences, may contribute to exon recognition.
In this study we performed site directed mutagenesis to abolish all the motifs identified by ESEfinder in hMLH1 exon 12 for the SR protein SF2/ASF, and tested the effect of each mutation on splicing efficiency in an in vivo assay, rather than evaluating naturally occurring mutations through their ability to disrupt computer identified ESEs. Sixteen exonic splicing enhancer inactivating mutations were tested and only four clearly abrogated exon 12 inclusion. Two nonsense mutations abolishing ESE motif SF15 (the spontaneously occurring K461X and the induced 1421-1422insT), but not the missense mutation K461N in the same motif, lead to aberrant splicing of exon 12. This result might be explained as a consequence of nonsense associated altered splicing. However, the skipping of exon 12 is out of frame and would lead to a premature termination codon 21 nucleotides downstream. Therefore there would be no obvious advantage in the selective exclusion of exon 12 even when it harbours a premature termination codon.
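The two pieces of arithmetic in this paragraph can be checked with a short Python sketch. It assumes an exon 12 length of 370 bp, the approximate figure quoted later in this Discussion; the frame conclusion holds for any length that is not a multiple of three. The construct labels follow the SF numbering used in the text, and the grouping into lists is an illustrative reconstruction, not the authors' own tabulation.

```python
from fractions import Fraction


def skipping_preserves_frame(exon_length_bp):
    """Skipping an exon keeps the reading frame only if its length is a multiple of 3."""
    return exon_length_bp % 3 == 0


# 13 site directed mutants: motifs 6 and 8 were excluded, and motif 15 is
# covered by the three codon 461 mutations listed separately below.
SITE_DIRECTED = [f"SF{n}" for n in range(1, 17) if n not in (6, 8, 15)]
TESTED = SITE_DIRECTED + ["K461X", "K461N", "1421-1422insT"]

# Mutations reported in the text as causing aberrant splicing of exon 12.
ABERRANT = {"SF14", "SF16", "K461X", "1421-1422insT"}


def confirmed_fraction(tested=TESTED, aberrant=ABERRANT):
    """Fraction of predicted ESE disrupting mutations that altered splicing."""
    return Fraction(sum(name in aberrant for name in tested), len(tested))
```

With these inputs `skipping_preserves_frame(370)` is false, consistent with the out of frame skipping described above, and `confirmed_fraction()` returns one quarter of the 16 tested mutations.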
Regarding the complex pattern of splicing observed in the mutants SF14, SF15 (K461X), SF16, and 1421-1422insT, it should be noted that alternatively spliced isoforms of hMLH1 lacking exons 9 and 10, exons 10 and 11, and exons 9, 10, and 11 have been reported.9,10 Therefore the different RT-PCR products observed may be caused by composite regulatory splicing elements present in both constitutive and alternative exons in this region.
According to the results of the in vivo splicing assay used in this study, the matrices currently available cannot reliably predict the effects of individual changes on the splicing efficiency, although they can give hints on where real exonic splicing enhancers reside.
Recent studies demonstrated that nucleotide substitutions located in exonic splicing enhancers identified by computer based methods might cause exon skipping.12 However, putative exonic splicing enhancers are identified by the two aforementioned methods on the basis of the replacement of natural enhancers present in a reporter construct with short oligonucleotides. Their ability to rescue splicing is therefore evaluated in a different exonic context and, as a consequence, the influence of natural neighbouring sequences on splicing proficiency might be underestimated. Several explanations can account for the lack of association between mutations introduced in predicted ESEs and splicing disruption. First, hMLH1 exon 12 is a rather large exon, about double the size of the average human exon (370 base pairs as against 180 base pairs). Exonic splicing enhancers generally tend to occur in small, loosely defined exons, which are also the preferred substrate for testing the ability of randomly chosen short oligonucleotides to enhance splicing. It might therefore be that predicted exonic splicing enhancers overlap with true splicing enhancers only when they lie in “weak” exons. Second, among the 16 SF2/ASF motifs identified by ESEfinder, only SF14, SF15, and SF16 are located less than 40 bp away from the 5′ or the 3′ end of hMLH1 exon 12 and, interestingly, only mutations of these three motifs cause aberrant splicing. Functional ESEs have also been demonstrated to reside in specific positions relative to the 5′ or 3′ ends of an exon.22,23 It follows that, if functional enhancers are preferentially located close to exon-intron borders, ESEs identified in the middle of an exon cannot automatically be considered true enhancers. Third, efficient splicing is the result of a plethora of rather complex and often antagonistic interactions mediated by different splicing factors, each binding to its proper target sequence.
Fourth, some particular exonic splicing enhancers could be used in a cell specific manner, even though the splicing pattern did not change appreciably when the same constructs were analysed in different cell lines.24 Several lines of evidence suggest that juxtaposed enhancer and silencer elements act together to regulate either exon inclusion or skipping,24 whereas the score matrices used to calculate SR protein motif scores were derived from pure SR protein binding substrates. It is therefore difficult to evaluate the effect of a single nucleotide substitution whenever a composite exonic context is present.
An increasing number of point mutations have recently been reported to prevent correct splicing by disrupting exonic splicing enhancers. Missense mutations that do not alter protein function, as well as silent substitutions, can also affect pre-mRNA splicing and have unforeseen pathological consequences. Therefore, when sequence alterations of this type are found while searching for disease associated mutations, their potential effect on splicing cannot be ignored. However, the predictive capacity of the SR protein score matrices may be low in a specific exonic context, as suggested by our results, or when composite elements with overlapping enhancer and silencer elements are present. Thus the pathogenicity of a point mutation cannot be inferred simply from its ability to abrogate or reduce SR protein matrix scores, since these ESE prediction programs may give false positive results.
A very recent report25 has demonstrated that pathogenic missense mutations in both hMLH1 and hMSH2 tend to colocalise with ESEs more frequently than expected and to decrease ESE scores. On this basis the authors propose that the pathogenicity of these mutations might be splicing related. Our study suggests that the effect on splicing of mutations predicted to abolish ESEs is not obvious and should be considered with caution.
In conclusion, because of the extreme complexity of the splicing machinery, mutations altering regulatory sequences such as exonic splicing enhancers, although relevant to human disease, demand appropriate functional splicing assays to assess their specific role in pre-mRNA splicing fidelity and accuracy.
We would like to thank Prof. Adrian Krainer for the kind hospitality in his laboratory at the beginning of this work, and Adrian Krainer and Luca Cartegni for helpful advice and discussions.
This research was supported, in part, by a grant from Ministero Istruzione Università Ricerca FIRB no RBAU01SZHB (to GG).
Conflicts of interest: none declared.