Computational searches for splicing signals

Xiang H-F Zhang; Christina S Leslie; Lawrence A Chasin

doi:10.1016/j.ymeth.2005.07.011

Methods. 2005 Dec;37(4):292-305. doi: 10.1016/j.ymeth.2005.07.011.

Authors

Xiang H-F Zhang¹, Christina S Leslie, Lawrence A Chasin

Affiliation

¹ Department of Biological Sciences, Columbia University, New York, NY 10027, USA.

PMID: 16314258

Abstract

The removal of introns from pre-mRNA requires as an initial event the accurate molecular recognition of the proper exon-intron borders. It is now evident that RNA sequence elements in addition to the consensus splice site sequences themselves are required for this recognition. Genomic analyses have contributed to the definition of these elements as exonic and intronic splicing enhancers and silencers, comprising what has been called the "splicing code." Many computational methods have been brought to bear in such studies. We describe here some of the methods we have used to discover functional splicing signals. What these methods have in common is a comparison of sequences in and around exons to sequences found elsewhere in the genome. We have especially made use of comparisons to "pseudo exons," intronic sequences resembling exons by virtue of being bounded by sequences indistinguishable from splice sites. Two computational strategies are emphasized: (1) the use of a machine learning technique in which a computational algorithm, a support vector machine, is first trained on known examples and then used to predict sequences associated with splicing; and (2) straight statistical analysis of differences between regions associated with exons and other regions in the genome. In most cases, the predictions made using these methods have been validated by subsequent empirical tests. An attempt has been made to make this description understandable by researchers unfamiliar with computational practice and to include practical references to specific databases and programs.
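As an illustration of strategy (2), the following is a minimal sketch of how short sequence words (k-mers) might be compared between real exons and pseudo exons. It is not the authors' implementation: the function names `kmer_counts` and `enrichment_z`, the default word length of 6, and the use of a two-proportion z-score are all assumptions chosen for simplicity.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def enrichment_z(exon_seqs, pseudo_seqs, k=6):
    """Score each k-mer by a two-proportion z-score (assumed statistic).

    Positive scores mean the k-mer is relatively more frequent in the
    exon set; negative scores mean it is more frequent in the pseudo
    exon set.
    """
    a = kmer_counts(exon_seqs, k)
    b = kmer_counts(pseudo_seqs, k)
    na, nb = sum(a.values()), sum(b.values())
    if na == 0 or nb == 0:
        raise ValueError("both sequence sets must contain at least one k-mer")

    scores = {}
    for kmer in set(a) | set(b):
        pa, pb = a[kmer] / na, b[kmer] / nb
        pooled = (a[kmer] + b[kmer]) / (na + nb)
        se = sqrt(pooled * (1 - pooled) * (1 / na + 1 / nb))
        scores[kmer] = (pa - pb) / se if se > 0 else 0.0
    return scores


if __name__ == "__main__":
    # Toy sequences for illustration only, not real genomic data.
    toy_exons = ["GAAGAAGAAGAA"]
    toy_pseudo = ["TTTTTTTTTTTT"]
    ranked = sorted(enrichment_z(toy_exons, toy_pseudo, k=3).items(),
                    key=lambda kv: kv[1], reverse=True)
    for kmer, z in ranked:
        print(kmer, round(z, 2))
```

In this sketch, k-mers with large positive scores would be candidate exonic enhancer-like words and those with large negative scores candidate silencer-like words. A real analysis would need far larger sequence sets, a correction for multiple testing, and empirical validation of the kind the abstract mentions.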

Publication types

Comparative Study

MeSH terms

Alternative Splicing / genetics*
Artificial Intelligence
Computational Biology / methods*
Computational Biology / statistics & numerical data
Exons
Genetic Code
Humans
Introns
Sequence Alignment
Software
Statistics as Topic