Sequence and Structural Analyses for Functional Non-coding RNAs
The analysis and detection of functional RNAs are important topics in both molecular biology and bioinformatics research. Several computational methods based on stochastic context-free grammars (SCFGs) have been developed for modeling and analyzing functional RNA sequences. These grammatical methods have succeeded in modeling typical RNA secondary structures and are used for structural alignment of RNA sequences. Such stochastic models, however, are not sufficient to discriminate member sequences of an RNA family from non-members, and hence to detect non-coding RNA regions in genome sequences. Recently, support vector machines (SVMs) and kernel function techniques have been actively studied and proposed as solutions to various problems in bioinformatics. SVMs are trained on positive and negative samples and have strong discrimination ability, which makes them better suited to such discrimination tasks. A few kernel functions that extend the string kernel to measure the similarity of two RNA sequences from the viewpoint of secondary structure have been proposed. In this article, we give an overview of recent progress in SCFG-based methods for RNA sequence analysis and of novel kernel functions that measure the similarity of two RNA sequences and are designed for use with SVMs in discriminating members of an RNA family from non-members.
Keywords: Support Vector Machine, Structural Alignment, tRNA Sequence, String Kernel, Typical Secondary Structure
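The abstract refers to kernel functions that extend the string kernel to RNA sequences. As a point of reference only, the following is a minimal sketch of a plain k-mer (spectrum) string kernel in Python. It is not the structure-aware kernel the chapter surveys: it compares sequences purely by shared substrings and ignores secondary structure. The function names and the choice of k = 3 are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> float:
    """Inner product of the k-mer count vectors of a and b."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # A Counter returns 0 for k-mers that are absent, so only shared
    # k-mers contribute to the sum.
    return float(sum(n * cb[kmer] for kmer, n in ca.items()))


def normalized_kernel(a: str, b: str, k: int = 3) -> float:
    """Cosine-normalized spectrum kernel, with values in [0, 1]."""
    kaa = spectrum_kernel(a, a, k)
    kbb = spectrum_kernel(b, b, k)
    if kaa == 0.0 or kbb == 0.0:
        # At least one sequence is shorter than k and has no k-mers.
        return 0.0
    return spectrum_kernel(a, b, k) / sqrt(kaa * kbb)
```

Evaluating such a kernel on every pair of training sequences yields a Gram matrix, which is the form of input a kernel-based classifier such as an SVM works with. A structure-aware kernel of the kind described in the abstract would replace the k-mer features with features derived from base pairing.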