Abstract
Pairwise Rational Kernels (PRKs) combine pairwise kernels, which measure similarity between two pairs of entities, with rational kernels, which are based on finite-state transducers for manipulating sequence data. PRKs have already been used in bioinformatics problems, such as metabolic network prediction, to reduce computational costs in terms of storage and processing.
In this paper, we propose new Pairwise Rational Kernels based on automaton and transducer operations. Specifically, we define new operations over pairs of automata to obtain new rational kernels. We conduct experiments using these new PRKs to predict metabolic networks. The new kernels achieve better accuracy and execution times than previous kernels.
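To make the idea of a pairwise kernel built on a sequence kernel concrete, the following is a minimal sketch and not the construction proposed in the paper. It assumes an n-gram count kernel as the base sequence kernel (n-gram kernels are a standard example of rational kernels, computed here by direct counting rather than with transducers) and a symmetric tensor-product form for the pairwise kernel. The function names and the default n = 3 are illustrative choices.

```python
from collections import Counter


def ngram_counts(seq, n=3):
    """Count every contiguous substring of length n in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_kernel(x, y, n=3):
    """Base sequence kernel: dot product of the n-gram count vectors."""
    cx, cy = ngram_counts(x, n), ngram_counts(y, n)
    return sum(count * cy[gram] for gram, count in cx.items())


def pairwise_kernel(pair_a, pair_b, n=3):
    """Symmetric tensor-product pairwise kernel (assumed form).

    K((x1, x2), (y1, y2)) = k(x1, y1) k(x2, y2) + k(x1, y2) k(x2, y1)
    """
    (x1, x2), (y1, y2) = pair_a, pair_b
    return (ngram_kernel(x1, y1, n) * ngram_kernel(x2, y2, n)
            + ngram_kernel(x1, y2, n) * ngram_kernel(x2, y1, n))
```

In this sketch the value of the pairwise kernel does not depend on the order of the two sequences inside a pair, which suits undirected interactions such as an edge between two enzymes in a metabolic network.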
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Roche-Lima, A., Domaratzki, M., Fristensky, B. (2014). Pairwise Rational Kernels Obtained by Automaton Operations. In: Holzer, M., Kutrib, M. (eds) Implementation and Application of Automata. CIAA 2014. Lecture Notes in Computer Science, vol 8587. Springer, Cham. https://doi.org/10.1007/978-3-319-08846-4_25
DOI: https://doi.org/10.1007/978-3-319-08846-4_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08845-7
Online ISBN: 978-3-319-08846-4
eBook Packages: Computer Science, Computer Science (R0)