Abstract
The paper introduces a novel approach for defining efficient generative kernels for structured data, based on the concept of multisets and Jaccard similarity. The multiset feature space makes it possible to enhance the adaptive kernel with syntactic information on structure matching. The proposed approach is validated using an input-driven hidden Markov model for trees as the generative model, but it is general enough to be straightforwardly applicable to any probabilistic latent variable model. The experimental evaluation shows that the proposed Jaccard kernel achieves superior classification performance with respect to the Fisher kernel, while consistently reducing the computational requirements.
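The abstract does not give the kernel's definition, so the following is only a minimal sketch of the standard Jaccard similarity generalized to multisets (sum of element-wise minimum counts over sum of element-wise maximum counts). It assumes each structure has already been reduced to a bag of discrete labels, for example the hidden states a generative model assigns to the nodes of a tree. The function name `multiset_jaccard`, the integer state labels, and the convention for two empty multisets are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def multiset_jaccard(a, b):
    """Generalized Jaccard similarity between two multisets (bags).

    `a` and `b` are iterables of hashable labels; repeated labels count.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    ca, cb = Counter(a), Counter(b)
    # Counter '&' keeps the minimum count per label (multiset intersection).
    intersection = sum((ca & cb).values())
    # Counter '|' keeps the maximum count per label (multiset union).
    union = sum((ca | cb).values())
    # Assumed convention: two empty multisets are treated as identical.
    return intersection / union if union else 1.0


# Two trees summarized by the (assumed) hidden-state labels of their nodes.
tree_a_states = [0, 0, 1, 2]
tree_b_states = [0, 1, 1, 3]
similarity = multiset_jaccard(tree_a_states, tree_b_states)
print(similarity)
```

Because only label counts are compared, the cost of one evaluation grows with the number of distinct labels rather than with the number of model parameters, which is one plausible reading of the reduced computational requirements mentioned in the abstract.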
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bacciu, D., Micheli, A., Sperduti, A. (2012). A Generative Multiset Kernel for Structured Data. In: Villa, A.E.P., Duch, W., Érdi, P., Masulli, F., Palm, G. (eds) Artificial Neural Networks and Machine Learning – ICANN 2012. ICANN 2012. Lecture Notes in Computer Science, vol 7552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33269-2_8
DOI: https://doi.org/10.1007/978-3-642-33269-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33268-5
Online ISBN: 978-3-642-33269-2
eBook Packages: Computer Science, Computer Science (R0)