Abstract
Generative kernels are theoretically grounded tools that increase the capabilities of generative classification through a discriminative setting. The Fisher kernel is the first and most widely used representative, and it rests on a widely investigated mathematical background. Building a generative kernel follows a two-step serial pipeline. In the first, “generative” step, a generative model is trained, using either one model per class or a single model for all the data; features or scores are then extracted, which encode the contribution of each data point to the generative process. In the second, “discriminative” step, the scores are evaluated by a discriminative machine via a kernel, exploiting the separability of the data. In this paper we contribute to the first step, proposing a novel way to fit the class data with generative models, focusing specifically on Hidden Markov Models (HMMs). The idea is to perform model clustering on the unlabeled data in order to discover the structure of the entire sample set as well as possible. The label information is then retrieved and the generative scores are computed. A comparative experimental test gives a preliminary indication of the quality of the novel approach and encourages further development.
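The two-step pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the HMMs have already been obtained (by clustering or by per-class training), and it uses length-normalised log-likelihoods as the score vector, a simpler stand-in for the Fisher score. The names `hmm_log_likelihood`, `score_vector`, `linear_kernel`, and the two toy models are hypothetical and chosen only for illustration.

```python
import math


def hmm_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) for a discrete HMM via the scaled forward pass."""
    n_states = len(start)
    # Forward variable for the first symbol.
    alpha = [start[i] * emit[i][seq[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate through the transition matrix, then emit symbol seq[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][seq[t]]
                for j in range(n_states)
            ]
        # Rescale to avoid underflow; the logs of the scales sum to log P(seq).
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def score_vector(seq, models):
    """Generative step output: one length-normalised log-likelihood per model."""
    return [hmm_log_likelihood(seq, *model) / len(seq) for model in models]


def linear_kernel(u, v):
    """Discriminative step input: inner product between two score vectors."""
    return sum(a * b for a, b in zip(u, v))


# Two hypothetical 2-state HMMs over a binary alphabet (assumed, not from the paper).
# Each model is a tuple (start, trans, emit).
model_a = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]])
model_b = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]])
models = [model_a, model_b]

# Toy sequences: the first is dominated by symbol 0, the second by symbol 1.
sequences = [[0, 0, 0, 1, 0, 0], [1, 1, 0, 1, 1, 1]]

scores = [score_vector(s, models) for s in sequences]
gram = [[linear_kernel(u, v) for v in scores] for u in scores]
```

The resulting `gram` matrix is what a kernel machine such as an SVM would consume in the discriminative step. In a real system the models would be fitted to data, and the score could be replaced by any of the generative embeddings the paper considers.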
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bicego, M., Cristani, M., Murino, V., Pękalska, E., Duin, R.P.W. (2009). Clustering-Based Construction of Hidden Markov Models for Generative Kernels. In: Cremers, D., Boykov, Y., Blake, A., Schmidt, F.R. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 2009. Lecture Notes in Computer Science, vol 5681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03641-5_35
DOI: https://doi.org/10.1007/978-3-642-03641-5_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03640-8
Online ISBN: 978-3-642-03641-5
eBook Packages: Computer Science, Computer Science (R0)