Segment-Level Probabilistic Sequence Kernel Based Support Vector Machines for Classification of Varying Length Patterns of Speech

  • Shikha Gupta
  • Veena Thenkanidiyoor
  • Dileep Aroor Dinesh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9950)


In this work, we propose the segment-level probabilistic sequence kernel (SLPSK) as a dynamic kernel for support vector machine (SVM) based classification of varying-length patterns of long-duration speech represented as sets of feature vectors. The SLPSK is built upon a set of Gaussian basis functions, where half of the basis functions contain class-specific information while the other half capture the characteristics common to the speech utterances of all classes. The proposed kernel is computed between a pair of examples by partitioning each speech signal into a fixed number of segments and then matching the corresponding segments. We study the performance of SVM-based classifiers using the proposed SLPSK with different pooling techniques for speech emotion recognition and speaker identification, and compare it with that of SVM-based classifiers using other kernels for varying-length patterns.
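
The abstract gives no formulas, so the following is only a rough sketch of the idea of segment-wise matching, not the authors' exact SLPSK. It assumes equal-prior, diagonal-covariance Gaussian basis functions supplied by the caller (for instance, class-specific and class-independent components stacked together), represents each segment by its pooled posterior vector, and sums inner products over corresponding segments. Any normalization used in the paper is omitted, and all function and parameter names (`gaussian_posteriors`, `segment_vectors`, `segment_level_kernel`, `n_segments`, `pooling`) are hypothetical.

```python
import numpy as np


def gaussian_posteriors(frames, means, variances):
    """Posterior of each Gaussian for each frame (equal priors assumed)."""
    # frames: (T, d); means and variances: (Q, d), diagonal covariances.
    diff = frames[:, None, :] - means[None, :, :]              # (T, Q, d)
    log_lik = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                          # (T, Q)
    log_lik -= log_lik.max(axis=1, keepdims=True)              # numerical stability
    lik = np.exp(log_lik)
    return lik / lik.sum(axis=1, keepdims=True)


def segment_vectors(frames, means, variances, n_segments, pooling="mean"):
    """Split an utterance into n_segments parts and pool posteriors per part."""
    if len(frames) < n_segments:
        raise ValueError("utterance has fewer frames than segments")
    pooled = []
    for segment in np.array_split(frames, n_segments, axis=0):
        post = gaussian_posteriors(segment, means, variances)
        pooled.append(post.max(axis=0) if pooling == "max" else post.mean(axis=0))
    return np.stack(pooled)                                    # (n_segments, Q)


def segment_level_kernel(x, y, means, variances, n_segments=4, pooling="mean"):
    """Sum of inner products between pooled vectors of corresponding segments."""
    vx = segment_vectors(x, means, variances, n_segments, pooling)
    vy = segment_vectors(y, means, variances, n_segments, pooling)
    return float(np.sum(vx * vy))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    means = rng.normal(size=(6, 3))       # 6 hypothetical basis functions, 3-dim features
    variances = np.ones((6, 3))
    utt_a = rng.normal(size=(50, 3))      # utterances of different lengths
    utt_b = rng.normal(size=(37, 3))
    print(segment_level_kernel(utt_a, utt_b, means, variances))
```

Because every utterance is mapped to the same number of segment vectors regardless of its length, the resulting value can be computed for any pair of utterances and, in principle, passed to an SVM as a precomputed kernel entry.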


Feature Vector; Speech Signal; Gaussian Mixture Model; Speaker Recognition; Speaker Identification



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Shikha Gupta (1)
  • Veena Thenkanidiyoor (2)
  • Dileep Aroor Dinesh (1), email author

  1. School of Computing and Electrical Engineering, Indian Institute of Technology Mandi, Mandi, India
  2. Department of Computer Science and Engineering, National Institute of Technology Goa, Ponda, India
