Example-Specific Density Based Matching Kernel for Classification of Varying Length Patterns of Speech Using Support Vector Machines

  • Abhijeet Sachdev
  • A. D. Dileep (Email author)
  • Veena Thenkanidiyoor
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9489)


In this paper, we propose an example-specific density based matching kernel (ESDMK) for the classification of varying length patterns of long duration speech represented as sets of feature vectors. The proposed kernel is computed between a pair of examples, each represented as a set of feature vectors, by matching the estimates of the example-specific densities computed at every feature vector in the two examples. In this work, the number of feature vectors of an example among the K nearest neighbors of a feature vector is taken as the estimate of that example's density at the feature vector. The minimum of the two example-specific density estimates, one for each example, at a feature vector is taken as the matching score. The ESDMK is then computed as the sum of the matching scores computed at every feature vector in the pair of examples. We study the performance of support vector machine (SVM) based classifiers using the proposed ESDMK for speech emotion recognition and speaker identification tasks, and compare it with that of SVM-based classifiers using state-of-the-art kernels for varying length patterns.
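The computation described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `esdmk`, the use of Euclidean distance, the choice to search for neighbors in the pooled set of both examples, and the choice to count a feature vector as one of its own neighbors are all assumptions that the abstract does not specify.

```python
import numpy as np


def esdmk(X, Y, k):
    """Sketch of an example-specific density based matching kernel.

    X, Y : arrays of shape (n, d) and (m, d), one feature vector per row.
    k    : number of nearest neighbors used for the density estimates.

    Assumptions (not stated in the abstract): neighbors are searched in the
    pooled set X ∪ Y, distances are Euclidean, and each vector counts as
    one of its own neighbors.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    pooled = np.vstack([X, Y])

    # Boolean mask: True for rows that came from X, False for rows from Y.
    from_x = np.arange(len(pooled)) < len(X)

    # K cannot exceed the number of pooled feature vectors.
    k = min(k, len(pooled))

    # Pairwise squared Euclidean distances between all pooled vectors.
    diff = pooled[:, None, :] - pooled[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)

    # Indices of the k nearest neighbors of every pooled vector.
    nn = np.argsort(dist, axis=1, kind="stable")[:, :k]

    # Example-specific density estimates: neighbor counts per example.
    count_x = from_x[nn].sum(axis=1)
    count_y = k - count_x

    # Matching score is the minimum of the two counts; the kernel is the
    # sum of the matching scores over every feature vector in the pair.
    return float(np.minimum(count_x, count_y).sum())


if __name__ == "__main__":
    close_a = [[0.0], [1.0]]
    close_b = [[0.1], [1.1]]
    far_b = [[10.0], [10.1]]
    print(esdmk(close_a, close_b, k=2))
    print(esdmk(close_a, far_b, k=2))
```

Under these assumptions, two examples whose feature vectors interleave in feature space obtain a higher kernel value than two examples whose vectors lie far apart. The dense pairwise distance array is only practical for small sets; a tree-based neighbor search would be the usual substitute for long utterances.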


Keywords: Feature Vector, Speech Signal, Speaker Recognition, Speaker Identification, Fisher Kernel



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Abhijeet Sachdev (1)
  • A. D. Dileep (1, Email author)
  • Veena Thenkanidiyoor (2)
  1. School of Computing and Electrical Engineering, Indian Institute of Technology Mandi, Mandi, India
  2. Department of Computer Science and Engineering, National Institute of Technology Goa, Ponda, India
