Dynamic Speech Trajectory Based Parameters for Low Resource Languages

  • Parabattina Bhagath
  • Megha Jain
  • Pradip K. Das
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1241)

Abstract

Speech recognition is the task of recognizing spoken words or utterances in order to interpret a voice message. The problem has been studied for more than five decades, and numerous techniques and frameworks are available to address it. Hidden Markov Modeling (HMM), a popular modeling technique, has been used in many tools for building speech-based systems. Despite its wide use, its shortcomings have introduced new challenges in designing feature modeling techniques. One solution is to use trajectory models, which efficiently capture the intra-segmental temporal dynamics that help describe the continuous nature of the speech signal. Although trajectories have proven effective, the complexity of trajectory modeling has yet to be reduced. This paper proposes two trajectory parameter extraction methods and shows them to be effective for speech classification. The detailed procedures and results are discussed.

Keywords

Fréchet distance, Peak attributes, Trajectory

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Department of CSE, Indian Institute of Technology Guwahati, Guwahati, India
