Research on Acceleration Method of Speech Recognition Training

  • Liang Bai
  • Jingfei Jiang
  • Yong Dou
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 908)


Recurrent neural networks (RNNs) are now widely used in speech recognition. Experiments show that they have significant advantages over traditional methods, but their computational complexity limits their application, especially in real-time scenarios. An RNN depends heavily on the preceding and following states during computation, and much of this information overlaps, so reducing the overlapping information can accelerate training. This paper constructs a training acceleration structure that reduces computation cost and speeds up training by discarding the RNN's dependence on the preceding and following states. Errors in the recognition results are then corrected with a text corrector. We verify the proposed method on the TIMIT and LibriSpeech datasets; the results show that this approach achieves about a 3× speedup with only a small relative reduction in accuracy.
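The abstract does not say how the text corrector works, so the following is only a minimal sketch of one common approach and not the authors' method. It assumes a fixed vocabulary and replaces each out-of-vocabulary word in a recognized transcript with the closest vocabulary word by Levenshtein edit distance. The names `edit_distance`, `correct_word`, `correct_transcript` and the `max_distance` threshold are hypothetical and chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_word(word: str, vocabulary, max_distance: int = 2) -> str:
    """Return the nearest vocabulary word, or the word itself if none is close.

    max_distance is an assumed threshold, not a value from the paper.
    """
    if not vocabulary or word in vocabulary:
        return word
    # Ties are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda v: (edit_distance(word, v), v))
    return best if edit_distance(word, best) <= max_distance else word


def correct_transcript(text: str, vocabulary, max_distance: int = 2) -> str:
    """Correct each whitespace-separated word of a recognized transcript."""
    return " ".join(correct_word(w, vocabulary, max_distance)
                    for w in text.split())


if __name__ == "__main__":
    vocab = {"speech", "recognition", "training", "is", "fast"}
    print(correct_transcript("speach recogniton is fast", vocab))
```

A corrector of this kind works only on the recognizer's text output, so it can be applied after a faster but slightly less accurate acoustic model without changing that model. A practical system would likely also weigh word frequency or language-model context, which this sketch leaves out.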


Speech recognition, Accelerating training, Text correction



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. National University of Defense Technology, Changsha, China
