Endangered Tujia Language Speech Enhancement Research Based on Improved DCGAN

  • Conference paper
Chinese Computational Linguistics (CCL 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11856)

Abstract

As an endangered language, Tujia relies solely on oral communication, and recordings collected for a Tujia corpus inevitably contain noise. This paper studies an end-to-end speech enhancement model based on an improved deep convolutional generative adversarial network (DCGAN) that extracts nearly clean Tujia speech from noisy recordings. Because Tujia is a low-resource language, a Chinese corpus is used to extend the Tujia data, which effectively addresses the problem of insufficient data. Speech enhancement is realized with an end-to-end method built on a symmetric encoder-decoder structure. Modifying the loss function and the network hierarchy parameters, and adding spectral normalization and imbalanced learning rates, make the model more stable during training. Experimental results show that the proposed method achieves better noise reduction on the Tujia dataset than a traditional speech enhancement algorithm and other neural network enhancement algorithms.
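
Spectral normalization, one of the stabilizing changes named in the abstract, rescales a layer's weight matrix by its largest singular value so that the layer is roughly 1-Lipschitz. The following is a minimal sketch of that idea in plain Python, not the paper's implementation: the helper names, the toy matrix, and the iteration count are all assumptions made for illustration. Published GAN implementations typically run a single power-iteration step per training update and reuse the vector `u` between updates; the sketch below simply iterates until the estimate has settled.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Multiply the transpose of W by vector u."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def l2_normalize(x, eps=1e-12):
    """Scale x to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x_i * x_i for x_i in x))
    return [x_i / (norm + eps) for x_i in x]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    v = l2_normalize(matvec_t(W, u))
    for _ in range(n_iters):
        v = l2_normalize(matvec_t(W, u))  # right singular vector estimate
        u = l2_normalize(matvec(W, v))    # left singular vector estimate
    # sigma is approximated by u^T W v
    return sum(u_i * wv_i for u_i, wv_i in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Return W divided by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


# Toy weight matrix (hypothetical values, chosen so the answer is easy to check).
W = [[3.0, 0.0], [0.0, 1.0]]
W_sn = spectrally_normalize(W)
```

For the diagonal toy matrix the largest singular value is 3, so the normalized matrix has a spectral norm of about 1. In a real discriminator the same rescaling would be applied to each layer's reshaped weight tensor.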

Acknowledgment

This research is supported by the Ministry of Education Humanities and Social Sciences Research Planning Fund Project (grant number 16YJAZH072) and the Major Projects of the National Social Science Fund (grant number 14ZDB156).

Author information

Corresponding author

Correspondence to Chongchong Yu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Yu, C., Kang, M., Chen, Y., Li, M., Dai, T. (2019). Endangered Tujia Language Speech Enhancement Research Based on Improved DCGAN. In: Sun, M., Huang, X., Ji, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics. CCL 2019. Lecture Notes in Computer Science, vol 11856. Springer, Cham. https://doi.org/10.1007/978-3-030-32381-3_32

  • DOI: https://doi.org/10.1007/978-3-030-32381-3_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32380-6

  • Online ISBN: 978-3-030-32381-3

  • eBook Packages: Computer Science, Computer Science (R0)
