Endangered Tujia Language Speech Enhancement Research Based on Improved DCGAN

Yu, Chongchong; Kang, Meng; Chen, Yunbing; Li, Mengxiong; Dai, Tong

doi:10.1007/978-3-030-32381-3_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11856)

Included in the following conference series:

China National Conference on Chinese Computational Linguistics

Abstract

As an endangered language, Tujia survives only through oral communication, and noise is unavoidable when collecting a Tujia speech corpus. This paper studies an end-to-end speech enhancement model based on an improved deep convolutional generative adversarial network (DCGAN) that extracts nearly clean Tujia speech from noisy recordings. Because Tujia is a low-resource language, a Chinese corpus is used to extend the Tujia data, which effectively addresses the shortage of training data. Speech enhancement is implemented end to end with a symmetric encoder-decoder structure. The loss function and the network's layer parameters are modified, and spectral normalization and an imbalanced learning rate are added, which makes training more stable. The experimental results show that the proposed method achieves better noise reduction on the Tujia dataset than a traditional speech enhancement algorithm and neural-network-based enhancement algorithms.
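
The abstract names spectral normalization as one of the training stabilizers but gives no implementation details. As a rough illustration only, and not the authors' code, the sketch below shows the general idea in plain Python: estimate a weight matrix's largest singular value by power iteration, then divide the weights by that estimate so the layer's spectral norm is about 1. The function names (`spectral_norm`, `spectrally_normalize`), the iteration count, and the toy matrix are all assumptions made for this example.

```python
import math
import random


def matvec(w, v):
    """Multiply matrix w (a list of rows) by vector v."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in w]


def transpose(w):
    """Return the transpose of matrix w as a list of rows."""
    return [list(col) for col in zip(*w)]


def l2_normalize(v, eps=1e-12):
    """Scale v to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(w, n_iters=50, seed=0):
    """Estimate the largest singular value of w by power iteration."""
    rng = random.Random(seed)
    # Random start vector with one entry per row of w.
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in w])
    w_t = transpose(w)
    for _ in range(n_iters):
        v = l2_normalize(matvec(w_t, u))  # approximate right singular vector
        u = l2_normalize(matvec(w, v))    # approximate left singular vector
    # sigma is approximately u^T W v
    return sum(u_i * wv_i for u_i, wv_i in zip(u, matvec(w, v)))


def spectrally_normalize(w, n_iters=50):
    """Return w divided by its estimated largest singular value."""
    sigma = spectral_norm(w, n_iters)
    return [[x / sigma for x in row] for row in w]


# Hypothetical toy weight matrix whose largest singular value is 3.
toy_weights = [[3.0, 0.0], [0.0, 1.0]]
print(spectral_norm(toy_weights))  # close to 3.0
print(spectral_norm(spectrally_normalize(toy_weights)))  # close to 1.0
```

In GAN training this rescaling is usually applied to the discriminator's layers at every update, often with a single power-iteration step per update instead of the many iterations used here for clarity. The "imbalanced learning rate" mentioned in the abstract would be a separate setting, such as different step sizes for the generator and discriminator optimizers; the abstract does not state the values used.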

Acknowledgment

This research is supported by the Ministry of Education Humanities and Social Sciences Research Planning Fund Project (grant number 16YJAZH072) and the Major Projects of the National Social Science Fund (grant number 14ZDB156).

Author information

Authors and Affiliations

College of Computer and Information Engineering, Beijing Technology and Business University, Beijing, 100048, China
Chongchong Yu, Meng Kang, Yunbing Chen, Mengxiong Li & Tong Dai

Corresponding author

Correspondence to Chongchong Yu.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Maosong Sun
Fudan University, Shanghai, China
Xuanjing Huang
University of Illinois at Urbana Champaign, Illinois, USA
Heng Ji
Tsinghua University, Beijing, China
Zhiyuan Liu
Tsinghua University, Beijing, China
Yang Liu

About this paper

Cite this paper

Yu, C., Kang, M., Chen, Y., Li, M., Dai, T. (2019). Endangered Tujia Language Speech Enhancement Research Based on Improved DCGAN. In: Sun, M., Huang, X., Ji, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics. CCL 2019. Lecture Notes in Computer Science, vol 11856. Springer, Cham. https://doi.org/10.1007/978-3-030-32381-3_32

DOI: https://doi.org/10.1007/978-3-030-32381-3_32
Published: 13 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32380-6
Online ISBN: 978-3-030-32381-3
eBook Packages: Computer Science, Computer Science (R0)
