Reducing Computational and Memory Cost for HMM-Based Embedded TTS System

Fu, Rong; Zhao, Zengliang; Tu, Qixiong

doi:10.1007/978-3-642-23214-5_78

Part of the book series: Communications in Computer and Information Science (CCIS, volume 224)

Included in the following conference series:

International Conference on Applied Informatics and Communication

Abstract

In this paper, we present several methods for reducing the computational and memory cost of an HMM-based TTS system so that it can run on embedded devices. First, we decrease the number of HMMs by applying decision-tree-based context clustering. Second, we propose an address-based model compression technique that reduces the model size without degrading the quality of the synthesized speech. Third, we reduce the feature vector size to lower the computational and memory requirements. Finally, we adopt a fixed-point implementation to fit the TTS system to the resources available on embedded devices. Experimental results show that the system size can be compressed from 293 MB to 3.61 MB (roughly an 81-fold reduction) and that the memory and computational costs are low enough for real-time embedded applications. Subjective evaluation shows that the quality of the synthesized speech is fairly good.
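
The abstract names a fixed-point implementation but gives no details of it. As a rough illustration of the general idea only, the sketch below quantizes floating-point values to 16-bit integers and multiplies them using integer arithmetic. The Q15 format, the saturation behaviour, and all function names (to_fixed, fixed_mul, fixed_dot) are assumptions made for this example; they are not taken from the paper.

```python
# Illustrative Q15 fixed-point arithmetic (assumed format, not the paper's).
Q = 15                      # number of fractional bits
SCALE = 1 << Q              # 32768
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(v: int) -> int:
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, v))


def to_fixed(x: float) -> int:
    """Quantize a float in roughly [-1, 1) to a saturated Q15 integer."""
    return saturate(int(round(x * SCALE)))


def to_float(q: int) -> float:
    """Convert a Q15 integer back to a float."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q15 values: the raw product is Q30, so add half an
    LSB for rounding and shift right by Q to return to Q15."""
    return saturate((a * b + (1 << (Q - 1))) >> Q)


def fixed_dot(xs, ys) -> int:
    """Dot product that accumulates in a wide integer (Q30) and
    rescales once at the end, which limits rounding error."""
    acc = sum(a * b for a, b in zip(xs, ys))
    return saturate((acc + (1 << (Q - 1))) >> Q)


if __name__ == "__main__":
    half = to_fixed(0.5)
    print(half, to_float(fixed_mul(half, half)))
```

Storing each parameter as a 16-bit integer instead of a 32-bit float halves its storage and avoids floating-point operations, which some embedded processors lack in hardware. The trade-off is quantization error and the need to handle overflow explicitly, as the saturation step does here.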

Author information

Authors and Affiliations

Beijing Institute of Applied Meteorology, Beijing, 100029, China
Rong Fu & Zengliang Zhao
Department of Electronic and Engineering, Tsinghua University, Beijing, 100084, China
Qixiong Tu

Editor information

Editors and Affiliations

Shenzhen University, Nanhai Ave. 3688, 518060, Shenzhen, Guangdong, China
Dehuai Zeng

About this paper

Cite this paper

Fu, R., Zhao, Z., Tu, Q. (2011). Reducing Computational and Memory Cost for HMM-Based Embedded TTS System. In: Zeng, D. (eds) Applied Informatics and Communication. ICAIC 2011. Communications in Computer and Information Science, vol 224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23214-5_78

DOI: https://doi.org/10.1007/978-3-642-23214-5_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23213-8
Online ISBN: 978-3-642-23214-5
eBook Packages: Computer Science, Computer Science (R0)
