Abstract
Very low bit-rate (VLBR) speech coding offers an opportunity to test methods for the automatic generation of sub-word units. This paper describes two approaches to VLBR coding: the first is based on ALISP (Automatic Language Independent Speech Processing) techniques, the second on syllable segments. Experimental results are reported on a database of a single professional Czech speaker. The bit rates obtained for unit encoding were approximately 135 bps for the ALISP approach and 62 bps for the syllable-based approach. Quality was evaluated by measuring the logarithmic spectral distortion (computed on LPC spectra) and by informal listening tests. The ways in which each technique could benefit the other are discussed.
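The abstract does not state the exact formula used for the logarithmic spectral distortion. As a rough illustration only, the sketch below uses one common definition: the root-mean-square difference, in dB, between the log-magnitude spectra of two all-pole (LPC) filters sampled on a uniform frequency grid. The function names, the unit-gain assumption, the number of frequency points, and the example coefficients are all assumptions made for this illustration and are not taken from the paper.

```python
import cmath
import math


def lpc_log_spectrum_db(a, n_points=256):
    """Log-magnitude spectrum (dB) of 1/A(z), sampled from 0 to pi.

    `a` holds the LPC polynomial coefficients [1, a1, ..., ap];
    the filter gain is assumed to be 1 (an assumption of this sketch).
    """
    spectrum = []
    for i in range(n_points):
        w = math.pi * i / (n_points - 1)
        # Evaluate A(e^{jw}) = sum_k a[k] * e^{-jwk}
        A = sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
        spectrum.append(-20.0 * math.log10(abs(A)))
    return spectrum


def log_spectral_distortion_db(a_ref, a_test, n_points=256):
    """RMS difference (dB) between two LPC log spectra for one frame."""
    s_ref = lpc_log_spectrum_db(a_ref, n_points)
    s_test = lpc_log_spectrum_db(a_test, n_points)
    mean_sq = sum((r - t) ** 2 for r, t in zip(s_ref, s_test)) / n_points
    return math.sqrt(mean_sq)


# Hypothetical first-order LPC filters for an original and a decoded frame.
original_frame = [1.0, -0.9]
decoded_frame = [1.0, -0.8]
print(log_spectral_distortion_db(original_frame, decoded_frame))
```

In an evaluation of this kind, the per-frame value would typically be averaged over all frames of the test data; whether the paper averages in this way is not stated in the abstract.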
This research was partially supported by the Ministry of Education, Youth and Sports of the Czech Republic, project Nos. VS97060 and VS97028, and by the Grant Agency of the Czech Republic under Grant 201/99/1248.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Černocký, J., Kopeček, I., Baudoin, G., Chollet, G. (1999). Very Low Bit Rate Speech Coding: Comparison of Data-Driven Units with Syllable Segments. In: Matousek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds) Text, Speech and Dialogue. TSD 1999. Lecture Notes in Computer Science, vol 1692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48239-3_48
DOI: https://doi.org/10.1007/3-540-48239-3_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66494-9
Online ISBN: 978-3-540-48239-0
eBook Packages: Springer Book Archive