Very Low Bit Rate Speech Coding: Comparison of Data-Driven Units with Syllable Segments

  • Conference paper
Text, Speech and Dialogue (TSD 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1692)

Abstract

Very low bit-rate (VLBR) speech coding offers an opportunity to test methods for the automatic generation of sub-word units. This paper describes two approaches to VLBR coding: the first is based on ALISP (Automatic Language Independent Speech Processing) techniques, the second on syllable segments. Experimental results are reported on a database of a single professional Czech speaker. The rates obtained for unit encoding were approximately 135 bps for the former approach and 62 bps for the latter. Quality was evaluated by measuring the logarithmic spectral distortion (computed on LPC spectra) and in informal listening tests. Ways in which each technique could benefit the other are discussed.
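The abstract names logarithmic spectral distortion on LPC spectra as the objective quality measure but does not give its exact definition. The following is a minimal sketch of one common form of that measure (root-mean-square difference of two LPC log power spectra, in dB), not necessarily the variant used in the paper. The function names, the number of frequency points, and the example coefficients are all assumptions made for illustration.

```python
import cmath
import math


def lpc_log_spectrum_db(a, gain=1.0, n_points=256):
    """Log power spectrum (dB) of the all-pole filter gain / A(z).

    `a` holds the polynomial coefficients [1, a1, ..., ap], so that
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p (sign convention is an assumption).
    """
    spectrum = []
    for k in range(n_points):
        omega = math.pi * k / n_points  # uniform grid on [0, pi)
        denom = sum(c * cmath.exp(-1j * omega * i) for i, c in enumerate(a))
        power = (gain ** 2) / (abs(denom) ** 2)
        spectrum.append(10.0 * math.log10(power))
    return spectrum


def log_spectral_distortion_db(a_ref, a_test, gain_ref=1.0, gain_test=1.0,
                               n_points=256):
    """RMS difference (dB) between two LPC log power spectra."""
    s_ref = lpc_log_spectrum_db(a_ref, gain_ref, n_points)
    s_test = lpc_log_spectrum_db(a_test, gain_test, n_points)
    mean_sq = sum((x - y) ** 2 for x, y in zip(s_ref, s_test)) / n_points
    return math.sqrt(mean_sq)


# Hypothetical first-order filters standing in for an original frame
# and its decoded counterpart.
original = [1.0, -0.9]
decoded = [1.0, -0.8]
distortion = log_spectral_distortion_db(original, decoded)
```

In a coder evaluation this value would typically be computed frame by frame and averaged over the test material. Identical filters give zero distortion, and a pure gain mismatch shows up as a constant offset across all frequencies.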

The research has been partially supported by the Ministry of Education, Youth and Sports of the Czech Republic, project Nos. VS97060 and VS97028, and by the Grant Agency of the Czech Republic under Grant 201/99/1248.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Černocký, J., Kopeček, I., Baudoin, G., Chollet, G. (1999). Very Low Bit Rate Speech Coding: Comparison of Data-Driven Units with Syllable Segments. In: Matousek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds) Text, Speech and Dialogue. TSD 1999. Lecture Notes in Computer Science, vol 1692. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48239-3_48

  • DOI: https://doi.org/10.1007/3-540-48239-3_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66494-9

  • Online ISBN: 978-3-540-48239-0

  • eBook Packages: Springer Book Archive
