The Algorithms of Automation of the Process of Creating Acoustic Units Databases in the Polish Speech Synthesis

  • Conference paper
Novel Developments in Uncertainty Representation and Processing

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 401)

Abstract

This paper presents a new approach to creating the database of acoustic units used in concatenative TTS synthesis. Such databases are currently created manually, which is very time-consuming and takes at least several months of work. Creating such a database automatically shortens this time to hours. Another problem in concatenative synthesis is reproducing arbitrary text in the voice and speaking manner of a particular person. The presented algorithms make it possible to create an allophone unit database for a particular person from a sample of that person's voice and, as a result, to obtain a synthesizer that speaks with exactly this voice.

Author information

Corresponding author

Correspondence to Janusz Rafałko.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Rafałko, J. (2016). The Algorithms of Automation of the Process of Creating Acoustic Units Databases in the Polish Speech Synthesis. In: Atanassov, K., et al. Novel Developments in Uncertainty Representation and Processing. Advances in Intelligent Systems and Computing, vol 401. Springer, Cham. https://doi.org/10.1007/978-3-319-26211-6_32

  • DOI: https://doi.org/10.1007/978-3-319-26211-6_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26210-9

  • Online ISBN: 978-3-319-26211-6

  • eBook Packages: Computer Science, Computer Science (R0)
