Automatic Classification of Types of Artefacts Arising During the Unit Selection Speech Synthesis

  • Jiří Přibil
  • Anna Přibilová
  • Jindřich Matoušek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10415)

Abstract

The paper describes an experiment in automatic classification of the basic types of artefacts in synthetic speech produced by a Czech text-to-speech system using the unit selection synthesis method. The classifier, based on Gaussian mixture models (GMM), is ultimately formulated as an open-set classification task because only a limited database of speech artefacts, resulting from incorrectly chosen or exchanged speech units during synthesis, is available. The experiments show that the accuracy with which the artefact section is determined has a principal impact on the final precision of the artefact type classification. Auxiliary investigations indicate that the number of mixtures and the type of covariance matrix have a relatively large influence on the artefact classification error rate as well as on the computational complexity.
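
As an illustration of the open-set idea mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes one diagonal-covariance GMM per artefact type and rejects an input as "unknown" when even the best-matching model scores below a threshold. The class labels, the 2-D feature vectors, the model parameters, and the threshold value are all hypothetical and chosen only to make the example run.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log density of a diagonal Gaussian: sum of per-dimension terms.
        log_pdf = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mu, var)
        )
        log_terms.append(math.log(w) + log_pdf)
    # Log-sum-exp over the mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

def classify_open_set(frames, models, threshold):
    """Pick the class with the highest mean log-likelihood over all frames.

    Returns "unknown" when the best score is below the threshold, which is
    what makes the decision open-set rather than closed-set.
    """
    scores = {}
    for label, (weights, means, variances) in models.items():
        total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
        scores[label] = total / len(frames)
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else "unknown"), scores

# Hypothetical models: label -> (weights, means, variances), two mixtures each.
MODELS = {
    "artefact_A": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "artefact_B": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
THRESHOLD = -6.0  # hypothetical rejection threshold on the mean log-likelihood

if __name__ == "__main__":
    for frames in ([[0.5, 0.5], [0.4, 0.6]], [[5.5, 5.5]], [[20.0, 20.0]]):
        label, _ = classify_open_set(frames, MODELS, THRESHOLD)
        print(frames, "->", label)
```

In this sketch the number of mixtures is the length of each weights list and the covariance type is fixed to diagonal. A full covariance matrix would require a matrix inverse and determinant per component, which is one reason the covariance type affects computational cost.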

Keywords

Quality of synthetic speech; Text-to-speech system; GMM classification; Statistical analysis

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Jiří Přibil (1, 2)
  • Anna Přibilová (3)
  • Jindřich Matoušek (1)
  1. Department of Cybernetics, Faculty of Applied Sciences, University of West Bohemia, Plzeň, Czech Republic
  2. Institute of Measurement Science, Slovak Academy of Sciences, Bratislava, Slovakia
  3. Faculty of Electrical Engineering and Information Technology, Institute of Electronics and Photonics, Slovak University of Technology in Bratislava, Bratislava, Slovak Republic
