Speech Recognition Based on Open Source Speech Processing Software

Kłosowski, Piotr; Dustor, Adam; Izydorczyk, Jacek; Kotas, Jan; Ślimok, Jacek

doi:10.1007/978-3-319-07941-7_31

Part of the book series: Communications in Computer and Information Science (CCIS, volume 431)

Included in the following conference series:

International Conference on Computer Networks

Abstract

Creating a speech recognition application requires advanced speech processing techniques implemented in specialized speech processing software. Speech recognition research can be improved by using frameworks based on open source speech processing software. This article presents the possibility of using open source speech processing software to build one's own speech recognition application.

Author information

Authors and Affiliations

Institute of Electronics, Silesian University of Technology, Akademicka Str. 16, 44-100, Gliwice, Poland
Piotr Kłosowski, Adam Dustor, Jacek Izydorczyk, Jan Kotas & Jacek Ślimok

Editor information

Editors and Affiliations

Institute of Computer Science, Silesian University of Technology, Akademicka 16, 44-100, Gliwice, Poland
Andrzej Kwiecień
Institute of Informatics, Silesian University of Technology, ul. Akademicka 16, 44-100, Gliwice, Poland
Piotr Gaj
Institute of Informatics, Silesian University of Technology, Gliwice, Poland
Piotr Stera

About this paper

Cite this paper

Kłosowski, P., Dustor, A., Izydorczyk, J., Kotas, J., Ślimok, J. (2014). Speech Recognition Based on Open Source Speech Processing Software. In: Kwiecień, A., Gaj, P., Stera, P. (eds) Computer Networks. CN 2014. Communications in Computer and Information Science, vol 431. Springer, Cham. https://doi.org/10.1007/978-3-319-07941-7_31

DOI: https://doi.org/10.1007/978-3-319-07941-7_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07940-0
Online ISBN: 978-3-319-07941-7
eBook Packages: Computer Science, Computer Science (R0)
