Abstract
In this article we present a system for clustering and indexing automatically recognised radio and television news spoken in Polish. The system aims to let users quickly navigate and search information that standard internet search engines do not make available. It comprises speech recognition, alignment and indexing modules. The recognition part is trained on dozens of hours of transcribed audio and millions of words representing modern Polish. The training audio and text are converted into acoustic and language models using techniques such as Hidden Markov Models and statistical language processing. The audio is then decoded and submitted to an indexing engine, which extracts summary information about the spoken topic. The system shows significant potential in areas such as media monitoring, indexing of university lectures, automated telephone centres and security enhancements.
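To illustrate the indexing step described above, the following is a minimal sketch of an inverted index over time-aligned recognition output. It is not the authors' implementation: the `AlignedWord` type, the `build_index` and `search` functions, and the sample recordings are all hypothetical, and the sketch assumes the recogniser and aligner have already produced words with start and end times.

```python
from collections import defaultdict
from typing import NamedTuple


class AlignedWord(NamedTuple):
    """One recognised word with its time span (hypothetical structure)."""
    word: str
    start: float  # seconds from the beginning of the recording
    end: float


def build_index(recordings):
    """Map each lower-cased word to the (recording id, start time) pairs where it occurs."""
    index = defaultdict(list)
    for rec_id, words in recordings.items():
        for w in words:
            index[w.word.lower()].append((rec_id, w.start))
    return index


def search(index, query):
    """Return sorted (recording id, start time) hits for a single-word query."""
    return sorted(index.get(query.lower(), []))


# Invented sample data standing in for decoded and aligned news audio.
recordings = {
    "news_01": [
        AlignedWord("sejm", 0.0, 0.4),
        AlignedWord("uchwalił", 0.4, 1.0),
        AlignedWord("budżet", 1.0, 1.5),
    ],
    "news_02": [
        AlignedWord("budżet", 3.2, 3.7),
        AlignedWord("państwa", 3.7, 4.3),
    ],
}

index = build_index(recordings)
hits = search(index, "Budżet")
print(hits)
```

Each hit pairs a recording with the time at which the query word was spoken, which is what would allow a user to jump directly to the relevant point in the audio. A real system would likely add stemming for Polish inflection and ranking across multi-word queries, neither of which is attempted here.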
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pawlaczyk, L., Bosky, P. (2009). Skrybot – A System for Automatic Speech Recognition of Polish Language. In: Cyran, K.A., Kozielski, S., Peters, J.F., Stańczyk, U., Wakulicz-Deja, A. (eds) Man-Machine Interactions. Advances in Intelligent and Soft Computing, vol 59. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00563-3_40
DOI: https://doi.org/10.1007/978-3-642-00563-3_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00562-6
Online ISBN: 978-3-642-00563-3
eBook Packages: Engineering, Engineering (R0)