Large Vocabulary Speech Recognition of Slovenian Language Using Data-Driven Morphological Models

Rotovnik, Tomaž; Maučec, Mirjam Sepesy; Horvat, Bogomir; Kačič, Zdravko

doi:10.1007/3-540-46154-X_46

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2448)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

A system for large vocabulary continuous speech recognition of the Slovenian language is described. Two types of modelling units are examined: words and subwords. A data-driven algorithm is used to obtain word decompositions automatically. The performance of one-pass and two-pass decoding strategies is compared. The new models gave promising results: recognition accuracy improved by 3.41% absolute at approximately the same recognition time. Alternatively, a 30% increase in real-time performance was achieved at the same recognition error.
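
The abstract does not say how the data-driven decomposition works, so the following is only an illustrative sketch of one simple way such a split could be learned from a word list. It is not the authors' algorithm. The function names `count_parts` and `decompose`, the `max_ending_len` parameter, the scoring rule, and the toy vocabulary are all assumptions made for this example.

```python
from collections import Counter


def count_parts(vocab, max_ending_len=3):
    """Count candidate stems and endings over a vocabulary (hypothetical helper)."""
    stems, endings = Counter(), Counter()
    for word in vocab:
        # Try every split that leaves a non-empty stem and a short, non-empty ending.
        for i in range(max(1, len(word) - max_ending_len), len(word)):
            stems[word[:i]] += 1
            endings[word[i:]] += 1
    return stems, endings


def decompose(word, stems, endings, max_ending_len=3):
    """Split a word into (stem, ending); return (word, "") if no split is supported."""
    best, best_score = (word, ""), 0
    for i in range(max(1, len(word) - max_ending_len), len(word)):
        stem, ending = word[:i], word[i:]
        # Assumed rule: both parts must recur in at least two vocabulary words.
        if stems[stem] < 2 or endings[ending] < 2:
            continue
        score = stems[stem] * endings[ending]
        if score > best_score:
            best, best_score = (stem, ending), score
    return best


# Toy vocabulary: inflected forms of two Slovenian nouns plus a short function word.
vocab = ["hiša", "hiše", "hiši", "hišo", "miza", "mize", "mizi", "mizo", "in"]
stems, endings = count_parts(vocab)
for w in ["hiša", "mizo", "in"]:
    print(w, "->", decompose(w, stems, endings))
```

In this toy setting, inflected forms that share a stem and an ending with other words are split, while a word with no recurring parts stays whole. A real system would learn the split from a large text corpus and would likely use a more careful criterion.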

Author information

Authors and Affiliations

Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, 2000, Maribor, Slovenia
Tomaž Rotovnik, Mirjam Sepesy Maučec, Bogomir Horvat & Zdravko Kačič

Editor information

Editors and Affiliations

Faculty of Informatics Department of Programming Systems and Communication, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Petr Sojka
Faculty of Informatics Department of Information Technologies, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Ivan Kopeček & Karel Pala

About this paper

Cite this paper

Rotovnik, T., Maučec, M.S., Horvat, B., Kačič, Z. (2002). Large Vocabulary Speech Recognition of Slovenian Language Using Data-Driven Morphological Models. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2002. Lecture Notes in Computer Science, vol 2448. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46154-X_46

DOI: https://doi.org/10.1007/3-540-46154-X_46
Published: 23 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44129-8
Online ISBN: 978-3-540-46154-8
eBook Packages: Springer Book Archive