The Translectures-UPV Toolkit

del-Agua, M. A.; Giménez, A.; Serrano, N.; Andrés-Ferrer, J.; Civera, J.; Sanchis, A.; Juan, A.

doi:10.1007/978-3-319-13623-3_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8854)

Abstract

Over the past few years, online multimedia educational repositories have increased in number and popularity. The main aim of the transLectures project is to develop cost-effective solutions for producing accurate transcriptions and translations for large video lecture repositories, such as VideoLectures.NET or the Universitat Politècnica de València’s repository, poliMedia. In this paper, we present the transLectures-UPV toolkit (TLK), which has been specifically designed to meet the requirements of the transLectures project, but can also be used as a conventional ASR toolkit. The main features of the current release include HMM training and decoding with speaker adaptation techniques (fCMLLR). TLK has been tested on the VideoLectures.NET and poliMedia repositories, yielding very competitive results. TLK has been released under the permissive open-source Apache License v2.0 and can be downloaded directly from the transLectures website.

Author information

Authors and Affiliations

MLLP, DSIC, Universitat Politècnica de València (UPV), Camí de Vera s/n, 46022, València, Spain
M. A. del-Agua, A. Giménez, N. Serrano, J. Andrés-Ferrer, J. Civera, A. Sanchis & A. Juan

Editor information

Editors and Affiliations

ETSIT, Las Palmas de Gran Canaria, Spain
Juan Luis Navarro Mesa, Eduardo Hernández Pérez, Pedro Quintana Morales, Antonio Ravelo García & Iván Guerra Moreno
University of Zaragoza, Spain
Alfonso Ortega
Dep. of Electronics, Telecommunications and Informatics Engineering, University of Aveiro, Portugal
António Teixeira
ATVS Biometric Recognition Group, Universidad Autónoma de Madrid, Spain
Doroteo T. Toledano

Cite this paper

del-Agua, M.A. et al. (2014). The Translectures-UPV Toolkit. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science, vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_28

DOI: https://doi.org/10.1007/978-3-319-13623-3_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13622-6
Online ISBN: 978-3-319-13623-3
eBook Packages: Computer Science, Computer Science (R0)