A Proof-of-Concept for Orthographic Named Entity Correction in Spanish Voice Queries

Schneider, Julián Moreno; Fernández, José Luis Martínez; Martínez, Paloma

doi:10.1007/978-3-319-12093-5_10

Julián Moreno Schneider,
José Luis Martínez Fernández &
Paloma Martínez

Conference paper
First Online: 29 October 2014

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8382)

Abstract

Automatic speech recognition (ASR) systems cannot recognize entities that are absent from their vocabulary. This paper addresses the misrecognition of named entities in Spanish voice queries and introduces a proof-of-concept for named entity correction: for each misrecognized entity, it proposes alternatives by retrieving phonetically similar entities from a dictionary. The system is domain-dependent, targeting sports news and football news in particular, but it does not depend on the ASR system used. The correction process exploits the query structure and its semantic information to detect where a named entity appears. The system then selects the most suitable alternative from a dictionary previously generated from the existing named entities.
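
As a rough illustration of retrieving phonetically similar entities from a dictionary, the following Python sketch maps each string to a simplified phonetic key and ranks dictionary entries by edit distance to the key of the recognized text. The rewrite rules in `phonetic_key`, the sample entries in `ENTITY_DICTIONARY`, and the choice of Levenshtein distance as the similarity measure are all assumptions made for this example; the abstract does not specify the phonetic transcription or the distance metric actually used.

```python
import re

# Hypothetical dictionary of known entities (illustrative entries only).
ENTITY_DICTIONARY = ["Iker Casillas", "Sergio Ramos", "David Villa", "Fernando Torres"]

# Accented vowels are mapped to their plain form; "ñ" is left untouched.
_ACCENTS = str.maketrans("áéíóú", "aeiou")

# Simplified Spanish spelling-to-sound rules (an assumption, not the paper's rules).
# Order matters: uppercase letters are temporary placeholders, safe because the
# input is lowercased first.
_RULES = [
    (r"qu(?=[ei])", "k"),  # "que", "qui" -> /k/
    (r"gu(?=[ei])", "G"),  # "gue", "gui" -> hard g, protected from the next rule
    (r"g(?=[ei])", "j"),   # "ge", "gi" -> /x/, written here as "j"
    (r"c(?=[ei])", "s"),   # "ce", "ci" -> /s/ (seseo simplification)
    (r"ch", "C"),          # keep "ch" as a single symbol
    (r"c", "k"),           # remaining "c" -> /k/
    (r"ll", "y"),          # yeismo simplification
    (r"v", "b"),           # "v" and "b" sound alike
    (r"z", "s"),           # seseo simplification
    (r"h", ""),            # silent "h"
    (r"G", "g"),           # restore the hard g
    (r"ü", "u"),           # "güe", "güi" keep the /u/ sound
]


def phonetic_key(text: str) -> str:
    """Return a crude phonetic normalization of a Spanish string."""
    key = text.lower().translate(_ACCENTS)
    for pattern, replacement in _RULES:
        key = re.sub(pattern, replacement, key)
    return key


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def suggest_entities(recognized: str, dictionary: list[str], max_results: int = 3) -> list[str]:
    """Rank dictionary entities by phonetic-key edit distance to the recognized text."""
    key = phonetic_key(recognized)
    ranked = sorted(dictionary, key=lambda entity: levenshtein(key, phonetic_key(entity)))
    return ranked[:max_results]


if __name__ == "__main__":
    # A plausible misrecognition of "Iker Casillas" with the same pronunciation.
    print(suggest_entities("iquer casiyas", ENTITY_DICTIONARY, max_results=1))
    # -> ['Iker Casillas']
```

In this sketch, "iquer casiyas" and "Iker Casillas" collapse to the same key, so the dictionary entry is ranked first with distance zero. A real system would also need the entity detection step described above, which this example does not attempt.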

Acknowledgments

This work has been partially supported by the Regional Government of Madrid under the Research Network MA2VICMR (S2009/TIC-1542) and by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade) through the BUSCAMEDIA Project (CEN-20091026).

Author information

Authors and Affiliations

Computer Science Department, Universidad Carlos III de Madrid, Avda. Universidad 30, 28911, Leganés, Madrid, Spain
Julián Moreno Schneider & Paloma Martínez
DAEDALUS – Data, Decisions and Language S.A., Avda. de La Albufera 321, 28031, Madrid, Spain
José Luis Martínez Fernández

Corresponding author

Correspondence to Julián Moreno Schneider.

Editor information

Editors and Affiliations

Otto-von-Guericke-Universität Magdeburg, Magdeburg, Germany
Andreas Nürnberger
Otto-von-Guericke-Universität Magdeburg, Magdeburg, Germany
Sebastian Stober
Royal School of Library and Information Science, Copenhagen, Denmark
Birger Larsen
Université Pierre et Marie Curie, Paris, France
Marcin Detyniecki

About this paper

Cite this paper

Schneider, J.M., Fernández, J.L.M., Martínez, P. (2014). A Proof-of-Concept for Orthographic Named Entity Correction in Spanish Voice Queries. In: Nürnberger, A., Stober, S., Larsen, B., Detyniecki, M. (eds) Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation. AMR 2012. Lecture Notes in Computer Science, vol 8382. Springer, Cham. https://doi.org/10.1007/978-3-319-12093-5_10

DOI: https://doi.org/10.1007/978-3-319-12093-5_10
Published: 29 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12092-8
Online ISBN: 978-3-319-12093-5
eBook Packages: Computer Science, Computer Science (R0)