A Proof-of-Concept for Orthographic Named Entity Correction in Spanish Voice Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8382)

Abstract

Automatic speech recognition (ASR) systems cannot recognize entities that are absent from their vocabulary. This paper addresses the misrecognition of named entities in Spanish voice queries. It introduces a proof-of-concept for named entity correction that offers alternatives to incorrectly recognized entities by retrieving phonetically similar entities from a dictionary. The system is domain-dependent, targeting sports news and especially football news, but it works regardless of the ASR system used. The correction process exploits the query structure and its semantic information to detect where a named entity appears. The system then selects the most suitable alternative entity from a dictionary previously generated from the existing named entities.
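
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rewrite rules in `RULES`, the helper names `phonetic_key`, `levenshtein` and `suggest`, and the sample dictionary are all assumptions made for illustration. The sketch maps each entity to a crude Spanish phonetic key and ranks dictionary entries by edit distance to the key of the recognized text.

```python
import unicodedata

# Hypothetical, deliberately crude grapheme-to-sound rewrites for Spanish.
# Order matters: digraphs are handled before single letters.
RULES = [
    ("ch", "C"),                 # keep the affricate as a single symbol
    ("ll", "y"),                 # yeismo: "ll" and "y" sound alike
    ("qu", "k"),
    ("ce", "se"), ("ci", "si"),  # seseo: soft "c" treated as "s"
    ("ge", "je"), ("gi", "ji"),  # soft "g" sounds like "j"
    ("c", "k"),                  # remaining "c" is hard
    ("z", "s"),
    ("v", "b"),                  # "b" and "v" are pronounced alike
    ("h", ""),                   # "h" is silent
]


def phonetic_key(text: str) -> str:
    """Return an approximate phonetic key (illustrative assumption only)."""
    text = unicodedata.normalize("NFC", text.lower()).replace("ñ", "N")
    # Strip accents: decompose, then drop combining marks.
    text = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    for source, target in RULES:
        text = text.replace(source, target)
    return text


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def suggest(recognized: str, dictionary: list[str], max_results: int = 3) -> list[str]:
    """Rank dictionary entities by phonetic edit distance to the recognized text."""
    key = phonetic_key(recognized)
    scored = sorted((levenshtein(key, phonetic_key(entity)), entity)
                    for entity in dictionary)
    return [entity for _, entity in scored[:max_results]]


# Made-up football entity dictionary and a misrecognized query fragment.
ENTITIES = ["Iker Casillas", "Xavi Hernández", "Sergio Ramos", "Andrés Iniesta"]
print(suggest("iquer casiyas", ENTITIES, max_results=1))
```

Comparing phonetic keys rather than spellings lets orthographically different strings that sound alike match exactly, which fits an ASR setting where the recognizer outputs a plausible-sounding but wrongly spelled name. A real system would need a fuller phoneme mapping and a way to locate the entity span in the query, which this sketch leaves out.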

Acknowledgments

This work has been partially supported by the Regional Government of Madrid under the Research Network MA2VICMR (S2009/TIC-1542) and by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade) through the BUSCAMEDIA Project (CEN-20091026).

Author information

Corresponding author

Correspondence to Julián Moreno Schneider.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Schneider, J.M., Fernández, J.L.M., Martínez, P. (2014). A Proof-of-Concept for Orthographic Named Entity Correction in Spanish Voice Queries. In: Nürnberger, A., Stober, S., Larsen, B., Detyniecki, M. (eds) Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation. AMR 2012. Lecture Notes in Computer Science, vol 8382. Springer, Cham. https://doi.org/10.1007/978-3-319-12093-5_10

  • DOI: https://doi.org/10.1007/978-3-319-12093-5_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12092-8

  • Online ISBN: 978-3-319-12093-5

  • eBook Packages: Computer Science, Computer Science (R0)
