A New Smoothing Method for Lexicon-Based Handwritten Text Keyword Spotting

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9117)

Abstract

Lexicon-based handwritten text keyword spotting (KWS) has proven to be a very fast and accurate alternative to lexicon-free methods. Nevertheless, since lexicon-based KWS methods rely on a predefined vocabulary that is fixed in the training phase, they perform poorly for any query keyword not included in it (i.e., out-of-vocabulary (OOV) keywords). This renders the KWS system useless for that type of query. In this paper, we present a new way of smoothing the scores of OOV keywords, and we compare it with previously published alternatives on different data sets.

Acknowledgments

This work was partially supported by the Spanish MEC under FPU grant FPU13/06281 and under the STraDA research project (TIN2012-37475-C02-01), by the Generalitat Valenciana under the grant Prometeo/2009/014, and through the EU 7th Framework Programme grant tranScriptorium (Ref: 600707).

Author information

Corresponding author

Correspondence to Joan Puigcerver.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Puigcerver, J., Toselli, A.H., Vidal, E. (2015). A New Smoothing Method for Lexicon-Based Handwritten Text Keyword Spotting. In: Paredes, R., Cardoso, J., Pardo, X. (eds) Pattern Recognition and Image Analysis. IbPRIA 2015. Lecture Notes in Computer Science, vol 9117. Springer, Cham. https://doi.org/10.1007/978-3-319-19390-8_3

  • DOI: https://doi.org/10.1007/978-3-319-19390-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19389-2

  • Online ISBN: 978-3-319-19390-8

  • eBook Packages: Computer Science (R0)
