Firefly Based Word Spotting Technique for Searching Keywords from Cursive Document Images

  • Conference paper

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 38)

Abstract

With the rapid development of digitization technologies, document images have become increasingly common in the information management systems of libraries, organizations, and educational institutions. Searching for information in a document image is much harder than searching digital text. Optical Character Recognition (OCR) is typically employed to detect characters and convert the images into text. However, OCR systems do not reliably handle varied fonts, styles, sizes, symbols, dark backgrounds, or poor-quality document images, so OCR is not an efficient method for this task. For this reason, a search strategy is needed that finds user-specified keywords directly in document images. Word spotting is an alternative method in which the keyword is identified without converting the document images to text. The primary objective of this research is to search for keywords in printed cursive English document images using word spotting techniques. A Firefly-based word spotting technique is proposed to search for keywords based on a query given by the user. To assess its efficiency, the Firefly technique is compared with the existing Enhanced Dynamic Time Warping (EDTW) technique. In the experimental analysis, the proposed Firefly-based word spotting technique achieved a higher accuracy rate and a lower execution time than the existing EDTW technique.
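The abstract names a dynamic time warping baseline but gives no algorithmic detail. As a rough illustration of how word spotting by sequence matching can work, the sketch below implements classic dynamic time warping (DTW), not the EDTW variant and not the Firefly technique proposed in the paper. The function names `dtw_distance` and `spot_keyword`, the use of 1-D per-column profile features, and the distance threshold are all assumptions made for this example.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Assumption: each word image has already been reduced to a 1-D
    feature sequence (for example, one value per pixel column).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between two samples
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def spot_keyword(query_profile, word_profiles, threshold):
    """Return indices of candidate words within `threshold` of the query.

    Hypothetical helper: the threshold is illustrative, not a value
    taken from the paper.
    """
    return [
        i
        for i, profile in enumerate(word_profiles)
        if dtw_distance(query_profile, profile) <= threshold
    ]
```

Calling `spot_keyword(query, candidates, threshold)` with the query word's profile and the profiles of the segmented words returns the positions of likely matches. Because DTW allows stretching along the sequence axis, the same word rendered at different widths can still score a small distance.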

Corresponding author

Correspondence to A. Sakila.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Sakila, A., Vijayarani, S. (2020). Firefly Based Word Spotting Technique for Searching Keywords from Cursive Document Images. In: Hemanth, D., Shakya, S., Baig, Z. (eds) Intelligent Data Communication Technologies and Internet of Things. ICICI 2019. Lecture Notes on Data Engineering and Communications Technologies, vol 38. Springer, Cham. https://doi.org/10.1007/978-3-030-34080-3_21
