Abstract
With the rapid development of digital technologies, document images have become increasingly common in the information management systems of libraries, organizations and educational institutions. Searching for information in a document image is much harder than searching digital text. Optical Character Recognition (OCR) is typically employed to detect the characters and convert the images into text. However, OCR systems do not reliably convert documents with varied fonts, styles, sizes and symbols, dark backgrounds or poor image quality, so OCR is not an efficient solution in these cases. A search strategy is therefore needed to find user-specified keywords directly in document images. Word spotting is such an alternative, in which keywords are identified without changing the document images. The primary objective of this research work is to search for keywords in printed cursive English document images using word spotting techniques. A Firefly based word spotting technique is proposed to search for keywords based on a query given by the user. To estimate its efficiency, the Firefly technique is compared with the existing Enhanced Dynamic Time Warping (EDTW) technique. In the experimental analysis, the proposed Firefly based word spotting technique achieved higher accuracy and lower execution time than the existing EDTW technique.
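The abstract does not describe how either technique is implemented. As background on the family of methods the EDTW baseline belongs to, the following is a minimal sketch of classic Dynamic Time Warping applied to word images. It is not the enhanced variant or the Firefly technique evaluated in the paper. The choice of vertical projection profiles as features and the helper names `projection_profile`, `dtw_distance` and `spot` are assumptions made for illustration only.

```python
from math import inf


def projection_profile(image):
    """Count ink pixels (value 1) in each column of a binary word image.

    `image` is a list of equal-length rows of 0/1 values (an assumed format).
    """
    return [sum(column) for column in zip(*image)]


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences, using absolute difference."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def spot(query, candidates):
    """Rank candidate word images by DTW distance to the query (closest first)."""
    q = projection_profile(query)
    scored = [(dtw_distance(q, projection_profile(c)), i)
              for i, c in enumerate(candidates)]
    return [i for _, i in sorted(scored)]


# Toy data: `stretched` is the query with its first column repeated.
query = [[1, 0, 1],
         [1, 1, 1]]
stretched = [[1, 1, 0, 1],
             [1, 1, 1, 1]]
other = [[0, 1, 0],
         [0, 1, 0]]
ranking = spot(query, [other, stretched])
```

Because DTW allows one column to align with several, the horizontally stretched copy of the query ranks ahead of the unrelated word image, which is the property that makes this kind of matching tolerant of width variation between occurrences of the same word.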
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Sakila, A., Vijayarani, S. (2020). Firefly Based Word Spotting Technique for Searching Keywords from Cursive Document Images. In: Hemanth, D., Shakya, S., Baig, Z. (eds) Intelligent Data Communication Technologies and Internet of Things. ICICI 2019. Lecture Notes on Data Engineering and Communications Technologies, vol 38. Springer, Cham. https://doi.org/10.1007/978-3-030-34080-3_21
DOI: https://doi.org/10.1007/978-3-030-34080-3_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34079-7
Online ISBN: 978-3-030-34080-3
eBook Packages: Intelligent Technologies and Robotics (R0)