Abstract
The segmentation of individual words is a crucial step in several data mining methods for historical handwritten documents. Examples of applications include visual searching for query words (word spotting) and character-by-character text recognition. In this paper, we present a novel method for word segmentation that is adapted from recent advances in computer vision, deep learning and generic object detection. The method has found practical use in our current research project. It can easily be trained for different kinds of historical documents, uses full grayscale information, and requires neither binarization as pre-processing nor prior segmentation of individual text lines. We evaluate its performance using established error metrics, previously used in competitions for word segmentation, and demonstrate its usefulness for a 15th-century handwritten document.
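The abstract does not name the error metrics used. As an illustration only, the following minimal sketch shows the kind of measure commonly reported in word segmentation competitions such as the ICDAR 2013 handwriting segmentation contest: a detected word counts as a one-to-one match when its overlap with a ground-truth word exceeds a threshold, and detection rate, recognition accuracy and their harmonic mean are derived from the match count. Several things here are assumptions rather than details from the paper: words are represented as axis-aligned boxes (the contest measures overlap on foreground pixels), the threshold defaults to 0.9, matching is greedy, and the names `match_score` and `segmentation_scores` are hypothetical.

```python
from typing import List, Tuple

# A word region as (x0, y0, x1, y1) with x1 and y1 exclusive.
# Assumption: boxes stand in for the pixel sets used in contest protocols.
Box = Tuple[int, int, int, int]


def box_area(b: Box) -> int:
    """Area of a box; degenerate (empty) boxes have area 0."""
    return max(0, b[2] - b[0]) * max(0, b[3] - b[1])


def match_score(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in [0, 1]."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def segmentation_scores(truth: List[Box], detected: List[Box],
                        threshold: float = 0.9) -> Tuple[float, float, float]:
    """Return (detection rate, recognition accuracy, F-measure).

    A ground-truth word and a detected word form a one-to-one match
    when their match score is at least `threshold`. Matching is greedy:
    each detection is used at most once.
    """
    used = set()
    one_to_one = 0
    for t in truth:
        for j, d in enumerate(detected):
            if j not in used and match_score(t, d) >= threshold:
                used.add(j)
                one_to_one += 1
                break
    dr = one_to_one / len(truth) if truth else 0.0
    ra = one_to_one / len(detected) if detected else 0.0
    fm = 2 * dr * ra / (dr + ra) if dr + ra > 0 else 0.0
    return dr, ra, fm


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 0, 30, 10)]
    detected = [(0, 0, 10, 10), (20, 0, 30, 5), (40, 0, 50, 10)]
    print(segmentation_scores(truth, detected))
```

In this toy input, only the first detection overlaps a ground-truth word well enough to count; the second covers half of its word and the third is a false detection, so both rates fall below 1.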
Acknowledgment
This project is a part of q2b, From quill to bytes, a framework program sponsored by the Swedish Research Council (Dnr 2012-5743) and Uppsala University. The work was done in part in collaboration with the Swedish Museum of Natural History (Naturhistoriska riksmuseet).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wilkinson, T., Brun, A. (2015). A Novel Word Segmentation Method Based on Object Detection and Deep Learning. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2015. Lecture Notes in Computer Science, vol 9474. Springer, Cham. https://doi.org/10.1007/978-3-319-27857-5_21
DOI: https://doi.org/10.1007/978-3-319-27857-5_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27856-8
Online ISBN: 978-3-319-27857-5
eBook Packages: Computer Science, Computer Science (R0)