Recognition of Hand-Written Archive Text Documents
Processing the large volume of hand-written archive documents remains an unsolved problem. We propose a semi-automatic text recognition approach for documents with a limited vocabulary. Our approach is word-based and uses the Scale Invariant Feature Transform (SIFT) to find and describe salient points of hand-written words. For testing, we used a book from a Central European city census of 1771 containing mainly Christian names and family names. With a database of reasonable size, we achieved a recognition rate of about 80%.
Keywords: Optical character recognition, Hand-written text recognition, Feature extraction, SIFT, Archive document processing
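To make the word-based idea concrete, the following is a minimal sketch of how a query word could be matched against a small vocabulary database by comparing keypoint descriptors. It is not the authors' implementation. It assumes descriptors have already been extracted by some SIFT implementation, uses toy 3-dimensional vectors instead of the usual 128-dimensional ones, and applies a nearest-neighbour ratio test with an assumed threshold of 0.8. The function names `count_matches` and `recognize` and the sample names in the database are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def count_matches(query, candidate, ratio=0.8):
    """Count query descriptors whose nearest candidate descriptor is
    clearly closer than the second nearest (ratio test; the 0.8
    threshold is an assumption, not a value from the paper)."""
    if len(candidate) < 2:
        return 0
    matches = 0
    for q in query:
        dists = sorted(distance(q, c) for c in candidate)
        if dists[0] < ratio * dists[1]:
            matches += 1
    return matches


def recognize(query, database, ratio=0.8):
    """Return (label, score) for the database word with the most matches,
    or (None, 0) when nothing matches, so the word can be passed to a
    human operator (one possible reading of 'semi-automatic')."""
    best_label, best_score = None, 0
    for label, descriptors in database.items():
        score = count_matches(query, descriptors, ratio)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Toy vocabulary: each word maps to descriptors of its keypoints (made-up values).
database = {
    "Anna": [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "Josef": [[5.0, 5.0, 5.0], [6.0, 5.0, 5.0], [5.0, 7.0, 5.0]],
}

# Descriptors of an unknown word image, close to the "Anna" entry.
query = [[0.1, 0.0, 1.0], [1.0, 0.1, 0.0]]

label, score = recognize(query, database)
print(label, score)
```

Because the vocabulary is limited, an exhaustive comparison against every database word stays cheap in this sketch; a larger database would likely need an index structure for the nearest-neighbour search.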