Recognition of Malayalam Documents

Neeba, N.V.; Namboodiri, Anoop; Jawahar, C.V.; Narayanan, P.J.

doi:10.1007/978-1-84800-330-9_6

Part of the book series: Advances in Pattern Recognition (ACVPR)

Abstract

Malayalam is an Indian language with its own script, spoken by 40 million people. It has a rich literary tradition. A character recognition system for this language would be of immense help in a spectrum of applications ranging from data entry to reading aids. The Malayalam script has a large number of similar characters, which makes the recognition problem challenging. In this chapter, we present our approach to the recognition of Malayalam documents, both printed and handwritten. Classification results and ongoing activities are also presented.

Author information

Authors and Affiliations

Centre for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
N.V. Neeba, Anoop Namboodiri, C.V. Jawahar & P.J. Narayanan

Corresponding author

Correspondence to N.V. Neeba.

Editor information

Editors and Affiliations

Center of Excellence for Document Analysis & Recognition (CEDAR), Lee Entrance 520, Amherst, 14228, U.S.A.
Venu Govindaraju & Srirangaraj (Ranga) Setlur

About this chapter

Cite this chapter

Neeba, N., Namboodiri, A., Jawahar, C., Narayanan, P. (2009). Recognition of Malayalam Documents. In: Govindaraju, V., Setlur, S. (eds) Guide to OCR for Indic Scripts. Advances in Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-84800-330-9_6

DOI: https://doi.org/10.1007/978-1-84800-330-9_6
Published: 28 August 2009
Publisher Name: Springer, London
Print ISBN: 978-1-84800-329-3
Online ISBN: 978-1-84800-330-9
eBook Packages: Computer Science, Computer Science (R0)
