Abstract
In this paper, we explore the effectiveness of bagged classification trees in solving the writer identification problem for the Tamil language. Unlike in other languages, writer identification in Tamil remains largely unexplored. We propose novel feature extraction methods tailored to Tamil characters. The features used in this paper are chosen after analysing the statistical spread of each feature across different handwriting classes. We also analyse how increasing the number of bagged classification trees affects classification accuracy. The learning algorithm is trained with 144 samples and tested with 20 different samples per handwriting style; in total, it is trained on ten different handwriting styles. Using the proposed features and bagged classification trees, we achieve 76.4% accuracy. The practicality of the proposed method is also assessed using several time-consumption measures.
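As an illustration of the general technique named in the abstract, the following is a minimal sketch of bagged classification trees: each tree is grown on a bootstrap resample of the training set, and the ensemble predicts by majority vote. It is not the paper's implementation. The two-dimensional synthetic features, the three "writer" classes, the Gini split criterion, the depth limit, and all function names are assumptions made for this example only; the paper's Tamil-specific features and dataset are not reproduced here.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, depth=0, max_depth=5):
    """Grow a small CART-style tree; a leaf is a bare label, a node is a tuple."""
    if len(set(y)) == 1 or depth == max_depth:
        return Counter(y).most_common(1)[0][0]
    best = None
    for f in range(len(X[0])):
        # Skip the largest value so that both sides of the split are non-empty.
        for t in sorted(set(row[f] for row in X))[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # every feature is constant, so no split is possible
        return Counter(y).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict_tree(node, x):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

def fit_bagged_trees(X, y, n_trees, rng):
    """Train n_trees trees, each on a bootstrap sample drawn with replacement."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def predict_bagged(forest, x):
    """Majority vote over the predictions of the individual trees."""
    return Counter(predict_tree(tree, x) for tree in forest).most_common(1)[0][0]

def make_data(rng, per_class):
    """Synthetic stand-in data: three hypothetical writers as 2-D Gaussian blobs."""
    centers = [(0.0, 0.0), (6.0, 6.0), (12.0, 0.0)]
    X, y = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(per_class):
            X.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
            y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(rng, per_class=30)
X_test, y_test = make_data(rng, per_class=20)

# Measure test accuracy as the number of bagged trees grows.
accuracies = {}
for n_trees in (1, 5, 25):
    forest = fit_bagged_trees(X_train, y_train, n_trees, rng)
    hits = sum(predict_bagged(forest, x) == label for x, label in zip(X_test, y_test))
    accuracies[n_trees] = hits / len(y_test)
    print(f"{n_trees:2d} trees: accuracy = {accuracies[n_trees]:.3f}")
```

The loop at the end mirrors, on toy data, the kind of analysis the abstract describes: varying the ensemble size and observing the resulting accuracy. On data this well separated the effect of adding trees is small; with overlapping classes, bagging typically reduces the variance of individual trees.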
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Babu, S. (2015). Offline Writer Identification in Tamil Using Bagged Classification Trees. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2015. Lecture Notes in Computer Science, vol 9166. Springer, Cham. https://doi.org/10.1007/978-3-319-21024-7_23
DOI: https://doi.org/10.1007/978-3-319-21024-7_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21023-0
Online ISBN: 978-3-319-21024-7
eBook Packages: Computer Science