Noisy Character Recognition Using Deep Convolutional Neural Networks

  • Sirlene Peixoto
  • Gabriel Gonçalves
  • Andrea Bianchi
  • Alceu De S. Brito
  • William Robson Schwartz
  • David Menotti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10657)


Due to degradation and low quality in noisy images, such as natural scene images and text-based CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart), character recognition remains extremely challenging. In this work, we study two convolutional neural network approaches, filter learning and architecture optimization, to improve the feature representations of these images through deep learning. We perform experiments on the widely used Street View House Numbers (SVHN) dataset and on a new dataset of CAPTCHAs created by us. Learning filter weights through the back-propagation algorithm with data augmentation, combined with the strategy of adding a few locally-connected layers to the Convolutional Neural Network (CNN), obtained promising results on the CAPTCHA dataset (97.36% accuracy for characters and 85.4% for CAPTCHAs) and results very close to the state of the art on the SVHN dataset (97.45% accuracy for digits).
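The locally-connected layers mentioned in the abstract differ from standard convolutional layers only in that the filter weights are not shared across spatial positions: each output location has its own filter. A minimal NumPy sketch of the two operations (illustrative only, not the authors' implementation; the function names and the single-channel, single-filter setting are simplifying assumptions):

```python
import numpy as np

def conv2d(x, w):
    """Valid 2-D convolution (cross-correlation) with ONE shared k x k filter."""
    H, W = x.shape
    k = w.shape[0]
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def locally_connected2d(x, w):
    """Same sliding-window structure, but w[i, j] is a DIFFERENT k x k
    filter for each output position (no weight sharing)."""
    H, W = x.shape
    k = w.shape[2]
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w[i, j])
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6))                 # a small "image"
w_shared = np.ones((3, 3)) / 9.0            # one shared 3x3 filter (a mean filter)
w_local = np.tile(w_shared, (4, 4, 1, 1))   # same filter copied to every position

y_conv = conv2d(x, w_shared)                # 4x4 output, 9 parameters
y_local = locally_connected2d(x, w_local)   # 4x4 output, 4*4*9 = 144 parameters
```

When every local filter is a copy of the shared one, the two layers produce identical outputs; the locally-connected layer simply trades the convolution's parameter sharing for position-specific weights, which can help when character appearance depends on position (as in fixed-layout CAPTCHAs).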


Deep learning · Convolutional network · Noisy character recognition · CAPTCHAs · Street View House Numbers



The authors thank UFPR, PUCPR, UFOP, UFMG, FAPEMIG, CAPES and CNPq (Grants #428333/2016-8 and #307010/2014-7) for supporting this work.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Federal University of Ouro Preto, Ouro Preto, Brazil
  2. Federal University of Minas Gerais, Belo Horizonte, Brazil
  3. Pontifical Catholic University of Paraná, Curitiba, Brazil
  4. Federal University of Paraná, Curitiba, Brazil
