An Application of Data Compression Models to Handwritten Digit Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 11182))

Abstract

In this paper, we address handwritten digit classification as a special problem of data compression modeling. The creation of the models (usually known as training) is just a process of counting. Moreover, the model associated with each class can be trained independently of all the other class models. The models can also be updated later with new examples, even if the old ones are no longer available. Under this framework, we show that it is possible to attain a classification accuracy consistently above 99.3% on the MNIST dataset, using classifiers trained in less than one hour on a common laptop.
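The scheme the abstract describes (per-class models trained purely by counting, with classification by best compression) can be sketched with finite-context (Markov) models and Laplace-smoothed code lengths. This is a hypothetical illustration of the general idea, not the authors' exact method; the class names, the model order, and the choice of Laplace smoothing are assumptions.

```python
import math
from collections import defaultdict

class ClassModel:
    """Per-class finite-context (order-k Markov) model over binary symbol
    sequences. Training is just counting, so each class model is independent
    of the others and can be updated later with new examples."""

    def __init__(self, order=2):
        self.order = order
        # context (tuple of last k symbols) -> [count of 0, count of 1]
        self.counts = defaultdict(lambda: [0, 0])

    def train(self, sequence):
        # Count each symbol in the context of the preceding `order` symbols.
        ctx = (0,) * self.order
        for s in sequence:
            self.counts[ctx][s] += 1
            ctx = ctx[1:] + (s,)

    def code_length(self, sequence):
        # Bits needed to encode `sequence` under this model, using
        # Laplace-smoothed conditional probabilities.
        ctx = (0,) * self.order
        bits = 0.0
        for s in sequence:
            c0, c1 = self.counts[ctx]
            p = ((c1 if s else c0) + 1) / (c0 + c1 + 2)
            bits -= math.log2(p)
            ctx = ctx[1:] + (s,)
        return bits

def classify(sequence, models):
    # Assign the class whose model compresses (encodes) the sequence best.
    return min(models, key=lambda c: models[c].code_length(sequence))
```

Each digit image would first be serialized into a symbol sequence (e.g. binarized pixels in raster order); training one class's model never touches the counts of another, which is what makes independent and incremental training possible.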



Acknowledgments

This work was partially funded by National Funds through the FCT-Foundation for Science and Technology, in the context of the projects UID/CEC/00127/2013 and PTDC/EEI-SII/6608/2014, and also by the Integrated Programme of SR&TD “SOCA” (Ref. CENTRO-01-0145-FEDER-000010), co-funded by Centro 2020 program, Portugal 2020, European Union, through the European Regional Development Fund.

Author information

Correspondence to Armando J. Pinho.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Pinho, A.J., Pratas, D. (2018). An Application of Data Compression Models to Handwritten Digit Classification. In: Blanc-Talon, J., Helbert, D., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2018. Lecture Notes in Computer Science(), vol 11182. Springer, Cham. https://doi.org/10.1007/978-3-030-01449-0_41

  • DOI: https://doi.org/10.1007/978-3-030-01449-0_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01448-3

  • Online ISBN: 978-3-030-01449-0

  • eBook Packages: Computer Science, Computer Science (R0)
