Abstract
In this paper, we address handwritten digit classification as a special case of data compression modeling. Creating the models—usually known as training—is just a process of counting. Moreover, the model associated with each class can be trained independently of all the other class models, and the models can be updated later with new examples, even if the old ones are no longer available. Under this framework, we show that it is possible to attain a classification accuracy consistently above 99.3% on the MNIST dataset, using classifiers trained in less than one hour on a common laptop.
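The abstract's key ideas—per-class models whose training is pure counting, incremental updates, and classification by compression—can be illustrated with a minimal sketch. This is not the authors' exact method; it assumes a finite-context (Markov) model over binarized pixel sequences and a Krichevsky–Trofimov-style probability estimate (counts plus 1/2), with each sample assigned to the class whose model would compress it best.

```python
# Illustrative sketch only: per-class compression models whose "training"
# is counting. Assumes binarized images flattened to bit sequences; the
# order-k context model and KT estimator are stand-ins, not the paper's
# exact design.
import math
from collections import defaultdict

class CountingModel:
    """Order-k binary finite-context model; training is just counting."""

    def __init__(self, k=4):
        self.k = k
        # context (tuple of k bits) -> [count of 0s, count of 1s]
        self.counts = defaultdict(lambda: [0, 0])

    def train(self, bits):
        # Counting is incremental, so models can be updated later with
        # new examples even if the old ones are no longer available.
        ctx = (0,) * self.k
        for b in bits:
            self.counts[ctx][b] += 1
            ctx = ctx[1:] + (b,)

    def code_length(self, bits):
        # Estimated number of bits needed to encode `bits` with this model,
        # using a Krichevsky-Trofimov-style estimator (counts + 1/2).
        ctx = (0,) * self.k
        total = 0.0
        for b in bits:
            n0, n1 = self.counts[ctx]
            p = (self.counts[ctx][b] + 0.5) / (n0 + n1 + 1.0)
            total -= math.log2(p)
            ctx = ctx[1:] + (b,)
        return total

def classify(bits, models):
    # Assign the sample to the class whose model compresses it best,
    # i.e., yields the shortest estimated code length.
    return min(models, key=lambda c: models[c].code_length(bits))
```

Because each class model only ever sees its own class's examples, the ten digit models can be trained independently and in parallel, which is consistent with the short training times reported in the abstract.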
Acknowledgments
This work was partially funded by National Funds through the FCT-Foundation for Science and Technology, in the context of the projects UID/CEC/00127/2013 and PTDC/EEI-SII/6608/2014, and also by the Integrated Programme of SR&TD “SOCA” (Ref. CENTRO-01-0145-FEDER-000010), co-funded by Centro 2020 program, Portugal 2020, European Union, through the European Regional Development Fund.
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Pinho, A.J., Pratas, D. (2018). An Application of Data Compression Models to Handwritten Digit Classification. In: Blanc-Talon, J., Helbert, D., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2018. Lecture Notes in Computer Science(), vol 11182. Springer, Cham. https://doi.org/10.1007/978-3-030-01449-0_41
Print ISBN: 978-3-030-01448-3
Online ISBN: 978-3-030-01449-0