Abstract
Deep learning has had a profound impact on computer science in recent years, with applications to image recognition, language processing, bioinformatics, and more. Recently, Cohen et al. provided theoretical evidence for the superiority of deep learning over shallow learning. We formalized their mathematical proof using Isabelle/HOL. The Isabelle development simplifies and generalizes the original proof, while working around the limitations of the HOL type system. To support the formalization, we developed reusable libraries of formalized mathematics, including results about the matrix rank, the Borel measure, and multivariate polynomials as well as a library for tensor analysis.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Bader, B.W., Kolda, T.G.: Algorithm 862: MATLAB tensor classes for fast algorithm prototyping. ACM Trans. Math. Softw. 32(4), 635–653 (2006)
Bentkamp, A.: Expressiveness of deep learning. Archive of Formal Proofs (2016). http://isa-afp.org/entries/Deep_Learning.shtml. Formal proof development
Bentkamp, A.: An Isabelle formalization of the expressiveness of deep learning. M.Sc. thesis, Universität des Saarlandes (2016). http://matryoshka.gforge.inria.fr/pubs/bentkamp_msc_thesis.pdf
Bernard, S., Bertot, Y., Rideau, L., Strub, P.: Formal proofs of transcendence for \(e\) and \(\pi \) as an application of multivariate and symmetric polynomials. In: Avigad, J., Chlipala, A. (eds.) CPP 2016, pp. 76–87. ACM (2016)
Bertot, Y., Gonthier, G., Ould Biha, S., Pasca, I.: Canonical big operators. In: Mohamed, O.A., Muñoz, C., Tahar, S. (eds.) TPHOLs 2008. LNCS, vol. 5170, pp. 86–101. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-71067-7_11
Bhat, S.: Syntactic foundations for machine learning. Ph.D. thesis, Georgia Institute of Technology (2013). https://smartech.gatech.edu/bitstream/handle/1853/47700/bhat_sooraj_b_201305_phd.pdf
Blanchette, J.C., Böhme, S., Fleury, M., Smolka, S.J., Steckermeier, A.: Semi-intelligible Isar proofs from machine-generated proofs. J. Autom. Reason. 56(2), 155–200 (2016)
Blanchette, J.C., Greenaway, D., Kaliszyk, C., Kühlwein, D., Urban, J.: A learning-based fact selector for Isabelle/HOL. J. Autom. Reason. 57(3), 219–244 (2016)
Boender, J., Kammüller, F., Nagarajan, R.: Formalization of quantum protocols using Coq. In: Heunen, C., Selinger, P., Vicary, J. (eds.) QPL 2015. EPTCS, vol. 195, pp. 71–83 (2015)
Böhme, S., Weber, T.: Fast LCF-style proof reconstruction for Z3. In: Kaufmann, M., Paulson, L.C. (eds.) ITP 2010. LNCS, vol. 6172, pp. 179–194. Springer, Heidelberg (2010). 10.1007/978-3-642-14052-5_14
Bürgisser, P., Cucker, F., Lotz, M.: The probability that a slightly perturbed numerical analysis problem is difficult. Math. Comput. 77(263), 1559–1583 (2008)
Caron, R., Traynor, T.: The zero set of a polynomial. Technical report, University of Windsor (2005). http://www1.uwindsor.ca/math/sites/uwindsor.ca.math/files/05-03.pdf
Cohen, N., Sharir, O., Shashua, A.: Deep SimNets. In: CVPR 2016, pp. 4782–4791. IEEE Computer Society (2016)
Cohen, N., Sharir, O., Shashua, A.: On the expressive power of deep learning: a tensor analysis. In: Feldman, V., Rakhlin, A., Shamir, O. (eds.) COLT 2016. JMLR Workshop and Conference Proceedings, vol. 49, pp. 698–728. JMLR.org (2016)
Cohen, N., Shashua, A.: Convolutional rectifier networks as generalized tensor decompositions. In: Balcan, M., Weinberger, K.Q. (eds.) ICML 2016. JMLR Workshop and Conference Proceedings, vol. 48, pp. 955–963. JMLR.org (2016)
Cohen, N., Shashua, A.: Inductive bias of deep convolutional networks through pooling geometry. CoRR abs/1605.06743 (2016)
Haftmann, F., Lochbihler, A., Schreiner, W.: Towards abstract and executable multivariate polynomials in Isabelle. In: Nipkow, T., Paulson, L., Wenzel, M. (eds.) Isabelle Workshop 2014 (2014)
Harrison, J.: A HOL theory of Euclidean space. In: Hurd, J., Melham, T. (eds.) TPHOLs 2005. LNCS, vol. 3603, pp. 114–129. Springer, Heidelberg (2005). 10.1007/11541868_8
Hölzl, J., Heller, A.: Three chapters of measure theory in Isabelle/HOL. In: Eekelen, M., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) ITP 2011. LNCS, vol. 6898, pp. 135–151. Springer, Heidelberg (2011). 10.1007/978-3-642-22863-6_12
Immler, F., Maletzky, A.: Gröbner bases theory. Archive of Formal Proofs (2016). http://isa-afp.org/entries/Groebner_Bases.shtml. Formal proof development
Kam, R.: Case studies in proof checking. Master’s thesis, San Jose State University (2007). http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?context=etd_projects &article=1149
Kobayashi, H., Chen, L., Murao, H.: Groups, rings and modules. Archive of Formal Proofs (2004). http://isa-afp.org/entries/Group-Ring-Module.shtml. Formal proof development
Liu, L., Aravantinos, V., Hasan, O., Tahar, S.: On the formal analysis of HMM using theorem proving. In: Merz, S., Pang, J. (eds.) ICFEM 2014. LNCS, vol. 8829, pp. 316–331. Springer, Cham (2014). 10.1007/978-3-319-11737-9_21
Lotz, M.: On the volume of tubular neighborhoods of real algebraic varieties. Proc. Amer. Math. Soc. 143(5), 1875–1889 (2015)
Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). 10.1007/978-3-540-78800-3_24
Murphy, T., Gray, P., Stewart, G.: Certified convergent perceptron learning (unpublished draft). http://oucsace.cs.ohiou.edu/~gstewart/papers/coqperceptron.pdf
Nipkow, T., Wenzel, M., Paulson, L.C. (eds.): Isabelle/HOL. LNCS, vol. 2283. Springer, Heidelberg (2002)
Paulson, L.C., Blanchette, J.C.: Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers. In: Sutcliffe, G., Schulz, S., Ternovska, E. (eds.) IWIL-2010. EPiC, vol. 2, pp. 1–11. EasyChair (2012)
Paulson, L.C., Susanto, K.W.: Source-level proof reconstruction for interactive theorem proving. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 232–245. Springer, Heidelberg (2007). 10.1007/978-3-540-74591-4_18
Poon, H., Domingos, P.M.: Sum-product networks: a new deep architecture. In: Cozman, F.G., Pfeffer, A. (eds.) UAI 2011, pp. 337–346. AUAI Press (2011)
Prathamesh, T.V.H.: Tensor product of matrices. Archive of Formal Proofs (2016). http://isa-afp.org/entries/Matrix_Tensor.shtml. Formal proof development
Sternagel, C., Thiemann, R.: Executable multivariate polynomials. Archive of Formal Proofs (2010). http://isa-afp.org/entries/Polynomials.shtml. Formal proof development
Thiemann, R., Yamada, A.: Matrices, Jordan normal forms, and spectral radius theory. Archive of Formal Proofs (2015). http://isa-afp.org/entries/Jordan_Normal_Form.shtml. Formal proof development
Wenzel, M.: Isar—a generic interpretative approach to readable formal proof documents. In: Bertot, Y., Dowek, G., Théry, L., Hirschowitz, A., Paulin, C. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 167–183. Springer, Heidelberg (1999). 10.1007/3-540-48256-3_12
Acknowledgment
We thank Lukas Bentkamp, Robert Lewis, Anders Schlichtkrull, Mark Summerfield, and the anonymous reviewers for suggesting many textual improvements. The work has received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 713999, Matryoshka).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Bentkamp, A., Blanchette, J.C., Klakow, D. (2017). A Formal Proof of the Expressiveness of Deep Learning. In: Ayala-Rincón, M., Muñoz, C.A. (eds) Interactive Theorem Proving. ITP 2017. Lecture Notes in Computer Science(), vol 10499. Springer, Cham. https://doi.org/10.1007/978-3-319-66107-0_4
Download citation
DOI: https://doi.org/10.1007/978-3-319-66107-0_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66106-3
Online ISBN: 978-3-319-66107-0
eBook Packages: Computer ScienceComputer Science (R0)