Classification Scheme for Binary Data with Extensions

  • Denali Molitor
  • Deanna Needell
  • Aaron Nelson
  • Rayan Saab
  • Palina Salanevich
Part of the Applied and Numerical Harmonic Analysis book series (ANHA)


In this chapter, we present a simple classification scheme that utilizes only 1-bit measurements of the training and testing data. Our method is intended to be efficient in terms of computation and storage while also allowing for a rigorous mathematical analysis. After providing some motivation, we present our method and analyze its performance for a simple data model. We also discuss extensions of the method to the hierarchical data setting, and include some further implementation considerations. Experimental evidence provided in this chapter demonstrates that our methods yield accurate classification on a variety of synthetic and real data.
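To make the idea concrete, the following is an illustrative sketch (not the authors' exact scheme) of classification from 1-bit measurements: each data point is reduced to the sign pattern of random projections, and a test point is assigned to the class whose stored sign profile it best matches. All names and parameter choices here are assumptions for the example.

```python
# Illustrative sketch: classify points using only the signs of
# random projections (1-bit measurements). Hypothetical setup, not
# the exact method of this chapter.
import numpy as np

rng = np.random.default_rng(0)

def one_bit_embed(X, A):
    """Map each row x of X to its 1-bit measurements sign(Ax)."""
    return np.sign(X @ A.T)  # entries in {-1, 0, +1}; 0 is rare for random A

# Synthetic two-class data in R^d: Gaussian clusters around +mu and -mu.
d, m, n_per_class = 20, 100, 200
mu = np.ones(d)
X0 = rng.normal(size=(n_per_class, d)) + mu
X1 = rng.normal(size=(n_per_class, d)) - mu

A = rng.normal(size=(m, d))  # random measurement matrix
B0, B1 = one_bit_embed(X0, A), one_bit_embed(X1, A)

# Store only one m-dimensional "sign profile" per class:
# the average sign pattern of its training points.
p0, p1 = B0.mean(axis=0), B1.mean(axis=0)

def classify(x):
    """Assign x to the class whose sign profile best matches sign(Ax)."""
    b = np.sign(A @ x)
    return 0 if b @ p0 >= b @ p1 else 1

# Evaluate on fresh test points drawn from the same model.
test0 = rng.normal(size=(50, d)) + mu
test1 = rng.normal(size=(50, d)) - mu
correct = sum(classify(x) == 0 for x in test0) \
        + sum(classify(x) == 1 for x in test1)
print(f"accuracy: {correct / 100:.2f}")
```

Note the storage savings this illustrates: after training, only one length-`m` profile per class is kept, rather than the full training set.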



Molitor and Needell were partially supported by NSF CAREER grant #1348721 and NSF BIGDATA grant #1740325. Saab was partially supported by the NSF under DMS-1517204.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Denali Molitor (1)
  • Deanna Needell (1)
  • Aaron Nelson (2)
  • Rayan Saab (2)
  • Palina Salanevich (1)
  1. University of California, Los Angeles, USA
  2. University of California, San Diego, USA
