An Overview of Numerical Acceleration Techniques for Nonlinear Dimension Reduction

Part of the book series: Applied and Numerical Harmonic Analysis (ANHA)

Abstract

We live in an increasingly data-dependent world: making sense of large, high-dimensional data sets is an important task for researchers in academia, industry, and government. Techniques from machine learning, in particular nonlinear dimension reduction, seek to organize this wealth of data by extracting descriptive features. These techniques, though powerful in their ability to find compact representations, are hampered by high computational costs; implemented naively, they cannot process large modern data collections in reasonable time or with modest computational means. In this summary article we discuss some of the important numerical techniques that drastically increase the computational efficiency of these methods while preserving much of their representational power. Specifically, we address random projections, approximate k-nearest neighbor searches, approximate kernel methods, and approximate matrix decompositions.
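
To give a flavor of two of these techniques, here is a minimal sketch in plain NumPy (not code from the chapter itself): a Gaussian random projection in the spirit of the Johnson-Lindenstrauss lemma, which approximately preserves pairwise distances, and a Nyström approximation that rebuilds a full kernel matrix from a small set of sampled landmark columns. The helper rbf, the landmark count m, and all other names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Random projection: embed n points from R^d into R^k, with k << d. ---
n, d, k = 500, 10_000, 200
X = rng.standard_normal((n, d))

# Entries drawn i.i.d. from N(0, 1/k), so squared norms are preserved in expectation.
R = rng.standard_normal((d, k)) / np.sqrt(k)
Y = X @ R  # projected data, shape (n, k)

# Pairwise distances are approximately preserved with high probability.
i, j = 0, 1
orig = np.linalg.norm(X[i] - X[j])
proj = np.linalg.norm(Y[i] - Y[j])
print(f"distance ratio after projection: {proj / orig:.3f}")  # close to 1

# --- Nystroem: approximate the n x n kernel matrix from m landmark columns. ---
def rbf(A, B, gamma=1e-4):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.exp(-gamma * sq)

m = 50                                  # number of landmarks, m << n
idx = rng.choice(n, size=m, replace=False)
C = rbf(X, X[idx])                      # n x m block of sampled kernel columns
W = C[idx]                              # m x m kernel among the landmarks
K_approx = C @ np.linalg.pinv(W) @ C.T  # rank-m Nystroem approximation of K

K = rbf(X, X)                           # full kernel, computed here only for comparison
print("relative Frobenius error:", np.linalg.norm(K - K_approx) / np.linalg.norm(K))
```

The projection reduces the working dimension from d to k at a one-time cost of O(ndk), and the Nyström step replaces an n x n eigenproblem with one of size m x m; savings of this kind are what the techniques surveyed in the chapter provide.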

Acknowledgements

This work was supported in part by Defense Threat Reduction Agency grant HDTRA1-13-1-0015 and by Army Research Office grant W911NF1610008.

Corresponding author

Correspondence to Wojciech Czaja.

Copyright information

© 2017 Springer International Publishing AG

Cite this chapter

Czaja, W., Doster, T., Halevy, A. (2017). An Overview of Numerical Acceleration Techniques for Nonlinear Dimension Reduction. In: Pesenson, I., Le Gia, Q., Mayeli, A., Mhaskar, H., Zhou, D.X. (eds) Recent Applications of Harmonic Analysis to Function Spaces, Differential Equations, and Data Science. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-55556-0_12
