Abstract

In applications, high-dimensional data are given as a discrete set of points in a Euclidean space. If the data points are well sampled from a manifold, then the data inherit their geometry from that manifold. Because the underlying manifold is hidden, its geometry is hard to recover with classical manifold calculus. The data graph is a useful tool for revealing the data geometry. To construct a data graph, we first find a neighborhood system on the data, which is determined by the similarity (or dissimilarity) among the data points. The similarity information is usually driven by the application in which the data are used. In this chapter, we introduce methods for defining data similarity (or dissimilarity), together with preliminary spectral graph theory for analyzing the data geometry. Section 1 discusses the construction of a neighborhood system on data. A neighborhood system on a data set defines a data graph, which can be considered a discrete form of a manifold. Section 2 introduces the basic concepts of graphs. Section 3 introduces spectral graph analysis as a tool for analyzing the data geometry; in particular, the Laplacian on a graph is briefly discussed there. Most of the material in Sections 2 and 3 can be found in [1–3].
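
As a rough illustration of the pipeline the abstract describes (neighborhood system, then data graph, then graph Laplacian), the following minimal sketch builds a k-nearest-neighbor graph and its unnormalized Laplacian L = D - W. The choices here are assumptions made for illustration and are not taken from the chapter: Euclidean distance as the dissimilarity, 0/1 edge weights, symmetrization by keeping an edge when either endpoint selects it, and the hypothetical helper names `knn_graph` and `laplacian`.

```python
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor adjacency matrix with 0/1 weights."""
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other points by Euclidean distance to point i
        # (assumed dissimilarity; an application may prescribe another).
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in order[:k]:
            adj[i][j] = 1
            adj[j][i] = 1  # keep the edge if either endpoint selects it
    return adj


def laplacian(adj):
    """Unnormalized graph Laplacian L = D - W, with D the degree matrix."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]


# Four points on a line; the last one is far from the others.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
W = knn_graph(points, k=1)
L = laplacian(W)
```

Each row of L sums to zero, which is the discrete counterpart of the Laplacian annihilating constant functions. A different similarity measure or a weighted graph would change W but not the construction of L.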

References

  1. Bondy, J., Murty, U.: Graph Theory. Springer (2008).

  2. Chartrand, G.: Introductory Graph Theory. Dover (1985).

  3. Chung, F.R.: Spectral Graph Theory. CBMS Regional Conference Series in Mathematics, No. 92. AMS (1996).

  4. Shakhnarovich, G., Darrell, T., Indyk, P. (eds.): Nearest-Neighbor Methods in Learning and Vision, Theory and Practice. MIT (2006).

  5. Bachmann, C.M., Ainsworth, T.L., Fusina, R.A.: Improved manifold coordinate representations of large-scale hyperspectral scenes. IEEE Trans. Geo. Remote Sensing 44, 2786–2803 (2006).

  6. Bozkaya, T., Ozsoyoglu, M.: Distance-based indexing for high-dimensional metric spaces. In: Proc. ACM SIGMOD, pp. 357–368 (1997).

  7. Friedman, J.H., Bentley, J.L., Finkel, R.A.: An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Softw. 3(3), 209–226 (1977).

  8. Katayama, N., Satoh, S.: The SR-tree: An index structure for high-dimensional nearest neighbor queries. In: Proc. ACM SIGMOD, pp. 369–380 (1997).

  9. Kim, B.S., Park, S.B.: Fast nearest neighbor finding algorithm based on ordered partition. IEEE Trans. Pattern Anal. Mach. Intell. 8(6), 761–766 (1986).

  10. Lubiarz, S., Lockwood, P.: Evaluation of fast algorithms for finding the nearest neighbor. Proc. IEEE Int. Conf. Acoust., Speech and Signal Process. 2, 1491–1494 (1997).

  11. McNames, J.: A fast nearest-neighbor algorithm based on a principal axis search tree. IEEE Trans. Pattern Anal. Mach. Intell. 23(9), 964–976 (2001).

  12. Yianilos, P.N.: Data structure and algorithms for nearest neighbor search in general metric spaces. In: Proc. ACM-SIAM Symp. Discr. Algorithms, pp. 311–321 (1993).

  13. Chui, C.K.: An Introduction to Wavelets, Wavelet Analysis and its Applications, vol. 1. Academic Press, Inc. (1992).

  14. Chui, C.K., Wang, J.Z.: A cardinal spline approach to wavelets. Proc. Amer. Math. Soc. 113, 785–793 (1991).

  15. Chui, C.K., Wang, J.Z.: On compactly supported spline wavelets and a duality principle. Trans. Amer. Math. Soc. 330, 903–915 (1992).

  16. Chui, C.K.: Wavelets: A Mathematical Tool for Signal Analysis. SIAM Monographs on Mathematical Modeling and Computation. Society for Industrial and Applied Mathematics, Philadelphia (1997).

  17. Chui, C.K., Wang, J.Z.: A general framework of compactly supported splines and wavelets. J. Approx. Theory 71(3), 263–304 (1992).

  18. Laub, J., Müller, K.R.: Feature discovery in non-metric pairwise data. Journal of Machine Learning Research 5, 801–818 (2004).

  19. Mahalanobis, P.C.: On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India 2(1), 49–55 (1936).

  20. De Maesschalck, R., Jouan-Rimbaud, D., Massart, D.: The Mahalanobis distance. Chemometrics and Intelligent Laboratory Systems 50, 1–8 (2000).

Copyright information

© 2012 Higher Education Press, Beijing and Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Wang, J. (2012). Geometric Structure of High-Dimensional Data. In: Geometric Structure of High-Dimensional Data and Dimensionality Reduction. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27497-8_3

  • DOI: https://doi.org/10.1007/978-3-642-27497-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27496-1

  • Online ISBN: 978-3-642-27497-8

  • eBook Packages: Computer Science, Computer Science (R0)
