Abstract
Spectacular advances in sensor technology, data storage devices, and large-scale computing are producing huge data sets. These large and high-dimensional sets arise naturally in a variety of contexts such as the dynamics of the Internet, imaging for surveillance and diagnostics, and gene sequencing. The significant change in the scale and complexity embodied in these types of data, as well as the intricacies of the underlying phenomena being studied, present some new conceptual challenges. There has been considerable research activity dealing with the organization and analysis of such large data sets. But, by and large, these approaches have had only limited success towards the goal of understanding fully the inherent structures of these large data sets. There is a need, therefore, for new fundamental thinking about these problems and new mathematical approaches. In this paper we review a few such promising directions that draw extensively from fertile areas of harmonic analysis, discrete mathematics, stochastic analysis, and statistical methods.
Copyright information
© 2001 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Chandra, J. (2001). Understanding High Dimensional and Large Data Sets: Some Mathematical Challenges and Opportunities. In: Grossman, R.L., Kamath, C., Kegelmeyer, P., Kumar, V., Namburu, R.R. (eds) Data Mining for Scientific and Engineering Applications. Massive Computing, vol 2. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1733-7_2
DOI: https://doi.org/10.1007/978-1-4615-1733-7_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4020-0114-7
Online ISBN: 978-1-4615-1733-7
eBook Packages: Springer Book Archive