Part of the book series: Springer Series in Statistics (SSS)

A typical data set can be represented as a collection of $n$ vectors $x = (x_1, \ldots, x_p)$, each of length $p$. They are usually modeled as IID outcomes of a single random variable $X = (X_1, \ldots, X_p)$. Classical data sets had small values of $p$ and small to medium values of $n$, with $p < n$. Currently emerging data sets are much more complicated and diverse: The sample size may be so large that a mean cannot be calculated in real time. The dimension $p$ may be so large that no realistic sample size will ever be obtained. The $X$ may summarize a waveform, a graph with many edges and vertices, an image, or a document. Often data sets are multitype, meaning they combine qualitatively different classes of data. In all these cases, and many others, the complexity of the data – to say nothing of the model – is so great that inference becomes effectively impossible.
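As a small illustration of the representation described above, the following sketch treats a data set as a stream of n vectors of length p and computes their mean in a single pass, keeping only O(p) state. It is a minimal sketch under stated assumptions, not a method taken from the chapter: the function name `running_mean` and the toy data are hypothetical, and the incremental update is the standard one-pass mean formula.

```python
from typing import Iterable, List, Sequence


def running_mean(rows: Iterable[Sequence[float]]) -> List[float]:
    """One-pass mean of length-p vectors, storing only the current mean.

    Hypothetical helper for illustration; not from the chapter.
    """
    mean: List[float] = []
    count = 0
    for row in rows:
        if count == 0:
            # The first vector fixes the dimension p.
            mean = [0.0] * len(row)
        elif len(row) != len(mean):
            raise ValueError("all vectors must have the same length p")
        count += 1
        # Incremental update: mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k
        for j, value in enumerate(row):
            mean[j] += (value - mean[j]) / count
    if count == 0:
        raise ValueError("no data")
    return mean


# Made-up toy data: n = 3 vectors, each of length p = 2.
sample = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
print(running_mean(iter(sample)))
```

Because the function consumes an iterator, the full data set never needs to be held in memory. This only addresses storage; it does nothing about the harder problems the abstract raises, such as p being large relative to n.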

Author information

Corresponding author

Correspondence to Bertrand Clarke.

Copyright information

© 2009 Springer-Verlag New York

About this chapter

Cite this chapter

Clarke, B., Fokoué, E., Zhang, H.H. (2009). Learning in High Dimensions. In: Principles and Theory for Data Mining and Machine Learning. Springer Series in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-0-387-98135-2_9
