ICANN 98, pp. 541–546

Sparse Regression: Utilizing the Higher-order Structure of Data for Prediction

  • Aapo Hyvärinen
Part of the Perspectives in Neural Computing book series (PERSPECT.NEURAL)


Independent component analysis and the closely related method of sparse coding model multidimensional data as linear combinations of independent components that have nongaussian, usually sparse, distributions. Such a modelling approach is especially suitable in high dimensions, as it avoids the curse of dimensionality; it also seems to capture important properties of sensory data. In this paper we show how to use these models for regression. If the joint density of two random vectors is modelled by independent component analysis, simple algorithms can be obtained for computing the maximum likelihood predictor of one of the vectors when the other is observed. The resulting predictors are nonlinear, but in contrast to nonparametric methods such as the MLP, the nonlinearities are not chosen ad hoc: they are determined directly by the density approximation.
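The prediction scheme described above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: it assumes the ICA mixing matrix is already known (in practice it would be estimated from data), and it assumes a Laplacian density for each component, so maximizing the likelihood of the full vector over the unobserved part amounts to minimizing the L1 norm of the recovered components. All variable names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy joint ICA model: a 4-d vector z = A s, with s sparse (Laplacian).
# We split z into an observed part x (first 2 dims) and a target part y.
n = 4
A = rng.normal(size=(n, n))   # mixing matrix (assumed known, e.g. pre-estimated by ICA)
W = np.linalg.inv(A)          # unmixing matrix

def neg_log_lik(y, x):
    """Negative log-likelihood of [x; y] under the ICA model with a
    Laplacian prior on each component: -log p(s) = sum |s_i| + const."""
    s = W @ np.concatenate([x, y])
    return np.sum(np.abs(s))

def predict(x, y_dim):
    """Maximum likelihood prediction of y given the observed x:
    minimize the negative log-likelihood over the unobserved part."""
    res = minimize(neg_log_lik, np.zeros(y_dim), args=(x,), method="Nelder-Mead")
    return res.x

# Generate one sample from the model and predict its unobserved half.
s = rng.laplace(size=n)
z = A @ s
x, y_true = z[:2], z[2:]
y_hat = predict(x, 2)
```

Note that the nonlinearity of the predictor is not designed by hand: it emerges from the nongaussian (here Laplacian) density assumed for the components, as the abstract describes.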







Copyright information

© Springer-Verlag London 1998

Authors and Affiliations

  • Aapo Hyvärinen
    1. Laboratory of Computer and Information Science, Helsinki University of Technology, Finland
