Abstract
In many vision problems, the observed data lies on a nonlinear manifold in a high-dimensional space. This paper presents a generic modelling scheme to characterize the nonlinear structure of the manifold and to learn its multimodal distribution. Our approach represents the data as a linear combination of parameterized local components, where the statistics of the component parameterization describe the nonlinear structure of the manifold. The components are adaptively selected from the training data through a progressive density approximation procedure, which leads to the maximum likelihood estimate of the underlying density. We show results on both synthetic and real training sets, and demonstrate that the proposed scheme can reveal important structures of the data.
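The idea of selecting components from the training data one at a time, each chosen to raise the likelihood, can be illustrated with a toy sketch. The code below is not the procedure from the paper: it is a simplified greedy mixture fit in one dimension, and it omits the parameterized local components entirely. The function name `progressive_density_fit`, the fixed kernel width `std`, and the candidate mixing weights in `weights_grid` are all assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def log_likelihood(density_values):
    """Sum of log densities; values are floored to avoid log(0)."""
    return sum(math.log(max(v, 1e-300)) for v in density_values)


def progressive_density_fit(data, n_components=3, std=0.5,
                            weights_grid=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Greedily build a Gaussian mixture whose centers are training points.

    Returns (components, history). components is a list of
    (weight, center) pairs; history holds the training log-likelihood
    after each accepted component.
    """
    # kernel[c][i]: density of a component centered at data[c], at data[i].
    kernel = [[gaussian_pdf(x, c, std) for x in data] for c in data]

    # First component: the single center with the highest likelihood.
    first = max(range(len(data)), key=lambda c: log_likelihood(kernel[c]))
    components = [(1.0, data[first])]
    current = list(kernel[first])
    history = [log_likelihood(current)]

    for _ in range(n_components - 1):
        best = None
        # Try every training point as a new center and every candidate
        # mixing weight a, using f_new = (1 - a) * f_old + a * g.
        for c in range(len(data)):
            for a in weights_grid:
                mixed = [(1 - a) * f + a * g
                         for f, g in zip(current, kernel[c])]
                ll = log_likelihood(mixed)
                if best is None or ll > best[0]:
                    best = (ll, c, a, mixed)
        ll, c, a, mixed = best
        if ll <= history[-1]:
            break  # no candidate improves the likelihood; stop early
        # Rescale old weights so the mixture still sums to one.
        components = [(w * (1 - a), m) for w, m in components]
        components.append((a, data[c]))
        current = mixed
        history.append(ll)
    return components, history


if __name__ == "__main__":
    random.seed(0)
    sample = ([random.gauss(-3.0, 0.5) for _ in range(40)]
              + [random.gauss(3.0, 0.5) for _ in range(40)])
    fitted, trace = progressive_density_fit(sample, n_components=3)
    for weight, center in fitted:
        print(f"weight={weight:.2f} center={center:.2f}")
    print("log-likelihood per step:", [round(v, 1) for v in trace])
```

On bimodal data such as the two-cluster sample in the demo, the second accepted component typically lands in the mode that the first one missed, which is the sense in which a progressive fit can capture a multimodal distribution. The greedy search only approximates a maximum likelihood fit; it makes no claim to the convergence property stated in the abstract.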
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, Y., Comaniciu, D., Schwartz, S., Ramesh, V. (2002). Multimodal Data Representations with Parameterized Local Structures. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds) Computer Vision — ECCV 2002. ECCV 2002. Lecture Notes in Computer Science, vol 2350. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47969-4_12
DOI: https://doi.org/10.1007/3-540-47969-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43745-1
Online ISBN: 978-3-540-47969-7
eBook Packages: Springer Book Archive