Multimodal Data Representations with Parameterized Local Structures

Zhu, Ying; Comaniciu, Dorin; Schwartz, Stuart; Ramesh, Visvanathan

doi:10.1007/3-540-47969-4_12

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2350)

Abstract

In many vision problems, the observed data lies in a nonlinear manifold in a high-dimensional space. This paper presents a generic modelling scheme to characterize the nonlinear structure of the manifold and to learn its multimodal distribution. Our approach represents the data as a linear combination of parameterized local components, where the statistics of the component parameterization describe the nonlinear structure of the manifold. The components are adaptively selected from the training data through a progressive density approximation procedure, which leads to the maximum likelihood estimate of the underlying density. We show results on both synthetic and real training sets, and demonstrate that the proposed scheme has the ability to reveal important structures of the data.
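
As a rough illustration of the representation described in the abstract, the sketch below builds a 1-D signal as a linear combination of parameterized local components. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian bump shape, the `center` and `width` parameters, and the function names `local_component` and `synthesize`.

```python
import math


def local_component(t, center, width):
    """Hypothetical local component: a Gaussian bump at `center` with scale `width`."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)


def synthesize(grid, weights, centers, widths):
    """Evaluate x(t) = sum_i weights[i] * b(t; centers[i], widths[i]) on `grid`."""
    return [
        sum(w * local_component(t, c, s)
            for w, c, s in zip(weights, centers, widths))
        for t in grid
    ]


# Sampling grid on [0, 10] with step 0.1 (assumed for this toy example).
grid = [i / 10.0 for i in range(101)]

# Two samples from the same assumed family: identical weights, but the
# component parameters (position and scale) differ between samples.
sample_a = synthesize(grid, [1.0, 0.5], [3.0, 7.0], [0.5, 1.0])
sample_b = synthesize(grid, [1.0, 0.5], [3.4, 6.5], [0.6, 0.9])

# Location of the dominant peak in each sample.
peak_a = grid[sample_a.index(max(sample_a))]
peak_b = grid[sample_b.index(max(sample_b))]
```

In this toy setting the two samples share the same weights, yet the dominant peak sits at a different position in each because the component parameters changed. A fixed linear basis cannot absorb that kind of shift, which is one way to read the abstract's claim that the statistics of the component parameterization describe the nonlinear structure of the manifold.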

Author information

Authors and Affiliations

Department of Electrical Engineering, Princeton University, Princeton, NJ, 08544, USA
Ying Zhu & Stuart Schwartz
Imaging & Visualization Department, Siemens Corporate Research, 755 College Road East, Princeton, NJ, 08540, USA
Dorin Comaniciu & Visvanathan Ramesh

Editor information

Editors and Affiliations

Centre for Mathematical Sciences, Lund University, Box 118, 22100, Lund, Sweden
Anders Heyden & Gunnar Sparr
The IT University of Copenhagen, Glentevej 67-69, 2400, Copenhagen, NW, Denmark
Mads Nielsen
University of Copenhagen, Universitetsparken 1, 2100, Copenhagen, Denmark
Peter Johansen

About this paper

Cite this paper

Zhu, Y., Comaniciu, D., Schwartz, S., Ramesh, V. (2002). Multimodal Data Representations with Parameterized Local Structures. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds) Computer Vision — ECCV 2002. ECCV 2002. Lecture Notes in Computer Science, vol 2350. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47969-4_12

DOI: https://doi.org/10.1007/3-540-47969-4_12
Published: 29 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43745-1
Online ISBN: 978-3-540-47969-7
eBook Packages: Springer Book Archive
