Abstract
We present a novel approach for automatically learning models of temporal trajectories extracted from video data. Instead of representing trajectories as linearly time-normalised, fixed-length vectors, our approach uses the Dynamic Time Warp distance as a similarity measure, capturing the underlying ordered structure of variable-length temporal data while removing the non-linear warping of the time scale. We reformulate the structure learning problem as an optimal graph partitioning of the dataset that relies solely on Dynamic Time Warp similarity weights, without the need for intermediate cluster centroid representations. We extend the graph partitioning method, in particular the Normalised Cut model originally introduced for static image segmentation, to unsupervised clustering of temporal trajectories with fully automated model order selection. By computing a hierarchical average Dynamic Time Warp for each cluster, we learn warp-free trajectory models and recover the time warp profiles and structural variance in the data. We demonstrate the approach by modelling trajectories of continuous hand gestures and of moving objects in an indoor environment.
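The two ingredients named in the abstract, a Dynamic Time Warp distance between variable-length trajectories and a Normalised Cut partition of the resulting similarity graph, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the Gaussian kernel that turns distances into similarity weights, the `sigma` parameter, and the zero threshold on the second eigenvector are all assumptions made for illustration, and the sketch performs a single two-way split rather than the automated model order selection or the hierarchical average Dynamic Time Warp described above.

```python
import numpy as np


def _as_trajectory(x):
    """Return x as a float array of shape (length, dims)."""
    x = np.asarray(x, dtype=float)
    return x[:, None] if x.ndim == 1 else x


def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two trajectories."""
    a, b = _as_trajectory(a), _as_trajectory(b)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def affinity_matrix(trajectories, sigma=1.0):
    """Pairwise similarity weights; the Gaussian kernel is an assumption."""
    k = len(trajectories)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(trajectories[i], trajectories[j])
    return np.exp(-((dist / sigma) ** 2))


def ncut_bipartition(w):
    """Two-way Normalised Cut via the second-smallest eigenvector of the
    symmetric normalised Laplacian, thresholded at zero (a simplification)."""
    deg = w.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    indicator = d_inv_sqrt * vecs[:, 1]  # map back to the generalised eigenvector
    return indicator > 0


# Toy data (hypothetical): rising and falling ramps of different lengths.
rising = [np.linspace(0.0, 1.0, n) for n in (8, 12, 15)]
falling = [np.linspace(1.0, 0.0, n) for n in (9, 11, 14)]
labels = ncut_bipartition(affinity_matrix(rising + falling))
print(labels)
```

Because the partition works directly on the pairwise weights, no cluster centroid is ever computed, which is the property the abstract emphasises. The sign of an eigenvector is arbitrary, so which group is labelled `True` may vary between runs or platforms.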
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ng, J., Gong, S. (2002). Learning Intrinsic Video Content Using Levenshtein Distance in Graph Partitioning. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds) Computer Vision — ECCV 2002. ECCV 2002. Lecture Notes in Computer Science, vol 2353. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47979-1_45
DOI: https://doi.org/10.1007/3-540-47979-1_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43748-2
Online ISBN: 978-3-540-47979-6
eBook Packages: Springer Book Archive