
Motion Segmentation

Generalized Principal Component Analysis

Part of the book series: Interdisciplinary Applied Mathematics (IAM, volume 40)


Abstract

The previous two chapters have shown how to use a mixture of subspaces to represent and segment static images. In those cases, different subspaces were used to account for multiple characteristics of natural images, e.g., different textures. In this chapter, we will show how to use a mixture of subspaces to represent and segment time series, e.g., video and motion capture data. In particular, we will use different subspaces to account for multiple characteristics of the dynamics of a time series, such as multiple moving objects or multiple temporal events.

I can calculate the motion of heavenly bodies, but not the madness of people.

—Isaac Newton


Notes

  1. The special Euclidean group is defined as \(SE(3) =\{ (R,\boldsymbol{T}): R \in SO(3),\boldsymbol{T} \in \mathbb{R}^{3}\}\), where \(SO(3) =\{ R \in \mathbb{R}^{3\times 3}: R^{\top }R = I\ \text{and}\ \det (R) = 1\}\) is the special orthogonal group.

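  2. The inverse of \(g = (R,\boldsymbol{T}) \in SE(3)\) is \(g^{-1} = (R^{\top },-R^{\top }\boldsymbol{T}) \in SE(3)\), and the product of two transformations \(g_{1} = (R_{1},\boldsymbol{T}_{1})\) and \(g_{2} = (R_{2},\boldsymbol{T}_{2})\) is defined as \(g_{1}g_{2} = (R_{1}R_{2},R_{1}\boldsymbol{T}_{2} +\boldsymbol{T}_{1})\). This product is consistent with the inverse, since \(gg^{-1} = (RR^{\top },-RR^{\top }\boldsymbol{T} +\boldsymbol{T}) = (I,\boldsymbol{0})\).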
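The group operations in these notes can be checked numerically. The following is a minimal sketch, assuming that a transformation \(g = (R,\boldsymbol{T})\) acts on a point \(\boldsymbol{x} \in \mathbb{R}^{3}\) as \(\boldsymbol{x}\mapsto R\boldsymbol{x} +\boldsymbol{T}\) and that the product \(g_{1}g_{2}\) applies \(g_{2}\) first and then \(g_{1}\). The helper names (`is_rotation`, `act`, `compose`, `inverse`, `rot_z`, `rot_x`) and the numeric values are illustrative choices, not taken from the chapter.

```python
import numpy as np

def is_rotation(R, tol=1e-9):
    """Membership test for SO(3): R^T R = I and det(R) = 1 (up to tol)."""
    R = np.asarray(R, dtype=float)
    return bool(
        R.shape == (3, 3)
        and np.allclose(R.T @ R, np.eye(3), atol=tol)
        and np.isclose(np.linalg.det(R), 1.0, atol=tol)
    )

def act(g, x):
    """Apply g = (R, T) to a point x, assuming the action x -> R x + T."""
    R, T = g
    return R @ x + T

def compose(g1, g2):
    """Product g1 g2 (g2 applied first, then g1) under the assumed action."""
    R1, T1 = g1
    R2, T2 = g2
    return R1 @ R2, R1 @ T2 + T1

def inverse(g):
    """Inverse of g = (R, T), namely (R^T, -R^T T)."""
    R, T = g
    return R.T, -R.T @ T

def rot_z(theta):
    """Rotation by theta radians about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(theta):
    """Rotation by theta radians about the x-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Arbitrary example transformations and test point (illustrative values).
g1 = (rot_z(0.3), np.array([1.0, -2.0, 0.5]))
g2 = (rot_x(-1.1), np.array([0.0, 4.0, 3.0]))
x = np.array([0.7, 0.2, -1.5])

g12 = compose(g1, g2)               # product g1 g2
identity = compose(g1, inverse(g1)) # should be (I, 0)

# The product acts like "g2 first, then g1", and g g^{-1} is the identity.
print(np.allclose(act(g12, x), act(g1, act(g2, x))))
print(np.allclose(identity[0], np.eye(3)), np.allclose(identity[1], 0.0))
```

Representing each transformation as a plain `(R, T)` tuple keeps the sketch close to the notation of the notes; a 4×4 homogeneous matrix would be an equally valid representation, in which the product reduces to ordinary matrix multiplication.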



Copyright information

© 2016 Springer-Verlag New York

About this chapter

Cite this chapter

Vidal, R., Ma, Y., Sastry, S.S. (2016). Motion Segmentation. In: Generalized Principal Component Analysis. Interdisciplinary Applied Mathematics, vol 40. Springer, New York, NY. https://doi.org/10.1007/978-0-387-87811-9_11
