Distances on Spaces of High-Dimensional Linear Stochastic Processes: A Survey

Geometric Theory of Information

Part of the book series: Signals and Communication Technology (SCT)

Abstract

In this paper we study the geometrization of certain spaces of stochastic processes. Our main motivation comes from the problem of pattern recognition in high-dimensional time-series data (e.g., video sequence classification and clustering). In the first part of the paper, we provide a rather extensive review of some existing approaches to defining distances on spaces of stochastic processes. The majority of these distances are, in one way or another, based on comparing power spectral densities of the processes. In the second part, we focus on the space of processes generated by (stochastic) linear dynamical systems (LDSs) of fixed size and order, for which we recently introduced a class of group action induced distances called the alignment distances. This space is a natural choice in some pattern recognition applications and is also of great interest in control theory, where it is often convenient to represent LDSs in state-space form. In this case the space (more precisely manifold) of LDSs can be considered as the base space of a principal fiber bundle comprised of state-space realizations. This is due to a Lie group action symmetry present in the state-space representation of LDSs. The basic idea behind the alignment distance is to compare two LDSs by first aligning a pair of their realizations along the respective fibers. Upon a standardization (or bundle reduction) step this alignment process can be expressed as a minimization problem over orthogonal matrices, which can be solved efficiently. The alignment distance differs from most existing distances in that it is a structural or generative distance, since in some sense it compares how two processes are generated. We also briefly discuss averaging LDSs using the alignment distance via minimizing a sum of the squares of distances (namely, the so-called Fréchet mean).
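The symmetry mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' algorithm: it assumes a realization is a triple `(A, B, C)`, uses hypothetical sizes `n`, `m`, `p`, and aligns two realizations with a plain orthogonal Procrustes step on `C` alone (a swapped-in simplification of the minimization over orthogonal matrices). All function names are illustrative.

```python
import numpy as np

def markov_parameters(A, B, C, count=5):
    """Return the first `count` Markov parameters C A^k B of a realization."""
    params, Ak = [], np.eye(A.shape[0])
    for _ in range(count):
        params.append(C @ Ak @ B)
        Ak = Ak @ A
    return params

def change_basis(A, B, C, P):
    """Apply an invertible change of state basis P to a realization."""
    P_inv = np.linalg.inv(P)
    return P_inv @ A @ P, P_inv @ B, C @ P

def align_orthogonal(C1, C2):
    """Orthogonal Procrustes: the orthogonal Q minimizing ||C1 Q - C2||_F."""
    U, _, Vt = np.linalg.svd(C1.T @ C2)
    return U @ Vt

def squared_gap(R1, R2):
    """Sum of squared Frobenius norms between corresponding matrices."""
    return sum(np.linalg.norm(X - Y) ** 2 for X, Y in zip(R1, R2))

rng = np.random.default_rng(0)
n, m, p = 3, 2, 8                      # hypothetical state, input, output sizes
A1 = 0.5 * rng.standard_normal((n, n))
B1 = rng.standard_normal((n, m))
C1 = rng.standard_normal((p, n))

# A second realization of the same system, rotated by a random orthogonal Q0.
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
A2, B2, C2 = change_basis(A1, B1, C1, Q0)

# Both realizations share the same Markov parameters, hence the same system.
same_system = all(
    np.allclose(H1, H2)
    for H1, H2 in zip(markov_parameters(A1, B1, C1), markov_parameters(A2, B2, C2))
)

# Naive comparison of the raw matrices versus comparison after alignment.
raw_gap = squared_gap((A1, B1, C1), (A2, B2, C2))
Q = align_orthogonal(C1, C2)
aligned_gap = squared_gap(change_basis(A1, B1, C1, Q), (A2, B2, C2))

print(same_system, raw_gap, aligned_gap)
```

Under these assumptions the raw matrices differ although the two realizations describe the same system, and the gap vanishes (up to rounding) once the first realization is rotated along its fiber. Aligning on `C` alone suffices here only because `C1` has full column rank and the two realizations are exactly related by an orthogonal matrix; in general the `A` and `B` terms would also enter the cost.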


Notes

  1. Typically in video analysis: \(p \approx 1000\)–\(10000\), \(m, n \approx 10\) (see, e.g., [1, 12, 60]).

  2. Note that in a different or more general setting the noise at the output could be a process \(\varvec{w}_t\) different (independent) from the input noise \(\varvec{v}_t\). This does not cause major changes in our developments. Since the output noise usually represents a perturbation which cannot be modeled, as far as Problem 1 is concerned, one could usually assume that \(D_i=0\).

  3. Note that we are not implying that ARMA models are incapable of modeling such time series. Rather, the issue is that general or unrestricted ARMA models suffer from the curse of dimensionality in the identification problem, and the parametrization of a restricted class of ARMA models with a small number of parameters is complicated [20]. State-space models, by contrast, make it easier to overcome the curse of dimensionality, and this approach naturally leads to simple and effective identification algorithms [20, 22].

  4. Strictly speaking, in order to be the PSD matrix of a regular stationary process, a matrix function on \([0,2\pi ]\) must satisfy other mild technical conditions (see [62] for details).

  5. In fact, our approach (in Sects. 8.3–8.5) is also based on the idea of comparing the minimum phase (i.e., canonical) filters or factors in the case of processes with rational spectra. However, instead of comparing the associated transfer functions or impulse responses, we try to compare the associated state-space realizations (in a specific sense). This approach, therefore, is in some sense structural or generative, since it tries to compare how the processes are generated (according to the state-space representation) and the model order plays an explicit role in it.

  6. Notice that defining distances between probability densities in the time domain is a more general approach than the PSD-based approaches, and it can be employed in the case of nonstationary as well as non-Gaussian processes. However, such an approach, in general, is computationally difficult.

  7. Interestingly, this property holds for an average defined based on the Itakura-Saito divergence in the space of 1D AR models [26]; see also [5, Sect. 5.3].

  8. It is interesting to note that by a simple modification some of the spectral-ratio based distances can attain this property, e.g., by modifying \(d_\mathrm{R}\) in (8.8) as \(d_\mathrm{RI}^2({\varvec{y}}^1,{\varvec{y}}^2)=\int \big (\log \big (\frac{P_{{\varvec{y}}^1}}{P_{{\varvec{y}}^2}}\big )\big )^2\,\mathrm{d}\omega -\big (\int \log \big (\frac{P_{{\varvec{y}}^1}}{P_{{\varvec{y}}^2}}\big )\,\mathrm{d}\omega \big )^2\) (see also [9, 25, 49]).

  9. This and the results in [53] underline the fact that defining distances on \(\mathcal {P}_p\) for \(p>1\) may be challenging, not only from a computational point of view but also from a theoretical one. In particular, certain nice properties in 1D do not automatically carry over to higher dimensions by a simple extension of the definitions in 1D.

  10. It is crucial to keep in mind that we explicitly distinguish between the LDS, \(M\), and its realization \(R\), which is not unique. As will become clear soon, an LDS has an equivalence class of realizations.

  11. These rank conditions, interestingly, have differential geometric significance in yielding nice quotient spaces; see Sect. 8.4.

  12. Strictly speaking \(\bullet \) is a right action; however, it is notationally convenient to write it as a left action in (8.12).

  13. We may call this an alignment distance. However, based on the same principle, in Sect. 8.5 we define another group action induced distance, which we explicitly call the alignment distance. Since our main object of interest is that distance, we prefer not to call the distance in (8.13) an alignment distance.

  14. It is interesting to note that some of the good properties of the \(k\)-nearest neighbor algorithms on a general metric space depend on the triangle inequality [21].

  15. This problem is difficult in general, among other reasons because it is a non-convex (infinite-dimensional) variational problem. Recall that in Riemannian geometry the non-convexity of the arc length variational problem can be related to the non-trivial topology of the manifold (see, e.g., [17]).
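The modified distance in Note 8 can be evaluated numerically. The sketch below rests on two assumptions that the note itself does not state: that the property in question is invariance to rescaling of the PSDs, and that the integrals are taken with respect to the normalized measure \(\mathrm{d}\omega /2\pi \) on \([0,2\pi )\), in which case \(d_\mathrm{RI}^2\) is the variance of the log spectral ratio. The scalar AR(1) spectra (written up to a constant normalization factor) and all function names are hypothetical, chosen only for illustration.

```python
import numpy as np

def ar1_psd(a, sigma2, omega):
    """PSD of a scalar AR(1) process y_t = a*y_{t-1} + e_t with Var(e_t) = sigma2."""
    return sigma2 / np.abs(1.0 - a * np.exp(-1j * omega)) ** 2

def d_ri_squared(P1, P2):
    """Variance of log(P1/P2) over a uniform frequency grid (normalized measure)."""
    log_ratio = np.log(P1 / P2)
    return np.mean(log_ratio ** 2) - np.mean(log_ratio) ** 2

# Uniform grid on [0, 2*pi); the grid mean approximates the normalized integral.
omega = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
P1 = ar1_psd(0.8, 1.0, omega)
P2 = ar1_psd(-0.3, 2.0, omega)

d_plain = d_ri_squared(P1, P2)
d_scaled = d_ri_squared(5.0 * P1, P2)   # rescale the first spectrum by 5
d_self = d_ri_squared(P1, P1)

print(d_plain, d_scaled, d_self)
```

Under these assumptions, rescaling either spectrum shifts the log ratio by a constant, which the subtracted squared-mean term cancels, so `d_plain` and `d_scaled` agree up to rounding.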

References

  1. Afsari, B., Chaudhry, R., Ravichandran, A., Vidal, R.: Group action induced distances for averaging and clustering linear dynamical systems with applications to the analysis of dynamic visual scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)

  2. Afsari, B., Vidal, R.: The alignment distance on spaces of linear dynamical systems. In: IEEE Conference on Decision and Control (2013)

  3. Afsari, B., Vidal, R.: Group action induced distances on spaces of high-dimensional linear stochastic processes. In: Geometric Science of Information, LNCS, vol. 8085, pp. 425–432 (2013)

  4. Amari, S.I.: Differential geometry of a parametric family of invertible linear systems-Riemannian metric, dual affine connections, and divergence. Math. Syst. Theory 20, 53–82 (1987)

  5. Amari, S.I., Nagaoka, H.: Methods of information geometry. In: Translations of Mathematical Monographs, vol. 191. American Mathematical Society, Providence (2000)

  6. Anderson, B.D., Deistler, M.: Properties of zero-free spectral matrices. IEEE Trans. Autom. Control 54(10), 2365–5 (2009)

  7. Aoki, M.: State Space Modeling of Time Series. Springer, Berlin (1987)

  8. Barbaresco, F.: Information geometry of covariance matrix: Cartan-Siegel homogeneous bounded domains, Mostow/Berger fibration and Frechet median. In: Matrix Information Geometry, pp. 199–255. Springer, Berlin (2013)

  9. Basseville, M.: Distance measures for signal processing and pattern recognition. Sig. Process. 18, 349–9 (1989)

  10. Basseville, M.: Divergence measures for statistical data processing: an annotated bibliography. Sig. Process. 93(4), 621–33 (2013)

  11. Bauer, D., Deistler, M.: Balanced canonical forms for system identification. IEEE Trans. Autom. Control 44(6), 1118–1131 (1999)

  12. Béjar, B., Zappella, L., Vidal, R.: Surgical gesture classification from video data. In: Medical Image Computing and Computer Assisted Intervention, pp. 34–41 (2012)

  13. Boets, J., Cock, K.D., Moor, B.D.: A mutual information based distance for multivariate Gaussian processes. In: Modeling, Estimation and Control, Festschrift in Honor of Giorgio Picci on the Occasion of his Sixty-Fifth Birthday, Lecture Notes in Control and Information Sciences, vol. 364, pp. 15–33. Springer, Berlin (2007)

  14. Bonnabel, S., Collard, A., Sepulchre, R.: Rank-preserving geometric means of positive semi-definite matrices. Linear Algebra Appl. 438, 3202–16 (2013)

  15. Byrnes, C.I., Hurt, N.: On the moduli of linear dynamical systems. In: Advances in Mathematical Studies in Analysis, vol. 4, pp. 83–122. Academic Press, New York (1979)

  16. Chaudhry, R., Vidal, R.: Recognition of visual dynamical processes: Theory, kernels and experimental evaluation. Technical Report 09–01. Department of Computer Science, Johns Hopkins University (2009)

  17. Chavel, I.: Riemannian Geometry: A Modern Introduction, vol. 98, 2nd edn. Cambridge University Press, Cambridge (2006)

  18. Cock, K.D., Moor, B.D.: Subspace angles and distances between ARMA models. Syst. Control Lett. 46(4), 265–70 (2002)

  19. Corduas, M., Piccolo, D.: Time series clustering and classification by the autoregressive metric. Comput. Stat. Data Anal. 52(4), 1860–72 (2008)

  20. Deistler, M., Anderson, B.O., Filler, A., Zinner, C., Chen, W.: Generalized linear dynamic factor models: an approach via singular autoregressions. Eur. J. Control 3, 211–24 (2010)

  21. Devroye, L.: A Probabilistic Theory of Pattern Recognition, vol. 31. Springer, Berlin (1996)

  22. Doretto, G., Chiuso, A., Wu, Y., Soatto, S.: Dynamic textures. Int. J. Comput. Vision 51(2), 91–109 (2003)

  23. Ferrante, A., Pavon, M., Ramponi, F.: Hellinger versus Kullback-Leibler multivariable spectrum approximation. IEEE Trans. Autom. Control 53(4), 954–67 (2008)

  24. Forni, M., Hallin, M., Lippi, M., Reichlin, L.: The generalized dynamic-factor model: Identification and estimation. Rev. Econ. Stat. 82(4), 540–54 (2000)

  25. Georgiou, T.T., Karlsson, J., Takyar, M.S.: Metrics for power spectra: an axiomatic approach. IEEE Trans. Signal Process. 57(3), 859–67 (2009)

  26. Gray, R., Buzo, A., Gray Jr, A., Matsuyama, Y.: Distortion measures for speech processing. IEEE Trans. Acoust. Speech Signal Process. 28(4), 367–76 (1980)

  27. Gray, R.M.: Probability, Random Processes, and Ergodic Properties. Springer, Berlin (2009)

  28. Gray, R.M., Neuhoff, D.L., Shields, P.C.: A generalization of Ornstein’s \(\bar{d}\) distance with applications to information theory. Ann. Probab. 3, 315–328 (1975)

  29. Gray Jr, A., Markel, J.: Distance measures for speech processing. IEEE Trans. Acoust. Speech Signal Process. 24(5), 380–91 (1976)

  30. Grenander, U.: Abstract Inference. Wiley, New York (1981)

  31. Hannan, E.J.: Multiple Time Series, vol. 38. Wiley, New York (1970)

  32. Hannan, E.J., Deistler, M.: The Statistical Theory of Linear Systems. Wiley, New York (1987)

  33. Hanzon, B.: Identifiability, Recursive Identification and Spaces of Linear Dynamical Systems, vol. 63–64. Centrum voor Wiskunde en Informatica (CWI), Amsterdam (1989)

  34. Hanzon, B., Marcus, S.I.: Riemannian metrics on spaces of stable linear systems, with applications to identification. In: IEEE Conference on Decision & Control, pp. 1119–1124 (1982)

  35. Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning. Springer, New York (2003)

  36. Hazewinkel, M.: Moduli and canonical forms for linear dynamical systems II: the topological case. Math. Syst. Theory 10, 363–85 (1977)

  37. Helmke, U.: Balanced realizations for linear systems: a variational approach. SIAM J. Control Optim. 31(1), 1–15 (1993)

  38. Jiang, X., Ning, L., Georgiou, T.T.: Distances and Riemannian metrics for multivariate spectral densities. IEEE Trans. Autom. Control 57(7), 1723–35 (2012)

  39. Jimenez, N.D., Afsari, B., Vidal, R.: Fast Jacobi-type algorithm for computing distances between linear dynamical systems. In: European Control Conference (2013)

  40. Kailath, T.: Linear Systems. Prentice Hall, NJ (1980)

  41. Katayama, T.: Subspace Methods for System Identification. Springer, Berlin (2005)

  42. Kazakos, D., Papantoni-Kazakos, P.: Spectral distance measures between Gaussian processes. IEEE Trans. Autom. Control 25(5), 950–9 (1980)

  43. Kendall, D.G., Barden, D., Carne, T.K., Le, H.: Shape and Shape Theory. Wiley Series In Probability And Statistics. Wiley, New York (1999)

  44. Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry Volume I. Wiley Classics Library Edition. Wiley, New York (1963)

  45. Krishnaprasad, P.S.: Geometry of Minimal Systems and the Identification Problem. PhD thesis, Harvard University (1977)

  46. Krishnaprasad, P.S., Martin, C.F.: On families of systems and deformations. Int. J. Control 38(5), 1055–79 (1983)

  47. Lee, J.M.: Introduction to Smooth Manifolds. Springer, Graduate Texts in Mathematics (2002)

  48. Liao, T.W.: Clustering time series data—a survey. Pattern Recogn. 38, 1857–74 (2005)

  49. Makhoul, J.: Linear prediction: a tutorial review. Proc. IEEE 63(4), 561–80 (1975)

  50. Martin, A.: A metric for ARMA processes. IEEE Trans. Signal Process. 48(4), 1164–70 (2000)

  51. Moor, B.D., Overschee, P.V., Suykens, J.: Subspace algorithms for system identification and stochastic realization. Technical Report ESAT-SISTA Report 1990–28, Katholieke Universiteit Leuven (1990)

  52. Moore, B.C.: Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Trans. Autom. Control 26, 17–32 (1981)

  53. Ning, L., Georgiou, T.T., Tannenbaum, A.: Matrix-valued Monge-Kantorovich optimal mass transport. arXiv, preprint arXiv:1304.3931 (2013)

  54. Nocerino, N., Soong, F.K., Rabiner, L.R., Klatt, D.H.: Comparative study of several distortion measures for speech recognition. Speech Commun. 4(4), 317–31 (1985)

  55. Ober, R.J.: Balanced realizations: canonical form, parametrization, model reduction. Int. J. Control 46(2), 643–70 (1987)

  56. Papoulis, A., Pillai, S.U.: Probability, random variables and stochastic processes with errata sheet. McGraw-Hill Education, New York (2002)

  57. Piccolo, D.: A distance measure for classifying ARIMA models. J. Time Ser. Anal. 11(2), 153–64 (1990)

  58. Rabiner, L., Juang, B.-H.: Fundamentals of Speech Recognition. Prentice-Hall International, NJ (1993)

  59. Rao, M.M.: Stochastic Processes: Inference Theory, vol. 508. Springer, New York (2000)

  60. Ravichandran, A., Vidal, R.: Video registration using dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 158–171 (2011)

  61. Ravishanker, N., Melnick, E.L., Tsai, C.-L.: Differential geometry of ARMA models. J. Time Ser. Anal. 11(3), 259–274 (1990)

  62. Rozanov, Y.A.: Stationary Random Processes. Holden-Day, San Francisco (1967)

  63. Vandereycken, B., Absil, P.-A., Vandewalle, S.: A Riemannian geometry with complete geodesics for the set of positive semi-definite matrices of fixed rank. Technical Report TW572, Katholieke Universiteit Leuven (2010)

  64. Vishwanathan, S., Smola, A., Vidal, R.: Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes. Int. J. Comput. Vision 73(1), 95–119 (2007)

  65. Youla, D.: On the factorization of rational matrices. IRE Trans. Inf. Theory 7(3), 172–189 (1961)

  66. Younes, L.: Shapes and Diffeomorphisms. In: Applied Mathematical Sciences, vol. 171. Springer, New York (2010)


Acknowledgments

The authors are thankful to the anonymous reviewers for their insightful comments and suggestions, which helped to improve the quality of this paper. The authors also thank the organizers of the GSI 2013 conference and the editor of this book Prof. Frank Nielsen. This work was supported by the Sloan Foundation and by grants ONR N00014-09-10084, NSF 0941362, NSF 0941463, NSF 0931805, and NSF 1335035.

Author information

Corresponding author

Correspondence to Bijan Afsari .

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Afsari, B., Vidal, R. (2014). Distances on Spaces of High-Dimensional Linear Stochastic Processes: A Survey. In: Nielsen, F. (eds) Geometric Theory of Information. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-05317-2_8

  • DOI: https://doi.org/10.1007/978-3-319-05317-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05316-5

  • Online ISBN: 978-3-319-05317-2

  • eBook Packages: Engineering (R0)
