Distances on Spaces of High-Dimensional Linear Stochastic Processes: A Survey

Afsari, Bijan; Vidal, René

doi:10.1007/978-3-319-05317-2_8

Part of the book series: Signals and Communication Technology (SCT)

Abstract

In this paper we study the geometrization of certain spaces of stochastic processes. Our main motivation comes from the problem of pattern recognition in high-dimensional time-series data (e.g., video sequence classification and clustering). In the first part of the paper, we provide a rather extensive review of some existing approaches to defining distances on spaces of stochastic processes. The majority of these distances are, in one way or another, based on comparing power spectral densities of the processes. In the second part, we focus on the space of processes generated by (stochastic) linear dynamical systems (LDSs) of fixed size and order, for which we recently introduced a class of group action induced distances called the alignment distances. This space is a natural choice in some pattern recognition applications and is also of great interest in control theory, where it is often convenient to represent LDSs in state-space form. In this case the space (more precisely manifold) of LDSs can be considered as the base space of a principal fiber bundle comprised of state-space realizations. This is due to a Lie group action symmetry present in the state-space representation of LDSs. The basic idea behind the alignment distance is to compare two LDSs by first aligning a pair of their realizations along the respective fibers. Upon a standardization (or bundle reduction) step this alignment process can be expressed as a minimization problem over orthogonal matrices, which can be solved efficiently. The alignment distance differs from most existing distances in that it is a structural or generative distance, since in some sense it compares how two processes are generated. We also briefly discuss averaging LDSs using the alignment distance via minimizing a sum of the squares of distances (namely, the so-called Fréchet mean).
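
The alignment idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes a realization is reduced to a pair (A, C) with state dimension n = 2, that an orthogonal change of basis Q acts as (Q^T A Q, C Q), and that the cost is a sum of squared Frobenius norms; the brute-force scan over O(2) and all function names are hypothetical choices made only for illustration.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix by angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def change_basis(A, C, Q):
    """Apply an orthogonal change of state basis Q to the realization (A, C)."""
    return Q.T @ A @ Q, C @ Q

def realization_cost(A1, C1, A2, C2):
    """Squared Frobenius discrepancy between two realizations (assumed cost)."""
    return np.linalg.norm(A1 - A2) ** 2 + np.linalg.norm(C1 - C2) ** 2

def align_bruteforce(A1, C1, A2, C2, num_angles=720):
    """Minimize the cost over O(2) by scanning rotations and reflections."""
    reflection = np.diag([1.0, -1.0])
    best = np.inf
    for theta in np.linspace(0.0, 2.0 * np.pi, num_angles, endpoint=False):
        for Q in (rotation(theta), rotation(theta) @ reflection):
            A2q, C2q = change_basis(A2, C2, Q)
            best = min(best, realization_cost(A1, C1, A2q, C2q))
    return best

# Two realizations of the same system, differing by a 40 degree rotation.
rng = np.random.default_rng(0)
A1 = np.array([[0.9, 0.2], [-0.1, 0.7]])
C1 = rng.standard_normal((5, 2))
A2, C2 = change_basis(A1, C1, rotation(np.deg2rad(40.0)))

naive = realization_cost(A1, C1, A2, C2)    # compares raw matrices
aligned = align_bruteforce(A1, C1, A2, C2)  # compares after alignment
print(f"naive cost:   {naive:.4f}")
print(f"aligned cost: {aligned:.2e}")
```

The naive cost is positive although both realizations describe the same system, while the aligned cost is numerically zero. This is the reason for aligning realizations before comparing them.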

Notes

1.
Typically in video analysis: \(p\,\approx \,\)1000–10000, \(m, n\approx 10\) (see e.g., [1, 12, 60]).
2.
Note that in a different or more general setting the noise at the output could be a process \(\varvec{w}_t\) different (independent) from the input noise \(\varvec{v}_t\). This does not cause major changes in our developments. Since the output noise usually represents a perturbation which cannot be modeled, as far as Problem 1 is concerned, one could usually assume that \(D_i=0\).
3.
Note that we are not implying that ARMA models are incapable of modeling such time series. Rather, the issue is that general or unrestricted ARMA models suffer from the curse of dimensionality in the identification problem, and the parametrization of a restricted class of ARMA models with a small number of parameters is complicated [20]. State-space models, by contrast, make it easier to overcome the curse of dimensionality, and this approach leads naturally to simple and effective identification algorithms [20, 22].
4.
Strictly speaking, in order to be the PSD matrix of a regular stationary process, a matrix function on \([0,2\pi ]\) must satisfy other mild technical conditions (see [62] for details).
5.
In fact, our approach (in Sects. 8.3–8.5) is also based on the idea of comparing the minimum phase (i.e., canonical) filters or factors in the case of processes with rational spectra. However, instead of comparing the associated transfer functions or impulse responses, we try to compare the associated state-space realizations (in a specific sense). This approach, therefore, is in some sense structural or generative, since it tries to compare how the processes are generated (according to the state-space representation) and the model order plays an explicit role in it.
6.
Notice that defining distances between probability densities in the time domain is a more general approach than the PSD-based approaches, and it can be employed in the case of nonstationary as well as non-Gaussian processes. However, such an approach, in general, is computationally difficult.
7.
Interestingly, for an average defined based on the Itakura-Saito divergence in the space of 1D AR models this property holds [26], see also [5, Sect. 5.3].
8.
It is interesting to note that by a simple modification some of the spectral-ratio based distances can attain this property, e.g., by modifying \(d_\mathrm{R }\) in (8.8) as \(d_\mathrm{RI }^2({\varvec{y}}^1,{\varvec{y}}^2)=\int \big (\log \big (\frac{P_{{\varvec{y}}^1}}{P_{{\varvec{y}}^2}}\big )\big )^2\mathrm d \omega -\big (\int \log \big (\frac{P_{{\varvec{y}}^1}}{P_{{\varvec{y}}^2}}\big )\mathrm d \omega \big )^2\) (see also [9, 25, 49]).
9.
This and the results in [53] underline the fact that defining distances on \(\mathcal {P}_p\) for \(p>1\) may be challenging, not only from a computational point of view but also from a theoretical one. In particular, certain nice properties in 1D do not automatically carry over to higher dimensions by a simple extension of the definitions in 1D.
10.
It is crucial to keep in mind that we explicitly distinguish between the LDS, \(M\), and its realization \(R\), which is not unique. As will become clear soon, an LDS has an equivalence class of realizations.
11.
These rank conditions, interestingly, have differential geometric significance in yielding nice quotient spaces, see Sect. 8.4.
12.
Strictly speaking \(\bullet \) is a right action; however, it is notationally convenient to write it as a left action in (8.12).
13.
We may call this an alignment distance. However, based on the same principle in Sect. 8.5 we define another group action induced distance, which we explicitly call the alignment distance. Since our main object of interest is that distance, we prefer not to call the distance in (8.13) an alignment distance.
14.
It is interesting to note that some of the good properties of the \(k\)-nearest neighbor algorithms on a general metric space depend on the triangle inequality [21].
15.
This problem, in general, is difficult, among other things, because it is a non-convex (infinite-dimensional) variational problem. Recall that in Riemannian geometry the non-convexity of the arc length variational problem can be related to the non-trivial topology of the manifold (see e.g., [17]).
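
As a numerical illustration of the modified spectral-ratio distance in note 8, the following is a minimal Python sketch, not taken from the chapter. It assumes scalar (1D) spectra, a normalized frequency measure dω/(2π) on [0, 2π) so that the expression becomes the variance of the log spectral ratio over frequency, and two hypothetical AR(1) processes. The helper names `ar1_psd` and `d_ri_squared` are illustrative. Under these assumptions the value does not change when one spectrum is multiplied by a positive constant, which the sketch checks.

```python
import numpy as np

def ar1_psd(a, sigma2, omega):
    """PSD of the AR(1) process y_t = a * y_{t-1} + v_t with Var(v_t) = sigma2."""
    return sigma2 / np.abs(1.0 - a * np.exp(-1j * omega)) ** 2

def d_ri_squared(p1, p2):
    """Variance of log(p1 / p2) over a uniform frequency grid.

    Assumes the integrals in note 8 use the normalized measure d(omega) / (2 pi),
    so each integral is approximated by a mean over the grid.
    """
    log_ratio = np.log(p1 / p2)
    return np.mean(log_ratio ** 2) - np.mean(log_ratio) ** 2

omega = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
p1 = ar1_psd(0.5, 1.0, omega)
p2 = ar1_psd(-0.3, 1.0, omega)

base = d_ri_squared(p1, p2)
scaled = d_ri_squared(7.0 * p1, p2)  # rescale the first spectrum
same = d_ri_squared(p1, p1)          # a spectrum compared with itself

print(f"d_RI^2(y1, y2):     {base:.6f}")
print(f"d_RI^2(7 * y1, y2): {scaled:.6f}")
print(f"d_RI^2(y1, y1):     {same:.2e}")
```

Subtracting the squared mean of the log ratio removes any constant offset, so multiplying a spectrum by a positive constant has no effect on the result.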

References

Afsari, B., Chaudhry, R., Ravichandran, A., Vidal, R.: Group action induced distances for averaging and clustering linear dynamical systems with applications to the analysis of dynamic visual scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Afsari, B., Vidal, R.: The alignment distance on spaces of linear dynamical systems. In: IEEE Conference on Decision and Control (2013)
Afsari, B., Vidal, R.: Group action induced distances on spaces of high-dimensional linear stochastic processes. In: Geometric Science of Information, LNCS, vol. 8085, pp. 425–432 (2013)
Amari, S.I.: Differential geometry of a parametric family of invertible linear systems-Riemannian metric, dual affine connections, and divergence. Math. Syst. Theory 20, 53–82 (1987)
Amari, S.I., Nagaoka, H.: Methods of information geometry. In: Translations of Mathematical Monographs, vol. 191. American Mathematical Society, Providence (2000)
Anderson, B.D., Deistler, M.: Properties of zero-free spectral matrices. IEEE Trans. Autom. Control 54(10), 2365–5 (2009)
Aoki, M.: State Space Modeling of Time Series. Springer, Berlin (1987)
Barbaresco, F.: Information geometry of covariance matrix: Cartan-Siegel homogeneous bounded domains, Mostow/Berger fibration and Frechet median. In: Matrix Information Geometry, pp. 199–255. Springer, Berlin (2013)
Basseville, M.: Distance measures for signal processing and pattern recognition. Sig. Process. 18, 349–9 (1989)
Basseville, M.: Divergence measures for statistical data processing: an annotated bibliography. Sig. Process. 93(4), 621–33 (2013)
Bauer, D., Deistler, M.: Balanced canonical forms for system identification. IEEE Trans. Autom. Control 44(6), 1118–1131 (1999)
Béjar, B., Zappella, L., Vidal, R.: Surgical gesture classification from video data. In: Medical Image Computing and Computer Assisted Intervention, pp. 34–41 (2012)
Boets, J., Cock, K.D., Moor, B.D.: A mutual information based distance for multivariate Gaussian processes. In: Modeling, Estimation and Control, Festschrift in Honor of Giorgio Picci on the Occasion of his Sixty-Fifth Birthday, Lecture Notes in Control and Information Sciences, vol. 364, pp. 15–33. Springer, Berlin (2007)
Bonnabel, S., Collard, A., Sepulchre, R.: Rank-preserving geometric means of positive semi-definite matrices. Linear Algebra Appl. 438, 3202–16 (2013)
Byrnes, C.I., Hurt, N.: On the moduli of linear dynamical systems. In: Advances in Mathematical Studies in Analysis, vol. 4, pp. 83–122. Academic Press, New York (1979)
Chaudhry, R., Vidal, R.: Recognition of visual dynamical processes: Theory, kernels and experimental evaluation. Technical Report 09–01. Department of Computer Science, Johns Hopkins University (2009)
Chavel, I.: Riemannian Geometry: A Modern Introduction, vol. 98, 2nd edn. Cambridge University Press, Cambridge (2006)
Cock, K.D., Moor, B.D.: Subspace angles and distances between ARMA models. Syst. Control Lett. 46(4), 265–70 (2002)
Corduas, M., Piccolo, D.: Time series clustering and classification by the autoregressive metric. Comput. Stat. Data Anal. 52(4), 1860–72 (2008)
Deistler, M., Anderson, B.O., Filler, A., Zinner, C., Chen, W.: Generalized linear dynamic factor models: an approach via singular autoregressions. Eur. J. Control 3, 211–24 (2010)
Devroye, L.: A Probabilistic Theory of Pattern Recognition, vol. 31. Springer, Berlin (1996)
Doretto, G., Chiuso, A., Wu, Y., Soatto, S.: Dynamic textures. Int. J. Comput. Vision 51(2), 91–109 (2003)
Ferrante, A., Pavon, M., Ramponi, F.: Hellinger versus Kullback-Leibler multivariable spectrum approximation. IEEE Trans. Autom. Control 53(4), 954–67 (2008)
Forni, M., Hallin, M., Lippi, M., Reichlin, L.: The generalized dynamic-factor model: Identification and estimation. Rev. Econ. Stat. 82(4), 540–54 (2000)
Georgiou, T.T., Karlsson, J., Takyar, M.S.: Metrics for power spectra: an axiomatic approach. IEEE Trans. Signal Process. 57(3), 859–67 (2009)
Gray, R., Buzo, A., Gray Jr, A., Matsuyama, Y.: Distortion measures for speech processing. IEEE Trans. Acoust. Speech Signal Process. 28(4), 367–76 (1980)
Gray, R.M.: Probability, Random Processes, and Ergodic Properties. Springer, Berlin (2009)
Gray, R.M., Neuhoff, D.L., Shields, P.C.: A generalization of Ornstein’s \(\bar{d}\) distance with applications to information theory. Ann. Probab. 3, 315–328 (1975)
Gray Jr, A., Markel, J.: Distance measures for speech processing. IEEE Trans. Acoust. Speech Signal Process. 24(5), 380–91 (1976)
Grenander, U.: Abstract Inference. Wiley, New York (1981)
Hannan, E.J.: Multiple Time Series, vol. 38. Wiley, New York (1970)
Hannan, E.J., Deistler, M.: The Statistical Theory of Linear Systems. Wiley, New York (1987)
Hanzon, B.: Identifiability, Recursive Identification and Spaces of Linear Dynamical Systems, vol. 63–64. Centrum voor Wiskunde en Informatica (CWI), Amsterdam (1989)
Hanzon, B., Marcus, S.I.: Riemannian metrics on spaces of stable linear systems, with applications to identification. In: IEEE Conference on Decision & Control, pp. 1119–1124 (1982)
Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning. Springer, New York (2003)
Hazewinkel, M.: Moduli and canonical forms for linear dynamical systems II: the topological case. Math. Syst. Theory 10, 363–85 (1977)
Helmke, U.: Balanced realizations for linear systems: a variational approach. SIAM J. Control Optim. 31(1), 1–15 (1993)
Jiang, X., Ning, L., Georgiou, T.T.: Distances and Riemannian metrics for multivariate spectral densities. IEEE Trans. Autom. Control 57(7), 1723–35 (2012)
Jimenez, N.D., Afsari, B., Vidal, R.: Fast Jacobi-type algorithm for computing distances between linear dynamical systems. In: European Control Conference (2013)
Kailath, T.: Linear Systems. Prentice Hall, NJ (1980)
Katayama, T.: Subspace Methods for System Identification. Springer, Berlin (2005)
Kazakos, D., Papantoni-Kazakos, P.: Spectral distance measures between Gaussian processes. IEEE Trans. Autom. Control 25(5), 950–9 (1980)
Kendall, D.G., Barden, D., Carne, T.K., Le, H.: Shape and Shape Theory. Wiley Series In Probability And Statistics. Wiley, New York (1999)
Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry Volume I. Wiley Classics Library Edition. Wiley, New York (1963)
Krishnaprasad, P.S.: Geometry of Minimal Systems and the Identification Problem. PhD thesis, Harvard University (1977)
Krishnaprasad, P.S., Martin, C.F.: On families of systems and deformations. Int. J. Control 38(5), 1055–79 (1983)
Lee, J.M.: Introduction to Smooth Manifolds. Springer, Graduate Texts in Mathematics (2002)
Liao, T.W.: Clustering time series data—a survey. Pattern Recogn. 38, 1857–74 (2005)
Makhoul, J.: Linear prediction: a tutorial review. Proc. IEEE 63(4), 561–80 (1975)
Martin, A.: A metric for ARMA processes. IEEE Trans. Signal Process. 48(4), 1164–70 (2000)
Moor, B.D., Overschee, P.V., Suykens, J.: Subspace algorithms for system identification and stochastic realization. Technical Report ESAT-SISTA Report 1990–28, Katholieke Universiteit Leuven (1990)
Moore, B.C.: Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Trans. Autom. Control 26, 17–32 (1981)
Ning, L., Georgiou, T.T., Tannenbaum, A.: Matrix-valued Monge-Kantorovich optimal mass transport. arXiv, preprint arXiv:1304.3931 (2013)
Nocerino, N., Soong, F.K., Rabiner, L.R., Klatt, D.H.: Comparative study of several distortion measures for speech recognition. Speech Commun. 4(4), 317–31 (1985)
Ober, R.J.: Balanced realizations: canonical form, parametrization, model reduction. Int. J. Control 46(2), 643–70 (1987)
Papoulis, A., Pillai, S.U.: Probability, random variables and stochastic processes with errata sheet. McGraw-Hill Education, New York (2002)
Piccolo, D.: A distance measure for classifying ARIMA models. J. Time Ser. Anal. 11(2), 153–64 (1990)
Rabiner, L., Juang, B.-H.: Fundamentals of Speech Recognition. Prentice-Hall International, NJ (1993)
Rao, M.M.: Stochastic Processes: Inference Theory, vol. 508. Springer, New York (2000)
Ravichandran, A., Vidal, R.: Video registration using dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 158–171 (2011)
Ravishanker, N., Melnick, E.L., Tsai, C.-L.: Differential geometry of ARMA models. J. Time Ser. Anal. 11(3), 259–274 (1990)
Rozanov, Y.A.: Stationary Random Processes. Holden-Day, San Francisco (1967)
Vandereycken, B., Absil, P.-A., Vandewalle, S.: A Riemannian geometry with complete geodesics for the set of positive semi-definite matrices of fixed rank. Technical Report TW572, Katholieke Universiteit Leuven (2010)
Vishwanathan, S., Smola, A., Vidal, R.: Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes. Int. J. Comput. Vision 73(1), 95–119 (2007)
Youla, D.: On the factorization of rational matrices. IRE Trans. Inf. Theory 7(3), 172–189 (1961)
Younes, L.: Shapes and Diffeomorphisms. In: Applied Mathematical Sciences, vol. 171. Springer, New York (2010)

Acknowledgments

The authors are thankful to the anonymous reviewers for their insightful comments and suggestions, which helped to improve the quality of this paper. The authors also thank the organizers of the GSI 2013 conference and the editor of this book Prof. Frank Nielsen. This work was supported by the Sloan Foundation and by grants ONR N00014-09-10084, NSF 0941362, NSF 0941463, NSF 0931805, and NSF 1335035.

Author information

Authors and Affiliations

Center for Imaging Science, Johns Hopkins University, Baltimore, MD, 21218, USA
Bijan Afsari & René Vidal

Corresponding author

Correspondence to Bijan Afsari.

Editor information

Editors and Affiliations

Laboratoire d'Informatique (LIX), Polytechnique School, Palaiseau Cedex, France
Frank Nielsen

Cite this chapter

Afsari, B., Vidal, R. (2014). Distances on Spaces of High-Dimensional Linear Stochastic Processes: A Survey. In: Nielsen, F. (eds) Geometric Theory of Information. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-05317-2_8

DOI: https://doi.org/10.1007/978-3-319-05317-2_8
Published: 09 May 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05316-5
Online ISBN: 978-3-319-05317-2
eBook Packages: Engineering, Engineering (R0)
