Abstract
In this paper we study the geometrization of certain spaces of stochastic processes. Our main motivation comes from the problem of pattern recognition in high-dimensional time-series data (e.g., video sequence classification and clustering). In the first part of the paper, we provide a rather extensive review of existing approaches to defining distances on spaces of stochastic processes. The majority of these distances are, in one way or another, based on comparing the power spectral densities of the processes. In the second part, we focus on the space of processes generated by (stochastic) linear dynamical systems (LDSs) of fixed size and order, for which we recently introduced a class of group-action-induced distances called the alignment distances. This space is a natural choice in some pattern recognition applications and is also of great interest in control theory, where it is often convenient to represent LDSs in state-space form. In this case the space (more precisely, the manifold) of LDSs can be considered as the base space of a principal fiber bundle composed of state-space realizations. This is due to a Lie group action symmetry present in the state-space representation of LDSs. The basic idea behind the alignment distance is to compare two LDSs by first aligning a pair of their realizations along the respective fibers. After a standardization (or bundle reduction) step, this alignment process can be expressed as a minimization problem over orthogonal matrices, which can be solved efficiently. The alignment distance differs from most existing distances in that it is a structural or generative distance, since in some sense it compares how two processes are generated. We also briefly discuss averaging LDSs using the alignment distance by minimizing a sum of squared distances (which defines the so-called Fréchet mean).
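The symmetry and alignment idea described above can be illustrated with a small numerical sketch. It is not the authors' algorithm: the alignment distance involves a standardization step and a minimization over the whole realization, whereas this sketch, as a simplifying assumption, aligns two realizations using the output matrices alone through the classical orthogonal Procrustes solution. The function names `markov_parameters`, `change_basis`, and `procrustes_align`, the triple `(A, B, C)`, and the problem sizes are hypothetical choices made for illustration.

```python
import numpy as np

def markov_parameters(A, B, C, k_max=5):
    """Return [C B, C A B, ..., C A^(k_max-1) B] for the realization (A, B, C)."""
    out, A_power = [], np.eye(A.shape[0])
    for _ in range(k_max):
        out.append(C @ A_power @ B)
        A_power = A_power @ A
    return out

def change_basis(A, B, C, P):
    """Move along the fiber: (A, B, C) -> (P^-1 A P, P^-1 B, C P)."""
    P_inv = np.linalg.inv(P)
    return P_inv @ A @ P, P_inv @ B, C @ P

def procrustes_align(C_from, C_to):
    """Orthogonal Q minimizing ||C_from Q - C_to||_F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(C_from.T @ C_to)
    return U @ Vt

rng = np.random.default_rng(0)
n, m, p = 3, 2, 5                      # state, input, output sizes (assumed)
A = 0.3 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

# A second realization of the same system, obtained by an orthogonal change of basis.
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
A2, B2, C2 = change_basis(A, B, C, Q0)

# Both realizations produce the same Markov parameters, hence the same input-output map.
markov_1 = markov_parameters(A, B, C)
markov_2 = markov_parameters(A2, B2, C2)

# Align realization 2 to realization 1 and measure what is left over.
Q = procrustes_align(C2, C)
A3, B3, C3 = change_basis(A2, B2, C2, Q)
residual = np.sqrt(np.linalg.norm(A3 - A) ** 2
                   + np.linalg.norm(B3 - B) ** 2
                   + np.linalg.norm(C3 - C) ** 2)
print("residual after alignment:", residual)
```

Because the two realizations here describe the same system, the residual after alignment is zero up to rounding. For two different systems the residual would be positive, which is the sense in which an aligned comparison of realizations can serve as a distance between the systems themselves.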
Notes
- 1.
- 2.
Note that in a different or more general setting the noise at the output could be a process \(\varvec{w}_t\) different (independent) from the input noise \(\varvec{v}_t\). This does not cause major changes in our developments. Since the output noise usually represents a perturbation which cannot be modeled, as far as Problem 1 is concerned, one could usually assume that \(D_i=0\).
- 3.
Note that we are not implying that ARMA models are incapable of modeling such time series. Rather, the issue is that general or unrestricted ARMA models suffer from the curse of dimensionality in the identification problem, and the parametrization of a restricted class of ARMA models with a small number of parameters is complicated [20]. State-space models, by contrast, make it easier to overcome the curse of dimensionality, and this approach naturally leads to simple and effective identification algorithms [20, 22].
- 4.
Strictly speaking, in order to be the PSD matrix of a regular stationary process, a matrix function on \([0,2\pi ]\) must satisfy other mild technical conditions (see [62] for details).
- 5.
In fact, our approach (in Sects. 8.3–8.5) is also based on the idea of comparing the minimum phase (i.e., canonical) filters or factors in the case of processes with rational spectra. However, instead of comparing the associated transfer functions or impulse responses, we try to compare the associated state-space realizations (in a specific sense). This approach, therefore, is in some sense structural or generative, since it tries to compare how the processes are generated (according to the state-space representation) and the model order plays an explicit role in it.
- 6.
Notice that defining distances between probability densities in the time domain is a more general approach than the PSD-based approaches, and it can be employed in the case of nonstationary as well as non-Gaussian processes. However, such an approach, in general, is computationally difficult.
- 7.
- 8.
It is interesting to note that by a simple modification some of the spectral-ratio based distances can attain this property, e.g., by modifying \(d_\mathrm{R}\) in (8.8) as \(d_\mathrm{RI}^2({\varvec{y}}^1,{\varvec{y}}^2)=\int \big (\log \big (\frac{P_{{\varvec{y}}^1}}{P_{{\varvec{y}}^2}}\big )\big )^2\,\mathrm{d}\omega -\big (\int \log \big (\frac{P_{{\varvec{y}}^1}}{P_{{\varvec{y}}^2}}\big )\,\mathrm{d}\omega \big )^2\) (see also [9, 25, 49]).
- 9.
This and the results in [53] underline the fact that defining distances on \(\mathcal {P}_p\) for \(p>1\) may be challenging, not only from a computational point of view but also from a theoretical one. In particular, certain nice properties in 1D do not automatically carry over to higher dimensions by a simple extension of the definitions in 1D.
- 10.
It is crucial to keep in mind that we explicitly distinguish between the LDS, \(M\), and its realization \(R\), which is not unique. As will become clear soon, an LDS has an equivalence class of realizations.
- 11.
These rank conditions, interestingly, have differential geometric significance in yielding nice quotient spaces, see Sect. 8.4.
- 12.
Strictly speaking \(\bullet \) is a right action; however, it is notationally convenient to write it as a left action in (8.12).
- 13.
We may call this an alignment distance. However, based on the same principle in Sect. 8.5 we define another group action induced distance, which we explicitly call the alignment distance. Since our main object of interest is that distance, we prefer not to call the distance in (8.13) an alignment distance.
- 14.
It is interesting to note that some of the good properties of the \(k\)-nearest neighbor algorithms on a general metric space depend on the triangle inequality [21].
- 15.
This problem, in general, is difficult, among other things, because it is a non-convex (infinite-dimensional) variational problem. Recall that in Riemannian geometry the non-convexity of the arc length variational problem can be related to the non-trivial topology of the manifold (see e.g., [17]).
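The modified spectral-ratio distance \(d_\mathrm{RI}\) written out in Note 8 can be evaluated numerically, as in the following minimal sketch. It assumes scalar PSDs sampled on a uniform frequency grid and, as a further assumption not stated in the note, takes the integrals with respect to the normalized measure \(\mathrm{d}\omega /2\pi \) on \([0,2\pi )\), so that each integral becomes a sample mean. Under that normalization the expression is the variance of the log spectral ratio, and it is therefore unchanged when either PSD is multiplied by a positive constant. The function names `d_ri_squared` and `ar1_psd` and the AR(1) test spectra are hypothetical choices for illustration.

```python
import numpy as np

def d_ri_squared(P1, P2):
    """Discretized d_RI^2 for two scalar PSDs sampled on the same uniform grid.

    Integrals are approximated by sample means, which corresponds to the
    normalized measure d(omega)/(2*pi) (an assumption of this sketch).
    """
    log_ratio = np.log(P1 / P2)
    return np.mean(log_ratio ** 2) - np.mean(log_ratio) ** 2

def ar1_psd(a, sigma2, omega):
    """PSD (up to a constant convention factor) of y_t = a*y_{t-1} + v_t, var(v_t) = sigma2."""
    return sigma2 / np.abs(1.0 - a * np.exp(-1j * omega)) ** 2

omega = np.linspace(0.0, 2.0 * np.pi, 1024, endpoint=False)
P1 = ar1_psd(0.5, 1.0, omega)
P2 = ar1_psd(-0.3, 1.0, omega)

d = d_ri_squared(P1, P2)
d_scaled = d_ri_squared(7.0 * P1, P2)   # rescale the first spectrum
d_same = d_ri_squared(P1, P1)           # a spectrum compared with itself

print(d, d_scaled, d_same)
```

Rescaling the first spectrum leaves the value unchanged, because a constant factor only shifts the log ratio by a constant and a variance ignores such shifts.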
References
Afsari, B., Chaudhry, R., Ravichandran, A., Vidal, R.: Group action induced distances for averaging and clustering linear dynamical systems with applications to the analysis of dynamic visual scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
Afsari, B., Vidal, R.: The alignment distance on spaces of linear dynamical systems. In: IEEE Conference on Decision and Control (2013)
Afsari, B., Vidal, R.: Group action induced distances on spaces of high-dimensional linear stochastic processes. In: Geometric Science of Information, LNCS, vol. 8085, pp. 425–432 (2013)
Amari, S.I.: Differential geometry of a parametric family of invertible linear systems-Riemannian metric, dual affine connections, and divergence. Math. Syst. Theory 20, 53–82 (1987)
Amari, S.I., Nagaoka, H.: Methods of information geometry. In: Translations of Mathematical Monographs, vol. 191. American Mathematical Society, Providence (2000)
Anderson, B.D., Deistler, M.: Properties of zero-free spectral matrices. IEEE Trans. Autom. Control 54(10), 2365–5 (2009)
Aoki, M.: State Space Modeling of Time Series. Springer, Berlin (1987)
Barbaresco, F.: Information geometry of covariance matrix: Cartan-Siegel homogeneous bounded domains, Mostow/Berger fibration and Frechet median. In: Matrix Information Geometry, pp. 199–255. Springer, Berlin (2013)
Basseville, M.: Distance measures for signal processing and pattern recognition. Sig. Process. 18, 349–9 (1989)
Basseville, M.: Divergence measures for statistical data processing: an annotated bibliography. Sig. Process. 93(4), 621–33 (2013)
Bauer, D., Deistler, M.: Balanced canonical forms for system identification. IEEE Trans. Autom. Control 44(6), 1118–1131 (1999)
Béjar, B., Zappella, L., Vidal, R.: Surgical gesture classification from video data. In: Medical Image Computing and Computer Assisted Intervention, pp. 34–41 (2012)
Boets, J., Cock, K.D., Moor, B.D.: A mutual information based distance for multivariate Gaussian processes. In: Modeling, Estimation and Control, Festschrift in Honor of Giorgio Picci on the Occasion of his Sixty-Fifth Birthday, Lecture Notes in Control and Information Sciences, vol. 364, pp. 15–33. Springer, Berlin (2007)
Bonnabel, S., Collard, A., Sepulchre, R.: Rank-preserving geometric means of positive semi-definite matrices. Linear Algebra Appl. 438, 3202–16 (2013)
Byrnes, C.I., Hurt, N.: On the moduli of linear dynamical systems. In: Advances in Mathematical Studies in Analysis, vol. 4, pp. 83–122. Academic Press, New York (1979)
Chaudhry, R., Vidal, R.: Recognition of visual dynamical processes: Theory, kernels and experimental evaluation. Technical Report 09–01. Department of Computer Science, Johns Hopkins University (2009)
Chavel, I.: Riemannian Geometry: A Modern Introduction, vol. 98, 2nd edn. Cambridge University Press, Cambridge (2006)
Cock, K.D., Moor, B.D.: Subspace angles and distances between ARMA models. Syst. Control Lett. 46(4), 265–70 (2002)
Corduas, M., Piccolo, D.: Time series clustering and classification by the autoregressive metric. Comput. Stat. Data Anal. 52(4), 1860–72 (2008)
Deistler, M., Anderson, B.O., Filler, A., Zinner, C., Chen, W.: Generalized linear dynamic factor models: an approach via singular autoregressions. Eur. J. Control 3, 211–24 (2010)
Devroye, L.: A Probabilistic Theory of Pattern Recognition, vol. 31. Springer, Berlin (1996)
Doretto, G., Chiuso, A., Wu, Y., Soatto, S.: Dynamic textures. Int. J. Comput. Vision 51(2), 91–109 (2003)
Ferrante, A., Pavon, M., Ramponi, F.: Hellinger versus Kullback-Leibler multivariable spectrum approximation. IEEE Trans. Autom. Control 53(4), 954–67 (2008)
Forni, M., Hallin, M., Lippi, M., Reichlin, L.: The generalized dynamic-factor model: Identification and estimation. Rev. Econ. Stat. 82(4), 540–54 (2000)
Georgiou, T.T., Karlsson, J., Takyar, M.S.: Metrics for power spectra: an axiomatic approach. IEEE Trans. Signal Process. 57(3), 859–67 (2009)
Gray, R., Buzo, A., Gray Jr, A., Matsuyama, Y.: Distortion measures for speech processing. IEEE Trans. Acoust. Speech Signal Process. 28(4), 367–76 (1980)
Gray, R.M.: Probability, Random Processes, and Ergodic Properties. Springer, Berlin (2009)
Gray, R.M., Neuhoff, D.L., Shields, P.C.: A generalization of Ornstein’s \(\bar{d}\) distance with applications to information theory. Ann. Probab. 3, 315–328 (1975)
Gray Jr, A., Markel, J.: Distance measures for speech processing. IEEE Trans. Acoust. Speech Signal Process. 24(5), 380–91 (1976)
Grenander, U.: Abstract Inference. Wiley, New York (1981)
Hannan, E.J.: Multiple Time Series, vol. 38. Wiley, New York (1970)
Hannan, E.J., Deistler, M.: The Statistical Theory of Linear Systems. Wiley, New York (1987)
Hanzon, B.: Identifiability, Recursive Identification and Spaces of Linear Dynamical Systems, vol. 63–64. Centrum voor Wiskunde en Informatica (CWI), Amsterdam (1989)
Hanzon, B., Marcus, S.I.: Riemannian metrics on spaces of stable linear systems, with applications to identification. In: IEEE Conference on Decision & Control, pp. 1119–1124 (1982)
Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning. Springer, New York (2003)
Hazewinkel, M.: Moduli and canonical forms for linear dynamical systems II: the topological case. Math. Syst. Theory 10, 363–85 (1977)
Helmke, U.: Balanced realizations for linear systems: a variational approach. SIAM J. Control Optim. 31(1), 1–15 (1993)
Jiang, X., Ning, L., Georgiou, T.T.: Distances and Riemannian metrics for multivariate spectral densities. IEEE Trans. Autom. Control 57(7), 1723–35 (2012)
Jimenez, N.D., Afsari, B., Vidal, R.: Fast Jacobi-type algorithm for computing distances between linear dynamical systems. In: European Control Conference (2013)
Kailath, T.: Linear Systems. Prentice Hall, NJ (1980)
Katayama, T.: Subspace Methods for System Identification. Springer, Berlin (2005)
Kazakos, D., Papantoni-Kazakos, P.: Spectral distance measures between Gaussian processes. IEEE Trans. Autom. Control 25(5), 950–9 (1980)
Kendall, D.G., Barden, D., Carne, T.K., Le, H.: Shape and Shape Theory. Wiley Series In Probability And Statistics. Wiley, New York (1999)
Kobayashi, S., Nomizu, K.: Foundations of Differential Geometry Volume I. Wiley Classics Library Edition. Wiley, New York (1963)
Krishnaprasad, P.S.: Geometry of Minimal Systems and the Identification Problem. PhD thesis, Harvard University (1977)
Krishnaprasad, P.S., Martin, C.F.: On families of systems and deformations. Int. J. Control 38(5), 1055–79 (1983)
Lee, J.M.: Introduction to Smooth Manifolds. Springer, Graduate Texts in Mathematics (2002)
Liao, T.W.: Clustering time series data—a survey. Pattern Recogn. 38, 1857–74 (2005)
Makhoul, J.: Linear prediction: a tutorial review. Proc. IEEE 63(4), 561–80 (1975)
Martin, A.: A metric for ARMA processes. IEEE Trans. Signal Process. 48(4), 1164–70 (2000)
Moor, B.D., Overschee, P.V., Suykens, J.: Subspace algorithms for system identification and stochastic realization. Technical Report ESAT-SISTA Report 1990–28, Katholieke Universiteit Leuven (1990)
Moore, B.C.: Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Trans. Autom. Control 26, 17–32 (1981)
Ning, L., Georgiou, T.T., Tannenbaum, A.: Matrix-valued Monge-Kantorovich optimal mass transport. arXiv, preprint arXiv:1304.3931 (2013)
Nocerino, N., Soong, F.K., Rabiner, L.R., Klatt, D.H.: Comparative study of several distortion measures for speech recognition. Speech Commun. 4(4), 317–31 (1985)
Ober, R.J.: Balanced realizations: canonical form, parametrization, model reduction. Int. J. Control 46(2), 643–70 (1987)
Papoulis, A., Pillai, S.U.: Probability, Random Variables and Stochastic Processes with Errata Sheet. McGraw-Hill Education, New York (2002)
Piccolo, D.: A distance measure for classifying ARIMA models. J. Time Ser. Anal. 11(2), 153–64 (1990)
Rabiner, L., Juang, B.-H.: Fundamentals of Speech Recognition. Prentice-Hall International, NJ (1993)
Rao, M.M.: Stochastic Processes: Inference Theory, vol. 508. Springer, New York (2000)
Ravichandran, A., Vidal, R.: Video registration using dynamic textures. IEEE Trans. Pattern Anal. Mach. Intell. 33(1), 158–171 (2011)
Ravishanker, N., Melnick, E.L., Tsai, C.-L.: Differential geometry of ARMA models. J. Time Ser. Anal. 11(3), 259–274 (1990)
Rozanov, Y.A.: Stationary Random Processes. Holden-Day, San Francisco (1967)
Vandereycken, B., Absil, P.-A., Vandewalle, S.: A Riemannian geometry with complete geodesics for the set of positive semi-definite matrices of fixed rank. Technical Report TW572, Katholieke Universiteit Leuven (2010)
Vishwanathan, S., Smola, A., Vidal, R.: Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes. Int. J. Comput. Vision 73(1), 95–119 (2007)
Youla, D.: On the factorization of rational matrices. IRE Trans. Inf. Theory 7(3), 172–189 (1961)
Younes, L.: Shapes and Diffeomorphisms. In: Applied Mathematical Sciences, vol. 171. Springer, New York (2010)
Acknowledgments
The authors are thankful to the anonymous reviewers for their insightful comments and suggestions, which helped to improve the quality of this paper. The authors also thank the organizers of the GSI 2013 conference and the editor of this book Prof. Frank Nielsen. This work was supported by the Sloan Foundation and by grants ONR N00014-09-10084, NSF 0941362, NSF 0941463, NSF 0931805, and NSF 1335035.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Afsari, B., Vidal, R. (2014). Distances on Spaces of High-Dimensional Linear Stochastic Processes: A Survey. In: Nielsen, F. (ed.) Geometric Theory of Information. Signals and Communication Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-05317-2_8
DOI: https://doi.org/10.1007/978-3-319-05317-2_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05316-5
Online ISBN: 978-3-319-05317-2
eBook Packages: Engineering, Engineering (R0)