Abstract
This contribution presents and illustrates a general, flexible formalism, together with an associated iterative procedure, aimed at determining soft memberships of marked nodes in a weighted network. Grouping spatial entities that are both spatially close and similar in their features is a problem relevant to image segmentation, spatial clustering, and data analysis in general. Unoriented weighted networks are specified by an “exchange matrix”, which determines the probability of selecting a pair of neighbors. We present a family of membership-dependent free energies, whose local minimization specifies soft clusterings. The free energy additively combines a mutual information term with various energy terms, concave or convex in the memberships: within-group inertia, generalized cuts (extending weighted Ncut and modularity), and membership discontinuities (generalizing Dirichlet forms). The framework is closely related to discrete Markov models, random walks, label propagation and spatial autocorrelation (Moran’s I), and can express the Mumford-Shah approach. Four small datasets illustrate the theory.
Notes
- 1. Besides the generalized discontinuity functionals, already addressed in the proceedings but unfortunately referred to there as “cut functionals”.
References
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22, 888–905 (2000)
Grady, L., Schwartz, E.L.: Isoperimetric graph partitioning for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 28, 469–475 (2006)
Newman, M.E.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103, 8577–8582 (2006)
Berger, J., Snell, J.L.: On the concept of equal exchange. Syst. Res. Behav. Sci. 2, 111–118 (1957)
Ceré, R., Bavaud, F.: Multi-labelled image segmentation in irregular, weighted networks: a spatial autocorrelation approach. In: Proceedings of the 3rd International Conference on Geographical Information Systems Theory, Applications and Management, Scitepress, pp. 62–69 (2017)
Zhu, X., Ghahramani, Z.: Learning from labeled and unlabeled data with label propagation. Technical report CMU-CALD-02-107, Carnegie Mellon University (2002)
Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76, 036106 (2007)
von Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007)
Bavaud, F.: Aggregation invariance in general clustering approaches. Adv. Data Anal. Classif. 3, 205–225 (2009)
Bavaud, F.: On the Schoenberg transformations in data analysis: theory and illustrations. J. Classif. 28, 297–314 (2011)
Rose, K., Gurewitz, E., Fox, G.: A deterministic annealing approach to clustering. Pattern Recogn. Lett. 11, 589–594 (1990)
Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated variational problems. Commun. Pure Appl. Math. 42, 577–685 (1989)
Petitot, J.: An introduction to the Mumford-Shah segmentation model. J. Physiol.-Paris 97, 335–342 (2003)
Vitti, A.: The Mumford-Shah variational model for image segmentation: an overview of the theory, implementation and use. ISPRS J. Photogramm. Remote. Sens. 69, 50–64 (2012)
Gaumer, P., Moliterni, C.: Dictionnaire mondial de la bande dessinée. Larousse, Paris (1994)
Couprie, C., Grady, L., Najman, L., Talbot, H.: Power watershed: a unifying graph-based optimization framework. IEEE Trans. Pattern Anal. Mach. Intell. 33, 1384–1399 (2011)
Fouss, F., Saerens, M., Shimbo, M.: Algorithms and Models for Network Data and Link Analysis. Cambridge University Press, Cambridge (2016)
Bavaud, F., Guex, G.: Interpolating between random walks and shortest paths: a path functional approach. In: Aberer, K., Flache, A., Jager, W., Liu, L., Tang, J., Guéret, C. (eds.) SocInfo 2012. LNCS, vol. 7710, pp. 68–81. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35386-4_6
Françoisse, K., Kivimäki, I., Mantrach, A., Rossi, F., Saerens, M.: A bag-of-paths framework for network data analysis. arXiv preprint arXiv:1302.6766 (2013)
Devooght, R., Mantrach, A., Kivimäki, I., Bersini, H., Jaimes, A., Saerens, M.: Random walks based modularity: application to semi-supervised learning. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 213–224. ACM (2014)
Sinop, A.K., Grady, L.: A seeded image segmentation framework unifying graph cuts and random walker which yields a new algorithm. In: 2007 IEEE 11th International Conference on Computer Vision. ICCV 2007, pp. 1–8. IEEE (2007)
Guex, G.: Interpolating between random walks and optimal transportation routes: flow with multiple sources and targets. Phys. A: Stat. Mech. Appl. 450, 264–277 (2016)
Doyle, P.G., Snell, J.L.: Random Walks and Electric Networks. Mathematical Association of America, Washington D.C. (1984)
Grady, L.: Random walks for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 28, 1768–1783 (2006)
Critchley, F., Fichet, B.: The partial order by inclusion of the principal classes of dissimilarity on a finite set, and some of their basic properties. In: Van Cutsem, B. (ed.) Classification and Dissimilarity Analysis. LNS, vol. 93, pp. 5–65. Springer, New York (1994). https://doi.org/10.1007/978-1-4612-2686-4_2
Bivand, R.S., Pebesma, E.J., Gomez-Rubio, V.: Applied Spatial Data Analysis with R. Springer, New York (2008). https://doi.org/10.1007/978-0-387-78171-6
LeSage, J.P.: An introduction to spatial econometrics. Revue d’économie industrielle 123, 19–44 (2008)
Arbia, G.: Spatial Data Configuration in Statistical Analysis of Regional Economic and Related Problems, vol. 14. Springer, Dordrecht (2012). https://doi.org/10.1007/978-94-009-2395-9
Anselin, L.: Spatial Econometrics: Methods and Models, vol. 4. Springer, Dordrecht (2013). https://doi.org/10.1007/978-94-015-7799-1
Griffith, D.A.: Spatial Autocorrelation and Spatial Filtering: Gaining Understanding Through Theory and Scientific Visualization. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-540-24806-4
Besag, J.: On the statistical analysis of dirty pictures. J. R. Stat. Soc. Ser. B (Methodol.) 48, 259–302 (1986)
Greig, D.M., Porteous, B.T., Seheult, A.H.: Exact maximum a posteriori estimation for binary images. J. R. Stat. Soc. Ser. B (Methodol.) 51, 271–279 (1989)
Anselin, L.: Local indicators of spatial association - LISA. Geogr. Anal. 27, 93–115 (1995)
White, S., Smyth, P.: A spectral clustering approach to finding communities in graphs. In: Proceedings of the 2005 SIAM International Conference on Data Mining, pp. 274–285. SIAM (2005)
Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems, pp. 849–856 (2002)
Lebichot, B., Saerens, M.: An experimental study of graph-based semi-supervised classification with additional node information. arXiv preprint arXiv:1705.08716 (2017)
Yen, L., Saerens, M., Mantrach, A., Shimbo, M.: A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances. In: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD 2008, pp. 785–793. ACM, New York (2008)
Bavaud, F.: Euclidean distances, soft and spectral clustering on weighted graphs. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010. LNCS (LNAI), vol. 6321, pp. 103–118. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15880-3_13
Kivimäki, I., Shimbo, M., Saerens, M.: Developments in the theory of randomized shortest paths with a comparison of graph node distances. Phys. A: Stat. Mech. Appl. 393, 600–616 (2014)
Bavaud, F.: Testing spatial autocorrelation in weighted networks: the modes permutation test. J. Geogr. Syst. 3, 233–247 (2013)
Cliff, A.D., Ord, J.K.: Spatial Processes: Models & Applications. Taylor & Francis, Didcot (1981)
Appendix
1.1 Computing the Exchange Matrix E(A, f)
Defining an exchange matrix E that is both weight-compatible (that is, obeying \(E\mathbf{1}=f\), where the regional weights f are given) and reflective of the spatial structure contained in the binary adjacency matrix \(A=(a_{ij})\) is a necessary step in the “ZED formalism” under consideration. Two constructions, neither trivial nor particularly difficult, have been investigated in this paper, namely the diffusive specification and the Metropolis-Hastings specification.
The Diffusive Exchange Matrix. Consider a continuous-time Markov chain W on the n pixels, whose infinitesimal generator or rate matrix is proportional to the adjacency matrix A, and conveniently normalized so that f constitutes the stationary distribution of W. The resulting exchange matrix \(E=\varPi W\) turns out to be symmetric and p.s.d., and given by
\[E \equiv E(A, f, t) = \varPi ^{1/2}\exp (-t\varPsi )\,\varPi ^{1/2} \tag{20}\]
where \(\varPi = \text{ diag }(f)\), and
\[\varPsi = \varPi ^{-1/2}\,\frac{LA}{\text{ trace }(LA)}\,\varPi ^{-1/2} .\]
Here \((LA)_{ij} = \delta _{ij} \, a_{i\bullet } - a_{ij}\) is the Laplacian of matrix A, and matrix exponentiation (20) can be carried out by the spectral decomposition of \(\varPsi \). Specification (20) describes a diffusive process at time \(t>0\), with limits \(\lim _{t\rightarrow 0} E(A, f, t)=\varPi \) (“frozen network”, consisting of n isolated nodes: spatial autarchy), and \(\lim _{t\rightarrow \infty } E(A, f, t)=ff'\) (“complete network”, with independent selection of the node pairs: complete mobility). The identity \(\text{ trace }(E(t)) = 1-t + \mathcal{O}(t^2)\) shows that t measures, for \(t\ll 1\), the proportion of distinct regional pairs in the joint distribution E.
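The diffusive construction can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes that specification (20) takes the form \(E=\varPi ^{1/2}\exp (-t\varPsi )\varPi ^{1/2}\) with \(\varPsi =\varPi ^{-1/2}\,LA\,\varPi ^{-1/2}/\text{ trace }(LA)\), a form consistent with the two limits and the trace identity stated above. The function name `diffusive_exchange` and the four-node path network are hypothetical.

```python
import numpy as np

def diffusive_exchange(A, f, t):
    """Sketch of a diffusive exchange matrix E(A, f, t).

    Assumed form (not confirmed by the text):
    E = Pi^{1/2} exp(-t Psi) Pi^{1/2}, Psi = Pi^{-1/2} LA Pi^{-1/2} / trace(LA).
    A: symmetric adjacency matrix of a connected network.
    f: strictly positive regional weights summing to one.
    """
    A = np.asarray(A, dtype=float)
    f = np.asarray(f, dtype=float)
    LA = np.diag(A.sum(axis=1)) - A           # Laplacian of A
    s = np.sqrt(f)
    scale = np.outer(s, s)                    # entries sqrt(f_i * f_j)
    Psi = (LA / np.trace(LA)) / scale         # Pi^{-1/2} (LA / trace LA) Pi^{-1/2}
    lam, U = np.linalg.eigh(Psi)              # spectral decomposition of symmetric Psi
    exp_Psi = (U * np.exp(-t * lam)) @ U.T    # exp(-t Psi)
    return exp_Psi * scale                    # Pi^{1/2} exp(-t Psi) Pi^{1/2}

# Hypothetical example: path network on four nodes with unequal weights.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
f = np.array([0.1, 0.2, 0.3, 0.4])
E = diffusive_exchange(A, f, t=0.5)
print(np.allclose(E.sum(axis=1), f))          # weight compatibility E1 = f
```

Under this assumed form, the row sums of E reproduce f for any \(t\ge 0\), since \(\sqrt{f}\) is an eigenvector of \(\varPsi \) with eigenvalue zero.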
The Metropolis-Hastings Exchange Matrix. The natural random walk with Markov transition matrix \(a_{ij}/a_{i\bullet }\) correctly describes the spatial structure of the network, but its stationary distribution is \(g_i=a_{i\bullet }/a_{\bullet \bullet }\) instead of \(f_i\). Applying the Metropolis-Hastings algorithm defines a recalibrated random walk with stationary distribution f, ending up in a weight-compatible exchange matrix of the form:
\[E \equiv E(A, f) = \varPi - LB \qquad \text{where} \quad b_{ij} = \min (\kappa _i, \kappa _j)\, a_{ij}, \quad \kappa _i = \frac{f_i}{a_{i\bullet }} \tag{21}\]
and \((LB)_{ij} = \delta _{ij} \, b_{i\bullet } - b_{ij} \) is the Laplacian of B. Expression (21) does not require spectral decomposition, and its computation is much faster than (20) for increasing n (Fig. 15). However, E in (21) is not p.s.d. in general, thus threatening the concavity of \(\mathcal{C}^\kappa [Z]\) (Sect. 5.1).
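A short numerical sketch may help fix ideas. It assumes that B in (21) results from the standard Metropolis-Hastings acceptance rule applied to the natural random walk, that is \(b_{ij}=\min (\kappa _i,\kappa _j)\,a_{ij}\) with \(\kappa _i=f_i/a_{i\bullet }\), so that \(E=\varPi -LB\). This form of B is an assumption derived from that rule, and the function name `metropolis_hastings_exchange` is hypothetical.

```python
import numpy as np

def metropolis_hastings_exchange(A, f):
    """Sketch of a Metropolis-Hastings exchange matrix E(A, f) = Pi - LB.

    Assumed form of B (not confirmed by the text):
    b_ij = min(kappa_i, kappa_j) * a_ij, with kappa_i = f_i / a_i. .
    A: symmetric adjacency matrix in which every node has a neighbor.
    f: strictly positive regional weights summing to one.
    """
    A = np.asarray(A, dtype=float)
    f = np.asarray(f, dtype=float)
    kappa = f / A.sum(axis=1)                  # weight per unit of degree
    B = A * np.minimum.outer(kappa, kappa)     # recalibrated edge weights b_ij
    LB = np.diag(B.sum(axis=1)) - B            # Laplacian of B
    return np.diag(f) - LB                     # E = Pi - LB

# Hypothetical example: two connected nodes with equal weights.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])
f = np.array([0.5, 0.5])
E = metropolis_hastings_exchange(A, f)
print(E)                                       # [[0, 0.5], [0.5, 0]]
print(np.linalg.eigvalsh(E))                   # eigenvalues -0.5 and 0.5
```

In this two-node example E is weight-compatible and symmetric, yet has the eigenvalue -0.5, which illustrates that the construction is not p.s.d. in general.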
1.2 Testing Spatial Autocorrelation
Under the null hypothesis \(H_0\) of stationarity and absence of spatial autocorrelation, univariate features are independent, and follow a distribution with common mean and variance inversely proportional to the size of the region, namely \(E(X_{ik})=\mu _k\) and \(\text{ Cov }(X_{ik},X_{jk})=\delta _{ij}\sigma ^2_k/f_i\) [40]. Under normal approximation, the expected value of the multivariate Moran’s I (2) reads
and its variance reads
Spatial autocorrelation is thus significant at level \(\alpha \) if \(z= |I - E_0(I)|/ \sqrt{\text{ Var }_0(I)}\, \ge \, u_{1-\frac{\alpha }{2}}\), where \(u_{1-\frac{\alpha }{2}}\) is the \((1-\frac{\alpha }{2})\) quantile of the standard normal distribution.
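The decision rule can be sketched in a few lines. The sketch takes the observed index and its null expectation and variance as given inputs and does not compute them; the function name `moran_z_test` and the numerical values in the example are hypothetical.

```python
from statistics import NormalDist

def moran_z_test(I, mean0, var0, alpha=0.05):
    """Two-sided normal test for an observed Moran index I.

    mean0 and var0 stand for E_0(I) and Var_0(I) under H_0; they are
    supplied by the caller (this sketch does not derive them).
    Returns the statistic z, the critical value u, and the decision.
    """
    z = abs(I - mean0) / var0 ** 0.5
    u = NormalDist().inv_cdf(1 - alpha / 2)    # (1 - alpha/2) standard normal quantile
    return z, u, z >= u

# Hypothetical values, for illustration only.
z, u, significant = moran_z_test(I=0.3, mean0=0.0, var0=0.01)
print(round(z, 3), round(u, 3), significant)   # 3.0 1.96 True
```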
Alternatively, a permutation test can be performed (e.g. [41]), by generating a series of values \(\hat{I}\) of the transformed Moran index, where \(\hat{I}\) is obtained as in (2) with \(D\) replaced by \(\hat{D}\). The plain specification, which consists in replacing the profile \(x_{ik}\) of region i by the profile \(\hat{x}_{ik}=x_{\pi (i)k}\) of another region \(\pi (i)\) (where \(\pi \) denotes a permutation), that is in defining \(\hat{D}_{ij}=D_{\pi (i),\pi (j)}\), is somewhat flawed in the weighted case, in view of the heteroscedasticity of the distribution of \(X_{ik}\). Instead, the quantities \(\sqrt{f_i}(x_{ik}-\bar{x}_k)\) (with \(\bar{x}_k=\sum _i f_i x_{ik}\)) for \(i=1,\ldots , n\) are expected to follow the same distribution under \(H_0\), thus ensuring the validity of the weight-corrected specification, with (see Fig. 2)
© 2019 Springer Nature Switzerland AG
Ceré, R., Bavaud, F. (2019). Soft Image Segmentation: On the Clustering of Irregular, Weighted, Multivariate Marked Networks. In: Ragia, L., Laurini, R., Rocha, J. (eds) Geographical Information Systems Theory, Applications and Management. GISTAM 2017. Communications in Computer and Information Science, vol 936. Springer, Cham. https://doi.org/10.1007/978-3-030-06010-7_6
DOI: https://doi.org/10.1007/978-3-030-06010-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-06009-1
Online ISBN: 978-3-030-06010-7
eBook Packages: Computer Science (R0)