A scalable approach to spectral clustering with SDD solvers

Khoa, Nguyen Lu Dang; Chawla, Sanjay

doi:10.1007/s10844-013-0285-0

Published: 24 October 2013

Volume 44, pages 289–308, (2015)

Journal of Intelligent Information Systems

Nguyen Lu Dang Khoa¹ &
Sanjay Chawla²

Abstract

Spectral clustering promises to detect complex shapes and intrinsic manifold structure in large, high-dimensional spaces. The price of this promise is the high computational cost of the eigen-decomposition of the graph Laplacian matrix, so far a necessary subroutine for spectral clustering. In this paper we bypass the eigen-decomposition of the original Laplacian matrix by combining random projection with the recently introduced near-linear time solver for symmetric diagonally dominant (SDD) linear systems. Experiments on several synthetic and real datasets show that the proposed approach achieves better clustering quality and runs faster than state-of-the-art approximate spectral clustering methods.
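The abstract does not spell out how the Laplacian solver and random projection fit together, so the following is only a minimal sketch of one well-known way to combine them, not necessarily the authors' exact algorithm. It builds a k-dimensional embedding whose squared Euclidean distances approximate effective resistances (commute times up to a constant factor), using one Laplacian solve per random projection row and no eigen-decomposition. The function names, the toy two-clique graph, and the plain conjugate gradient routine standing in for a near-linear time SDD solver are all assumptions made for illustration.

```python
import math
import random


def laplacian_matvec(adj, x):
    """Return L @ x for L = D - A, where adj[i] maps neighbour j to weight w_ij."""
    return [sum(w * (x[i] - x[j]) for j, w in nbrs.items())
            for i, nbrs in enumerate(adj)]


def solve_laplacian(adj, b, tol=1e-10, max_iter=1000):
    """Solve L z = b by plain conjugate gradient.

    Assumption: this is a simple stand-in for a near-linear time SDD solver.
    It expects a connected graph and a right-hand side that sums to zero.
    """
    z = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if math.sqrt(rs) < tol:
            break
        Lp = laplacian_matvec(adj, p)
        alpha = rs / sum(pi * li for pi, li in zip(p, Lp))
        z = [zi + alpha * pi for zi, pi in zip(z, p)]
        r = [ri - alpha * li for ri, li in zip(r, Lp)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return z


def resistance_embedding(adj, k, seed=0):
    """Embed nodes in R^k so squared distances approximate effective resistance."""
    rng = random.Random(seed)
    n = len(adj)
    edges = [(u, v, w) for u in range(n) for v, w in adj[u].items() if u < v]
    columns = []
    for _ in range(k):
        # One random +/-1 projection of the weighted edge-incidence matrix.
        b = [0.0] * n
        for u, v, w in edges:
            s = rng.choice((-1.0, 1.0)) * math.sqrt(w / k)
            b[u] += s
            b[v] -= s
        # One linear solve replaces the eigen-decomposition.
        columns.append(solve_laplacian(adj, b))
    return [[col[i] for col in columns] for i in range(n)]


def sq_dist(a, b):
    """Squared Euclidean distance between two embedded points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_cliques(size=4, bridge=0.1):
    """Hypothetical toy graph: two unit-weight cliques joined by one weak edge."""
    adj = [dict() for _ in range(2 * size)]
    for base in (0, size):
        for i in range(base, base + size):
            for j in range(base, base + size):
                if i != j:
                    adj[i][j] = 1.0
    adj[size - 1][size] = adj[size][size - 1] = bridge
    return adj


adj = two_cliques()
emb = resistance_embedding(adj, k=50)
# Largest distance inside the first clique versus smallest distance across cliques.
within = max(sq_dist(emb[i], emb[j]) for i in range(4) for j in range(i + 1, 4))
across = min(sq_dist(emb[i], emb[j]) for i in range(4) for j in range(4, 8))
print(within, across)
```

In this toy graph, nodes in the same clique land much closer together than nodes in different cliques, so a standard clustering step such as k-means on the rows of `emb` could then recover the two groups. The cost is dominated by k linear solves, which is where a near-linear time SDD solver would replace the conjugate gradient routine used here.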

Author information

Authors and Affiliations

National ICT Australia (NICTA), Sydney, Australia
Nguyen Lu Dang Khoa
School of IT, University of Sydney, Sydney, Australia
Sanjay Chawla

Corresponding author

Correspondence to Nguyen Lu Dang Khoa.

About this article

Cite this article

Khoa, N.L.D., Chawla, S. A scalable approach to spectral clustering with SDD solvers. J Intell Inf Syst 44, 289–308 (2015). https://doi.org/10.1007/s10844-013-0285-0

Received: 31 January 2013
Revised: 22 August 2013
Accepted: 04 October 2013
Published: 24 October 2013
Issue Date: April 2015
DOI: https://doi.org/10.1007/s10844-013-0285-0
