Performance Evaluation of Constraints in Graph-Based Semi-supervised Clustering

  • Conference paper
Active Media Technology (AMT 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6335)

Abstract

Semi-supervised learning has attracted much interest as a way to cope with vast amounts of data. When similarities among instances are specified, connecting each pair of instances with an edge lets the entire dataset be represented as an edge-weighted graph. Based on this graph representation, we have proposed a graph-based approach to semi-supervised clustering that modifies the graph structure using contraction from graph theory and the graph Laplacian from spectral graph theory. In this paper we conduct extensive experiments over various document datasets and report a performance evaluation of the approach with respect to both the type and the number of constraints. We also compare it with other state-of-the-art methods in terms of accuracy and running time, and the results are encouraging. In particular, our approach can leverage a small number of pairwise constraints to improve performance.
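To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It builds an edge-weighted similarity graph, contracts vertices joined by must-link constraints, and computes a spectral embedding from the normalized graph Laplacian. The function names, the choice of cosine similarity, and the rule of summing edge weights during contraction are all assumptions made for illustration; the abstract does not specify them, and cannot-link constraints are not handled here.

```python
import numpy as np

def cosine_similarity_graph(X):
    """Dense edge-weighted graph: W[i, j] is the cosine similarity of rows i and j.
    Cosine similarity is an assumed choice, not taken from the paper."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    W = Xn @ Xn.T
    np.fill_diagonal(W, 0.0)          # no self-loops
    return np.clip(W, 0.0, None)      # keep weights non-negative

def contract_must_links(W, must_links):
    """Merge must-linked vertices into super-vertices (graph contraction).
    Edge weights between merged groups are summed; this rule is an assumption."""
    n = W.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in must_links:
        parent[find(a)] = find(b)

    roots = sorted({find(i) for i in range(n)})
    index = {r: k for k, r in enumerate(roots)}
    group = np.array([index[find(i)] for i in range(n)])

    M = np.zeros((n, len(roots)))     # membership matrix: vertex -> super-vertex
    M[np.arange(n), group] = 1.0
    Wc = M.T @ W @ M
    np.fill_diagonal(Wc, 0.0)
    return Wc, group

def spectral_embedding(W, k):
    """Eigenvectors of the symmetric normalized Laplacian for the k smallest eigenvalues."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)       # eigh returns eigenvalues in ascending order
    return vecs[:, :k]

# Toy usage: four "documents" with three term features, one must-link pair.
X = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]], dtype=float)
W = cosine_similarity_graph(X)
Wc, group = contract_must_links(W, [(0, 1)])
Z = spectral_embedding(Wc, k=2)
```

In a pipeline of this kind, the rows of `Z` would typically be clustered (for example with k-means) and each original instance would inherit the cluster of its super-vertex through `group`.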

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yoshida, T. (2010). Performance Evaluation of Constraints in Graph-Based Semi-supervised Clustering. In: An, A., Lingras, P., Petty, S., Huang, R. (eds) Active Media Technology. AMT 2010. Lecture Notes in Computer Science, vol 6335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15470-6_15

  • DOI: https://doi.org/10.1007/978-3-642-15470-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15469-0

  • Online ISBN: 978-3-642-15470-6

  • eBook Packages: Computer Science, Computer Science (R0)
