Abstract
Semi-supervised learning has attracted much interest as a way to cope with vast amounts of data. When similarities among instances are specified, connecting each pair of instances with an edge lets the entire dataset be represented as an edge-weighted graph. Based on this graph representation, we have proposed a graph-based approach to semi-supervised clustering, which modifies the graph structure using contraction from graph theory and the graph Laplacian from spectral graph theory. In this paper we conduct extensive experiments over various document datasets and report a performance evaluation of the approach with respect to both the type and the number of constraints. We also compare it with other state-of-the-art methods in terms of accuracy and running time, and the results are encouraging. In particular, our approach can leverage a small number of pairwise constraints to improve performance.
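The graph operations named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the cosine similarity measure, the use of must-link pairs as the constraints that trigger contraction, the rule of summing edge weights when vertices are merged, the unnormalized Laplacian, and all function names are assumptions made for illustration only.

```python
import numpy as np

def cosine_similarity_graph(X):
    """Weighted adjacency matrix from row vectors (assumed: cosine similarity)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)
    W = Xn @ Xn.T
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def contract_must_links(W, must_links):
    """Contract every must-link pair into a single vertex.

    Assumed merge rule: edge weights between merged groups are summed.
    Returns the contracted weight matrix and each vertex's group index.
    """
    n = W.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in must_links:
        parent[find(i)] = find(j)

    roots = sorted({find(i) for i in range(n)})
    index = {r: k for k, r in enumerate(roots)}
    groups = np.array([index[find(i)] for i in range(n)])

    # Membership matrix M (n x k); M.T @ W @ M sums weights between groups.
    M = np.zeros((n, len(roots)))
    M[np.arange(n), groups] = 1.0
    Wc = M.T @ W @ M
    np.fill_diagonal(Wc, 0.0)  # drop self-loops created by contraction
    return Wc, groups

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

# Toy data: four instances, with instances 0 and 1 under a must-link constraint.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
W = cosine_similarity_graph(X)
Wc, groups = contract_must_links(W, [(0, 1)])
L = graph_laplacian(Wc)
```

In this sketch the contracted graph has one vertex fewer than the original, and the eigenvectors of L could then be passed to a standard spectral clustering step. How the paper itself merges weights or treats other constraint types is not stated in the abstract.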
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yoshida, T. (2010). Performance Evaluation of Constraints in Graph-Based Semi-supervised Clustering. In: An, A., Lingras, P., Petty, S., Huang, R. (eds) Active Media Technology. AMT 2010. Lecture Notes in Computer Science, vol 6335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15470-6_15
DOI: https://doi.org/10.1007/978-3-642-15470-6_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15469-0
Online ISBN: 978-3-642-15470-6
eBook Packages: Computer Science (R0)