Abstract
This paper addresses the local learning approach to clustering, which rests on the idea that in a good clustering, the cluster label of each data point can be predicted well from its neighbors and their cluster labels. We propose a novel local-learning-based clustering algorithm that uses kernel regression as the local label predictor. Although the sum of absolute errors is used in place of the sum of squared errors, we still obtain an algorithm that clusters the data by exploiting the eigen-structure of a sparse matrix. Experimental results on many data sets demonstrate the effectiveness and potential of the proposed method.
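The idea of a local label predictor can be illustrated with a minimal sketch. It assumes a Nadaraya-Watson kernel regression estimate with a Gaussian kernel over the k nearest neighbors, and a single binary cluster indicator per point. The function names, the toy points, and the values of `k` and `bandwidth` are illustrative assumptions and are not taken from the paper. The sketch only evaluates the sum of absolute prediction errors for a given labeling; it does not reproduce the paper's optimization or its eigen-decomposition step.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on the squared Euclidean distance between x and y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: sum((a - b) ** 2
                                  for a, b in zip(points[i], points[j])))
    return others[:k]

def predict_label(points, labels, i, k=2, bandwidth=1.0):
    """Nadaraya-Watson estimate of labels[i] from its k nearest neighbors."""
    neighbors = k_nearest(points, i, k)
    weights = [gaussian_kernel(points[i], points[j], bandwidth)
               for j in neighbors]
    weighted_sum = sum(w * labels[j] for w, j in zip(weights, neighbors))
    return weighted_sum / sum(weights)

def absolute_prediction_error(points, labels, k=2, bandwidth=1.0):
    """Sum over all points of |label - locally predicted label|."""
    return sum(abs(labels[i] - predict_label(points, labels, i, k, bandwidth))
               for i in range(len(points)))

# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
good = [1, 1, 1, 0, 0, 0]  # indicator that matches the two groups
bad = [1, 0, 1, 0, 1, 0]   # indicator that cuts across both groups

good_error = absolute_prediction_error(points, good)
bad_error = absolute_prediction_error(points, bad)
print(good_error < bad_error)
```

On this toy data the labeling that follows the two groups is predicted by its neighbors with essentially no error, while the labeling that cuts across them is not, which is the intuition the abstract describes. With more than two clusters, one such indicator per cluster would be a natural extension, though that is again an assumption of this sketch.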
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sun, J., Shen, Z., Li, H., Shen, Y. (2008). Clustering Via Local Regression. In: Daelemans, W., Goethals, B., Morik, K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. Lecture Notes in Computer Science, vol 5212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87481-2_30
DOI: https://doi.org/10.1007/978-3-540-87481-2_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87480-5
Online ISBN: 978-3-540-87481-2
eBook Packages: Computer Science (R0)