Abstract
Semi-supervised clustering reconciles clustering (unsupervised learning) and classification (supervised learning, which uses prior information on the data). The two modes of data analysis are combined in a parameterized model whose parameter θ ∈ [0, 1] is the weight attributed to the prior information: θ = 0 corresponds to pure clustering, and θ = 1 to pure classification. The results (cluster centers, classification rule) depend on θ. Insensitivity to θ indicates that the prior information agrees with the intrinsic cluster structure and is therefore redundant; this explains why some data sets (such as the Wisconsin breast cancer data; Merz and Murphy, UCI repository of machine learning databases, University of California, Irvine, CA) give good results for all reasonable classification methods. The uncertainty of classification is represented here by the geometric mean of the membership probabilities, shown to be an entropic distance related to the Kullback–Leibler divergence.
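The uncertainty measure named in the abstract can be sketched numerically. The snippet below is an illustration, not the chapter's implementation: the inverse-distance membership rule is an assumption standing in for the probabilistic distance clustering model of Ben-Israel and Iyigun (2008), and the function names are hypothetical. It shows the key property of the geometric mean of membership probabilities: it is maximal (equal to 1/K for K clusters) when membership is uniform, i.e. classification is most uncertain, and tends to zero as one membership probability approaches 1.

```python
import numpy as np

def membership_probabilities(x, centers):
    """Membership probabilities of point x with respect to given centers.

    Assumption: probability inversely proportional to the distance from
    each center (a common probabilistic-distance-clustering heuristic;
    the chapter's exact model may differ).
    """
    d = np.linalg.norm(np.asarray(centers, dtype=float) - np.asarray(x, dtype=float), axis=1)
    inv = 1.0 / np.maximum(d, 1e-12)   # guard against a point sitting on a center
    return inv / inv.sum()

def classification_uncertainty(p):
    """Geometric mean of the membership probabilities p_1, ..., p_K.

    Computed as exp(mean(log p_k)); the log form connects it to entropy
    and, as the abstract notes, to the Kullback-Leibler divergence.
    """
    p = np.asarray(p, dtype=float)
    return float(np.exp(np.mean(np.log(np.maximum(p, 1e-300)))))
```

For example, uniform membership over two clusters, p = (0.5, 0.5), gives uncertainty 0.5 (the maximum for K = 2), while a confident assignment such as p = (0.9, 0.1) gives the smaller value 0.3.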
References
Aczél, J. (1984). Measuring information beyond communication theory – Why some generalized information measures may be useful, others not. Aequationes Mathematicae, 27, 1–19.
Arav, M. (2008). Contour approximation of data and the harmonic mean. Journal of Mathematical Inequalities, 2, 161–167.
Bar-Hillel, A., Hertz, T., Shental, N., & Weinshall, D. (2005). Learning a Mahalanobis metric from equivalence constraints. Journal of Machine Learning Research, 6, 937–965.
Ben-Israel, A., & Iyigun, C. (2008). Probabilistic distance clustering. Journal of Classification, 25, 5–26.
Ben-Tal, A., Ben-Israel, A., & Teboulle, M. (1991). Certainty equivalents and information measures: Duality and extremal principles. Journal of Mathematical Analysis and Applications, 157, 211–236.
Ben-Tal, A., & Teboulle, M. (1987). Penalty functions and duality in stochastic programming via ϕ-divergence functionals. Mathematics of Operations Research, 12, 224–240.
Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum.
Chapelle, O., Schölkopf, B., & Zien, A. (Eds.) (2006). Semi-supervised learning. Cambridge MA: MIT Press.
Csiszár, I. (1978). Information measures: A critical survey. In Trans. 7th Prague Conf. on Info. Th., Statist., Decis. Funct., Random Processes and 8th European Meeting of Statist. (Vol. B, pp. 73–86). Prague: Academia.
Dixon, K. R., & Chapman, J. A. (1980). Harmonic mean measure of animal activity areas. Ecology, 61, 1040–1044.
Grira, N., Crucianu, M., & Boujemaa, N. (2005). Unsupervised and semi-supervised clustering: A brief survey. In A Review of Machine Learning Techniques for Processing Multimedia Content. Report of the MUSCLE European Network of Excellence.
Höppner, F., Klawonn, F., Kruse, R., & Runkler, T. (1999). Fuzzy cluster analysis. New York: Wiley.
Iyigun, C., & Ben-Israel, A. (2008). Probabilistic distance clustering adjusted for cluster size. Probability in the Engineering and Informational Sciences, 22, 1–19.
Iyigun, C., & Ben-Israel, A. (2009). Contour approximation of data: The dual problem. Linear Algebra and Its Applications, 430, 2771–2780.
Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data clustering: A review. ACM Computing Surveys, 31, 264–323.
Kuhn, H. W. (1967). On a pair of dual nonlinear programs. In J. Abadie (Ed.), Methods of nonlinear programming (pp. 38–54). Amsterdam: North-Holland.
Kuhn, H. W. (1973). A note on Fermat’s problem. Mathematical Programming, 4, 98–107.
Kullback, S. (1959). Information theory and statistics. New York: Wiley.
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.
Lim, T.-S., Loh, W.-Y., & Shih, Y.-S. (2000). A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 40, 203–228.
Luce, R. D. (1959). Individual choice behavior. New York: Wiley.
Mangasarian, O. L., Setiono, R., & Wolberg, W. H. (1999). Pattern recognition via linear programming: Theory and application to medical diagnosis. In T. Coleman, & Y. Li (Eds.), Large-scale numerical optimization (pp. 22–30). Philadelphia: SIAM Publications.
Merz, C., & Murphy, P. (1996). UCI repository of machine learning databases. Irvine, CA: Department of Information and Computer Science, University of California. Retrieved from http://www.ics.uci.edu/mlearn/MLRepository.html.
Teboulle, M. (2007). A unified continuous optimization framework for center-based clustering methods. Journal of Machine Learning Research, 8, 65–102.
Weiszfeld, E. (1937). Sur le point par lequel la somme des distances de n points donnés est minimum. Tohoku Mathematical Journal, 43, 355–386.
Wolberg, W. H., & Mangasarian, O. L. (1990). Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences of the USA, 87, 9193–9196.
Xing, E. P., Ng, A. Y., Jordan, M. I., & Russell, S. (2003). Distance metric learning with application to clustering with side-information. In Advances in neural information processing systems (Vol. 15). Cambridge MA: MIT Press.
Yellott, J. I. Jr. (2001). Luce’s Choice Axiom. In N. J. Smelser, & P. B. Baltes (Eds.), International Encyclopedia of the Social and Behavioral Sciences (pp. 9094–9097). Oxford: Elsevier. ISBN 0-08-043076-7.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Iyigun, C., Ben-Israel, A. (2009). Semi-supervised Probabilistic Distance Clustering and the Uncertainty of Classification. In: Fink, A., Lausen, B., Seidel, W., Ultsch, A. (eds) Advances in Data Analysis, Data Handling and Business Intelligence. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01044-6_1
Print ISBN: 978-3-642-01043-9
Online ISBN: 978-3-642-01044-6
eBook Packages: Mathematics and Statistics