Clustering Algorithms

Bandyopadhyay, Sanghamitra; Saha, Sriparna

doi:10.1007/978-3-642-32451-2_4


Abstract

This chapter discusses the clustering problem in detail. Well-known partitional clustering techniques such as K-means, K-medoid, and fuzzy C-means are described first. This is followed by a discussion of distribution-based clustering, namely the expectation maximization algorithm. Hierarchical clustering techniques, such as single linkage, average linkage, and complete linkage, and density-based clustering techniques, such as DBSCAN and GDBSCAN, are then described in detail. Some grid-based clustering techniques, e.g., STING, are discussed next. The problem of clustering is thereafter formulated as one of optimization, and some evolutionary clustering techniques are described. Finally, it is shown how clustering can be posed as a multiobjective optimization problem, and some recently developed multiobjective clustering techniques are described in brief.


References

Anderberg, M.R.: Cluster Analysis for Application. Academic Press, New York (1973)
Google Scholar
Ankerst, M., Breunig, M.M., Kriegel, H.P., Sander, J.: OPTICS: Ordering points to identify the clustering structure. In: Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD’99), pp. 49–60. ACM, Philadelphia (1999)
Google Scholar
Asharaf, S., Murty, M.N.: An adaptive rough fuzzy single pass algorithm for clustering large data sets. Pattern Recogn. 36(12), 3015–3018 (2003)
Article MATH Google Scholar
Asharaf, S., Murty, M.N.: Scalable non-linear support vector machine using hierarchical clustering. In: ICPR (1), pp. 908–911 (2006)
Google Scholar
Asharaf, S., Murty, M.N., Shevade, S.K.: Cluster based core vector machine. In: ICDM, pp. 1038–1042 (2006)
Google Scholar
Asharaf, S., Shevade, S.K., Murty, M.N.: Rough support vector clustering. Pattern Recogn. 38(10), 1779–1783 (2005)
Article MATH Google Scholar
Attneave, F.: Symmetry information and memory for pattern. Am. J. Psychol. 68, 209–222 (1995)
Article Google Scholar
Babu, G.P., Murty, M.N.: A near-optimal initial seed value selection in K-means algorithm using a genetic algorithm. Pattern Recognit. Lett. 14(10), 763–769 (1993)
Article MATH Google Scholar
Bandyopadhyay, S.: An automatic shape independent clustering technique. Pattern Recogn. 37(1), 33–45 (2004)
Article Google Scholar
Bandyopadhyay, S.: Genetic algorithms for clustering and fuzzy clustering. WIREs Data Min. Knowl. Discov. 1(6), 524–531 (2011)
Article Google Scholar
Bandyopadhyay, S., Maulik, U.: Non-parametric genetic clustering: Comparison of validity indices. IEEE Trans. Syst. Man Cybern., Part C, Appl. Rev. 31(1), 120–125 (2001)
Article Google Scholar
Bandyopadhyay, S., Maulik, U.: An evolutionary technique based on K-means algorithm for optimal clustering in R ^N. Inf. Sci. 146(1–4), 221–237 (2002)
Article MathSciNet MATH Google Scholar
Bandyopadhyay, S., Maulik, U.: Genetic clustering for automatic evolution of clusters and application to image classification. Pattern Recognit. 35(6), 1197–1208 (2002)
Article MATH Google Scholar
Bandyopadhyay, S., Maulik, U., Mukhopadhyay, A.: Multiobjective genetic clustering for pixel classification in remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 45(5), 1506–1511 (2007)
Article Google Scholar
Bandyopadhyay, S., Mukhopadhyay, A., Maulik, U.: An improved algorithm for clustering gene expression data. Bioinformatics 23(21), 2859–2865 (2007)
Article Google Scholar
Bandyopadhyay, S., Pal, S.K.: Classification and Learning Using Genetic Algorithms Applications in Bioinformatics and Web Intelligence. Springer, Heidelberg (2007)
MATH Google Scholar
Bandyopadhyay, S., Saha, S.: GAPS: A clustering method using a new point symmetry based distance measure. Pattern Recognit. 40(12), 3430–3451 (2007)
Article MATH Google Scholar
Bandyopadhyay, S., Saha, S.: A point symmetry based clustering technique for automatic evolution of clusters. IEEE Trans. Knowl. Data Eng. 20(11), 1–17 (2008)
Article Google Scholar
Bargiela, A., Pedrycz, W., Hirota, K.: Granular prototyping in fuzzy clustering. IEEE Trans. Fuzzy Syst. 12(5), 697–709 (2004)
Article Google Scholar
Beliakov, G., King, M.: Density based fuzzy c-means clustering of non-convex patterns. Eur. J. Oper. Res. 173, 717–728 (2006)
Article MathSciNet MATH Google Scholar
Berg, M.D., Kreveld, M.V., Overmars, M., Schwarzkopf, O.: Computational Geometry: Algorithms and Applications. Springer, Heidelberg (2008)
MATH Google Scholar
Bezdek, J.C.: Fuzzy mathematics in pattern classification. Ph.D. thesis, Cornell University, Ithaca, NY (1973)
Google Scholar
Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum, New York (1981)
Book MATH Google Scholar
Bezdek, J.C., Pal, N.R.: Some new indexes of cluster validity. IEEE Trans. Syst. Man Cybern. 28(3), 301–315 (1998)
Article Google Scholar
Bhuyan, J.N., Raghavan, V.V., Elayavalli, V.K.: Genetic algorithm for clustering with an ordered representation. In: Proc. Int. Conf. on Genetic Algorithm ’91, pp. 408–415. Morgan Kaufmann, San Mateo (1991)
Google Scholar
Bouchachia, A., Pedrycz, W.: Data clustering with partial supervision. Data Min. Knowl. Discov. 12(1), 47–78 (2006)
Article MathSciNet Google Scholar
Bouchachia, A., Pedrycz, W.: Enhancement of fuzzy clustering by mechanisms of partial supervision. Fuzzy Sets Syst. 157(13), 1733–1759 (2006)
Article MathSciNet MATH Google Scholar
Bradley, P.S., Fayyad, U.M., Reina, C.: Scaling clustering algorithms to large databases. In: Proc. Fourth International Conference on Knowledge Discovery and Data Mining, pp. 9–15 (1998)
Google Scholar
Calinski, R.B., Harabasz, J.: A dendrite method for cluster analysis. Commun. Stat., Theory Methods 3(1), 1–27 (1974)
MathSciNet MATH Google Scholar
Carpenter, G., Grossberg, S.: A massively parallel architecture for a self-organizing neural pattern recognition machine. Comput. Vis. Graph. Image Process. 37(3), 54–115 (1987)
Article MATH Google Scholar
Carpenter, G., Grossberg, S.: ART2: Self-organization of stable category recognition codes for analog input patterns. Appl. Opt. 26(23), 4919–4930 (1987)
Article Google Scholar
Charalampidis, D.: A modified K-means algorithm for circular invariant clustering. IEEE Trans. Pattern Anal. Mach. Intell. 27(12), 1856–1865 (2005)
Article Google Scholar
Chen, Y.L., Hu, H.L.: An overlapping cluster algorithm to provide non-exhaustive clustering. Eur. J. Oper. Res. 173, 762–780 (2006)
Article MathSciNet MATH Google Scholar
Choi, J.N., Oh, S.K., Pedrycz, W.: Structural and parametric design of fuzzy inference systems using hierarchical fair competition-based parallel genetic algorithms and information granulation. Int. J. Approx. Reason. 49(3), 631–648 (2008)
Article Google Scholar
Chou, C.H., Su, M.C., Lai, E.: Symmetry as a new measure for cluster validity. In: 2nd WSEAS Int. Conf. on Scientific Computation and Soft Computing, Crete, Greece, pp. 209–213 (2002)
Google Scholar
Chung, K.L., Lin, J.S.: Faster and more robust point symmetry-based K-means algorithm. Pattern Recognit. 40(2), 410–422 (2007)
Article MathSciNet MATH Google Scholar
Chung, K.L., Lin, K.S.: An efficient line symmetry-based K-means algorithm. Pattern Recognit. Lett. 27(7), 765–772 (2006)
Article Google Scholar
Cole, R.M.: Clustering with genetic algorithms. Master’s thesis, Department of Computer Science, University of Western Australia, Australia (1998)
Google Scholar
Cowgill, M.C., Harvey, R.J., Watson, L.T.: A genetic algorithm approach to cluster analysis. Comput. Math. Appl. 37(7), 99–108 (1999)
Article MathSciNet MATH Google Scholar
Dave, R.N.: Use of the adaptive fuzzy clustering algorithm to detect lines in digital images. Intell. Robots Comput. Vis. VIII 1192, 600–611 (1989)
Google Scholar
Dave, R.N., Bhaswan, K.: Adaptive fuzzy c-shells clustering and detection of ellipses. IEEE Trans. Neural Netw. 3(5), 643–662 (1992)
Article Google Scholar
Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 1(4), 224–227 (1979)
Article Google Scholar
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc., Ser. B 39(1), 1–38 (1977)
MathSciNet MATH Google Scholar
Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. Wiley, New York (1973)
MATH Google Scholar
Dunn, J.C.: A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J. Cybern. 3(3), 32–57 (1973)
Article MathSciNet MATH Google Scholar
Ester, M., Kriegal, H.P., Sander, J.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. of the Second International Conference on Knowledge Discovery and Data-mining, pp. 226–231. AAAI Press, Menlo Park (1996)
Google Scholar
Ester, M., Kriegel, H.P., Sander, J., Xu, X.: Density-based algorithm for discovering clusters in large spatial databases. In: Proceedings of the Second International Conference on Data Mining (KDD’96), Portland, OR, pp. 226–231 (1996)
Google Scholar
Estivill-Castro, V., Murray, A.T.: Spatial clustering for data mining with genetic algorithms. In: Proceedings of the International ICSC Symposium on Engineering of Intelligent Systems, pp. 317–323 (1997)
Google Scholar
Everitt, B.S., Landau, S., Leese, M.: Cluster Analysis. Arnold, London (2001)
MATH Google Scholar
Falkenauer, E.: Genetic Algorithms and Grouping Problems. Wiley, New York (1998)
Google Scholar
Fränti, P., Kivijärvi, J., Kaukoranta, T., Nevalainen, O.: Genetic algorithms for large scale clustering problems. Comput. J. 40, 547–554 (1997)
Article Google Scholar
Gan, G., Ma, C., Wu, J.: Data Clustering – Theory, Algorithms, and Applications. SIAM, Philadelphia (2007)
Book MATH Google Scholar
Gath, I., Geva, A.B.: Unsupervised optimal fuzzy clustering. IEEE Trans. Pattern Anal. Mach. Intell. 11(7), 773–781 (1989)
Article Google Scholar
Grabusts, P., Borisov, A.: Using grid-clustering methods in data classification. In: 2002 International Conference on Parallel Computing in Electrical Engineering (PARELEC 2002), 22–25 September 2002, Warsaw, Poland, pp. 425–426. IEEE Comput. Soc., Los Alamitos (2002)
Chapter Google Scholar
Gustafson, D.E., Kessel, W.C.: Fuzzy clustering with a fuzzy covariance matrix. Proc. IEEE Conf. Decision Contr. 17, 761–766 (1979)
Google Scholar
Handl, J., Knowles, J.: An evolutionary approach to multiobjective clustering. IEEE Trans. Evol. Comput. 11(1), 56–76 (2007)
Article Google Scholar
Hansen, P., Mladenovic, N.: J-means: A new local search heuristic for minimum sum of squares clustering. Pattern Recognit. 34(2), 405–413 (2001)
Article MATH Google Scholar
Hartuv, E., Shamir, R.: A clustering algorithm based on graph connectivity. Inf. Process. Lett. 76, 175–181 (2000). http://dl.acm.org/citation.cfm?id=364456.364469
Article MathSciNet MATH Google Scholar
Hartwig, F., Dearing, B.: Exploratory Data Analysis. Sage, Thousand Oaks (1979)
Google Scholar
Hertz, J., Krogh, A., Palmer, R.G.: Introduction to the Theory of Neural Computation. Addison-Wesley, Reading (1991)
Google Scholar
Holland, J.H.: Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor (1975)
Google Scholar
Höppner, F.: Fuzzy shell clustering algorithms in image processing: Fuzzy c-rectangular and 2-rectangular shells. IEEE Trans. Fuzzy Syst. 5(4), 599–613 (1997)
Article Google Scholar
Hruschka, E.R., Campello, R.J.G.B., de Castro, L.N.: Evolutionary algorithms for clustering gene-expression data. In: Proc. 4th IEEE Int. Conference on Data Mining, pp. 403–406 (2004)
Chapter Google Scholar
Hruschka, E.R., Campello, R.J.G.B., de Castro, L.N.: Improving the efficiency of a clustering genetic algorithm. In: Proc. 9th Ibero-American Conference on Artificial Intelligence, Lecture Notes in Computer Science, vol. 3315, pp. 861–870 (2004)
Google Scholar
Hruschka, E.R., Campello, R.J.G.B., de Castro, L.N.: Evolving clusters in gene-expression data. Inf. Sci. 176(13), 1898–1927 (2006)
Article Google Scholar
Hruschka, E.R., Campello, R.J.G.B., Freitas, A.A., de Carvalho, A.C.P.L.F.: A survey of evolutionary algorithms for clustering. IEEE Trans. Syst. Man Cybern., Part C, Appl. Rev. 39(2), 133–155 (2009)
Article Google Scholar
Hruschka, E.R., Ebecken, N.F.F.: A genetic algorithm for cluster analysis. Intell. Data Anal. 7(1), 15–25 (2003)
Google Scholar
Hu, M.K.: Visual pattern recognition by moment invariants. IEEE Trans. Inf. Theory 8(2), 179–187 (1962)
Article MATH Google Scholar
Hughes, E.J.: Evolutionary many-objective optimization: Many once or one many. In: Proceedings of 2005 Congress on Evolutionary Computation, Edinburgh, Scotland, UK, September 2–5, 2005, pp. 222–227 (2005)
Chapter Google Scholar
Ingber, L.: Very fast simulated re-annealing. Math. Comput. Model. 12(8), 967–973 (1989)
Article MathSciNet MATH Google Scholar
Ishibuchi, H., Doi, T., Nojima, Y.: Incorporation of scalarizing fitness functions into evolutionary multiobjective optimization algorithms. In: Parallel Problem Solving from Nature IX (PPSN-IX), vol. 4193, pp. 493–502 (2006)
Chapter Google Scholar
Ishibuchi, H., Murata, T.: A multi-objective genetic local search algorithm and its application to flowshop scheduling. IEEE Trans. Syst. Man Cybern., Part C, Appl. Rev. 28(3), 392–403 (1998)
Article Google Scholar
Ishibuchi, H., Yoshida, T., Murata, T.: Balance between genetic search and local search in memetic algorithms for multiobjective permutation flowshop scheduling. IEEE Trans. Evol. Comput. 6(6), 721–741 (1984)
Google Scholar
Jaccard, P.: Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bull. Soc. Vaud. Sci. Nat. 37, 547–579 (1901)
Google Scholar
Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs (1988)
MATH Google Scholar
Jain, A.K., Duin, P., Jianchang, M.: Statistical pattern recognition: A review. IEEE Trans. Pattern Anal. Mach. Intell. 22(1), 4–37 (2000)
Article Google Scholar
Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: A review. ACM Comput. Surv. 31(3), 264–323 (1999)
Article Google Scholar
Jardine, N., Sibson, R.: Mathematical Taxonomy. Wiley, New York (1971)
MATH Google Scholar
Jaszkiewicz, A.: Comparison of local search-based metaheuristics on the multiple objective knapsack problem. Found. Comput. Dec. Sci. 26(1), 99–120 (2001)
MathSciNet Google Scholar
Jolliffe, I.: Principal Component Analysis. Springer Series in Statistics. Springer, England (1986)
Google Scholar
Kandel, A.: Fuzzy Techniques in Pattern Recognition. Wiley-Interscience, New York (1982)
MATH Google Scholar
Kandel, A.: Fuzzy Mathematical Techniques with Applications. Addison-Wesley, New York (1986)
MATH Google Scholar
Kankanala, L., Murty, M.N.: Hybrid approaches for clustering. In: PReMI, pp. 25–32 (2007)
Google Scholar
Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R., Wu, A.Y.: An efficient K-means clustering algorithm: analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 881–892 (2002)
Article Google Scholar
Kaufman, L., Rosseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, New York (1990)
Book Google Scholar
Kim, D.J., Park, Y.W., Park, D.J.: A novel validity index for determination of the optimal number of clusters. IEICE Trans. Inf. Syst. D-E84(2), 281–285 (2001)
Google Scholar
Kim, D.W., Lee, K.H., Lee, D.: Fuzzy cluster validation index based on inter-cluster proximity. Pattern Recognit. Lett. 24(15), 2561–2574 (2003)
Article Google Scholar
Kim, T.H., Barrera, L.O., Zheng, M., Qu, C., Singer, M.A., Richmond, T.A., Wu, Y., Green, R.D., Ren, B.: A high-resolution map of active promoters in the human genome. Nature 436, 876–880 (2005)
Article Google Scholar
Kim, Y.I., Kim, D.W., Lee, D., Lee, K.H.: A cluster validation index for GK cluster analysis based on relative degree of sharing. Inf. Sci. 168(1–4), 225–242 (2004)
Article MATH Google Scholar
Kirkpatrick, S.: Optimization by simulated annealing: Quantitative studies. J. Stat. Phys. 34(5/6), 975–986 (1984)
Article MathSciNet Google Scholar
Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220, 671–680 (1983)
Article MathSciNet MATH Google Scholar
Kirpatrick, S., Vecchi, M.P.: Global wiring by simulated annealing. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. CAD-2(4), 215–222 (1983)
Google Scholar
Knowles, J.D., Corne, D.W.: Approximating the nondominated front using the Pareto archived evolution strategy. Evol. Comput. 8(2), 149–172 (2000)
Article Google Scholar
Kohonen, T.: The ‘neural’ phonetic typewriter. IEEE Comput. 27(3), 11–12 (1988)
Article Google Scholar
Kohonen, T.: Self-Organization and Associative Memory, 3rd edn. Springer, New York (1989)
Book Google Scholar
Konak, A., Coit, D., Smith, A.: Multi-objective optimization using genetic algorithms: A tutorial. Reliab. Eng. Syst. Saf. 91(9), 992–1007 (2006). http://linkinghub.elsevier.com/retrieve/pii/S0951832005002012
Article Google Scholar
Korkmaz, E.E., Du, J., Alhajj, R., Barker, K.: Combining advantages of new chromosome representation scheme and multi-objective genetic algorithms for better clustering. Intell. Data Anal. 10(2), 163–182 (2006)
Google Scholar
Kövesi, B., Boucher, J.M., Saoodi, S.: Stochastic K-means algorithm for vector quantization. Pattern Recognit. Lett. 22(6–7), 603–610 (2001)
Article MATH Google Scholar
Krishna, K., Murty, M.N.: Genetic K-means algorithm. IEEE Trans. Syst. Man Cybern., Part B, Cybern. 29(3), 433–439 (1999)
Article Google Scholar
Krishnapuram, R., Nasraoui, O., Frigui, H.: The fuzzy c-spherical shells algorithm: A new approach. IEEE Trans. Neural Netw. 3(5), 663–671 (1992)
Article Google Scholar
Krovi, R.: Genetic algorithms for clustering: A preliminary investigation. In: Proceedings of the 25th Hawaii Int. Conference on System Sciences, vol. 4, pp. 540–544 (1992)
Google Scholar
Kuncheva, L.I., Bezdek, J.C.: Selection of cluster prototypes from data by a genetic algorithm. In: Proceedings of the 5th European Congress on Intelligent Techniques and Soft Computing, pp. 1683–1688 (1997)
Google Scholar
Kwanghoon, S., Jung, K.H., Alexander, W.E.: A mean field annealing approach to robust corner detection. IEEE Trans. Syst. Man Cybern., Part B, Cybern. 28(1), 82–90 (1998)
Article Google Scholar
Kwon, S.H.: Cluster validity index for fuzzy clustering. Electron. Lett. 34(22), 2176–2177 (1998)
Article Google Scholar
Lance, G., Williams, W.: Mixed-data classificatory programs. I. Agglomerative systems. Aust. Comput. J. 1, 15–20 (1967)
Google Scholar
Lange, T., Roth, V., Braun, M.L., Buhmann, J.M.: Stability-based validation of clustering solutions. Neural Comput. 16(6), 1299–1323 (2004)
Article MATH Google Scholar
Laszlo, M., Mukherjee, S.: A genetic algorithm using hyper-quadtrees for low-dimensional K-means clustering. IEEE Trans. Pattern Anal. Mach. Intell. 28(4), 533–543 (2006)
Article Google Scholar
Leon, E., Nasraoui, O., Gomez, J.: ECSAGO: Evolutionary clustering with self adaptive genetic operators. In: Proc. IEEE Congress on Evolutionary Computation, July 16–21, 2006, pp. 1768–1775 (2006)
Google Scholar
Likas, A., Vlassis, N., Verbeek, J.J.: The global k-means clustering algorithm. Pattern Recognit. 36, 451–461 (2003)
Article Google Scholar
Lin, J.Y., Peng, H., Xie, J.M., Zheng, Q.L.: Novel clustering algorithm based on central symmetry. In: Proceedings of 2004 International Conference on Machine Learning and Cybernetics, 26–29 August 2004, vol. 3, pp. 1329–1334 (2004)
Google Scholar
Loia, V., Pedrycz, W., Senatore, S.: Semantic web content analysis: A study in proximity-based collaborative clustering. IEEE Trans. Fuzzy Syst. 15(6), 1294–1312 (2007)
Article Google Scholar
Lu, Y., Lu, S., Fotouhi, F., Deng, Y., Brown, S.J.: FGKA: A fast genetic K-means clustering algorithm. In: SAC ’04: Proceedings of the 2004 ACM Symposium on Applied Computing, pp. 622–623. ACM, New York (2004)
Chapter Google Scholar
Lu, Y., Lu, S., Fotouhi, F., Deng, Y., Brown, S.J.: Incremental genetic K-means algorithm and its application in gene expression data analysis. BMC Bioinform. 5, 172 (2004)
Article Google Scholar
Lucasius, C.B., Dane, A.D., Kateman, G.: On K-medoid clustering of large data sets with the aid of a genetic algorithm: Background, feasibility and comparison. Anal. Chim. Acta 282, 647–669 (1993)
Article Google Scholar
Ma, P.C.H., Chan, K.C.C., Yao, X., Chiu, D.K.Y.: An evolutionary clustering algorithm for gene expression microarray data analysis. IEEE Trans. Evol. Comput. 10(3), 296–314 (2006)
Article Google Scholar
Man, Y., Gath, I.: Detection and separation of ring-shaped clusters using fuzzy clustering. IEEE Trans. Pattern Anal. Mach. Intell. 16(8), 855–861 (1994)
Article Google Scholar
Mao, J., Jain, A.K.: A self-organizing network for hyperellipsoidal clustering. IEEE Trans. Neural Netw. 7(1), 16–29 (1996)
Article Google Scholar
Marden, J.I.: Analyzing and Modeling Rank Data. Chapman & Hall, London (1995)
MATH Google Scholar
Matake, N., Hiroyasu, T., Miki, M., Senda, T.: Multiobjective clustering with automatic k-determination for large-scale data. In: GECCO ’07: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation, pp. 861–868. ACM, New York (2007)
Chapter Google Scholar
Maulik, U., Bandyopadhyay, S.: Genetic algorithm based clustering technique. Pattern Recognit. 33(9), 1455–1465 (2000)
Article Google Scholar
Maulik, U., Bandyopadhyay, S.: Performance evaluation of some clustering algorithms and validity indices. IEEE Trans. Pattern Anal. Mach. Intell. 24(12), 1650–1654 (2002)
Article Google Scholar
Maulik, U., Bandyopadhyay, S.: Fuzzy partitioning using a real-coded variable-length genetic algorithm for pixel classification. IEEE Trans. Geosci. Remote Sens. 41(5), 1075–1081 (2003)
Article Google Scholar
Maulik, U., Bandyopadhyay, S., Trinder, J.: SAFE: An efficient feature extraction technique. J. Knowl. Inf. Syst. 3(3), 374–387 (2001)
Article MATH Google Scholar
Maulik, U., Mukhopadhyay, A., Bandyopadhyay, S.: Combining Pareto-optimal clusters using supervised learning for identifying co-expressed genes. BMC Bioinform. 27, 1197–1208 (2009)
Google Scholar
Maulik, U., Bandyopadhyay, S., Mukhopadhyay, A.: Multiobjective Genetic Algorithms for Clustering – Applications in Data Mining and Bioinformatics. Springer, Heidelberg (2011)
Book MATH Google Scholar
Merz, P., Zell, A.: Clustering gene expression profiles with memetic algorithms. In: PPSN VII: Proceedings of the 7th International Conference on Parallel Problem Solving from Nature, pp. 811–820. Springer, London (2002)
Chapter Google Scholar
Metropolis, N., Rosenbluth, A.W., Rosenbloth, M.N., Teller, A.H., Teller, E.: Equation of state calculation by fast computing machines. J. Chem. Phys. 21(6), 1087–1092 (1953)
Article Google Scholar
Mezzich, J.E.: Evaluating clustering methods for psychiatric-diagnosis. Biol. Psychiatry 13, 265–281 (1978)
Google Scholar
Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. Springer, New York (1992)
MATH Google Scholar
Mikheev, A., Vincent, L., Faber, V.: High-quality polygonal contour approximation based on relaxation. In: Proceedings of the Sixth International Conference on Document Analysis and Recognition, p. 361. IEEE Comput. Soc., Washington (2001). http://dl.acm.org/citation.cfm?id=876867.877738
Chapter Google Scholar
Milligan, G.: An algorithm for generating artificial test clusters. Psychometrika 50(1), 123–127 (1981)
Article MathSciNet Google Scholar
Milligan, G.W., Cooper, C.: An examination of procedures for determining the number of clusters in a data set. Psychometrika 50(2), 159–179 (1985)
Article Google Scholar
Mitchell, T.: Machine Learning. McGraw-Hill, New York (1997)
MATH Google Scholar
Mount, D.M., Arya, S.: ANN: A library for approximate nearest neighbor searching (2005). http://www.cs.umd.edu/~mount/ANN
Mukhopadhyay, A., Maulik, U.: Unsupervised pixel classification in satellite imagery using multiobjective fuzzy clustering combined with SVM classifier. IEEE Trans. Geosci. Remote Sens. 47(4), 1132–1138 (2009)
Article Google Scholar
Mukhopadhyay, A., Maulik, U., Bandyopadhyay, S.: Multi-objective genetic algorithm based fuzzy clustering of categorical attributes. IEEE Trans. Evol. Comput. 13(5), 991–1005 (2009)
Article Google Scholar
Murthy, C.A., Chowdhury, N.: In search of optimal clusters using genetic algorithms. Pattern Recognit. Lett. 17(8), 825–832 (1996)
Article Google Scholar
Nam, D., Park, C.H.: Pareto-based cost simulated annealing for multiobjective optimization. In: Proceedings of the 4th Asia-Pacific Conference on Simulated Evolution and Learning (SEAL’02), vol. 2, pp. 522–526. Nanyang Technical University, Orchid Country Club, Singapore (2002)
Google Scholar
Nam, D.K., Park, C.H.: Multiobjective simulated annealing: A comparative study to evolutionary algorithms. Int. J. Fuzzy Syst. 2(2), 87–97 (2000)
Google Scholar
Ng, R., Han, J.: Efficient and effective clustering method for spatial data mining. In: Proceedings of the 1994 International Conference on Very Large Data Bases, Santiago, Chile, pp. 144–155 (1994)
Google Scholar
Ng, R., Han, J.: Knowledge discovery in large spatial databases: Focusing techniques for efficient class identification. In: Proceedings of the 4th International Symposium on Large Spatial Databases (SSD’95), Portland, ME, pp. 67–82 (1995)
Google Scholar
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, Heidelberg (2007)
Google Scholar
Nock, R., Nielsen, F.: On weighting clustering. IEEE Trans. Pattern Anal. Mach. Intell. 28(8), 1223–1235 (2006)
Article Google Scholar
Pakhira, M.K., Maulik, U., Bandyopadhyay, S.: Validity index for crisp and fuzzy clusters. Pattern Recognit. 37(3), 487–501 (2004)
Article MATH Google Scholar
Pal, P., Chanda, B.: A symmetry based clustering technique for multi-spectral satellite imagery. In: ICVGIP (2002)
Google Scholar
Pal, S.K.: Fuzzy set theoretic measures for automatic feature evaluation – II. Inf. Sci. 64, 165–179 (1992)
Article MATH Google Scholar
Pal, S.K., Majumder, D.D.: Fuzzy Mathematical Approach to Pattern Recognition. Wiley, New York (1986)
MATH Google Scholar
Pal, S.K., Mandal, D.P.: Linguistic recognition system based on approximate reasoning. Inf. Sci. 61, 135–161 (1992)
Article Google Scholar
Pal, S.K., Mitra, S.: Fuzzy versions of Kohonen’s net and MLP-based classification: Performance evaluation for certain nonconvex decision regions. Inf. Sci. 76, 297–337 (1994)
Article MATH Google Scholar
Park, Y.J., Song, M.S.: A genetic algorithm for clustering problems. In: Proc. 3rd Annual Conference on Genetic Programming, Paris, France, pp. 568–575 (1998)
Google Scholar
Pavlidis, T.: Structural Pattern Recognition. Springer, Berlin (1977)
MATH Google Scholar
Pedrycz, W.: A fuzzy cognitive structure for pattern recognition. Pattern Recognit. Lett. 9(5), 305–313 (1989)
Article MATH Google Scholar
Pedrycz, W.: Fuzzy sets in pattern recognition: Methodology and methods. Pattern Recognit. 23, 121–146 (1990)
Article Google Scholar
Pedrycz, W.: Fuzzy clustering with a knowledge-based guidance. Pattern Recognit. Lett. 25(4), 469–480 (2004)
Article MathSciNet Google Scholar
Pedrycz, W.: Knowledge-based clustering in computational intelligence. In: Challenges for Computational Intelligence, pp. 317–341. Springer, Heidelberg (2007)
Chapter Google Scholar
Pedrycz, W.: A dynamic data granulation through adjustable fuzzy clustering. Pattern Recognit. Lett. 29(16), 2059–2066 (2008)
Article Google Scholar
Pedrycz, W., Amato, A., Lecce, V.D., Piuri, V.: Fuzzy clustering with partial supervision in organization and classification of digital images. IEEE Trans. Fuzzy Syst. 16(4), 1008–1026 (2008)
Article Google Scholar
Pedrycz, W., Hirota, K.: A consensus-driven fuzzy clustering. Pattern Recognit. Lett. 29(9), 1333–1343 (2008)
Article Google Scholar
Pedrycz, W., Loia, V., Senatore, S.: P-FCM: A proximity-based fuzzy clustering. Fuzzy Sets Syst. 148(1), 21–41 (2004)
Article MathSciNet MATH Google Scholar
Pedrycz, W., Rai, P.: Collaborative clustering with the use of fuzzy c-means and its quantification. Fuzzy Sets Syst. 159(18), 2399–2427 (2008)
Article MathSciNet MATH Google Scholar
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1993)
Google Scholar
Raftery, A.: A note on Bayes factors for log-linear contingency table models with vague prior information. J. R. Stat. Soc. 48(2), 249–250 (1986)
MathSciNet MATH Google Scholar
Rechenberg, I.: Evolutionsstrategie: Optimierung Technischer Systeme nach Prinzipien der Biologischen Evolution. Frommann-Holzboog, Stuttgart (1973)
Google Scholar
Richards, J.A.: Remote Sensing Digital Image Analysis: An Introduction. Springer, New York (1993)
Book Google Scholar
Ripon, K.S.N., Tsang, C.H., Kwong, S., Ip, M.K.: Multi-objective evolutionary clustering using variable-length real jumping genes genetic algorithm. In: ICPR’06: Proceedings of the 18th International Conference on Pattern Recognition, pp. 1200–1203. IEEE Comput. Soc., Washington (2006)
Google Scholar
Rousseeuw, P.: Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)
Article MATH Google Scholar
Ruck, D.W., Rogers, S.K., Kabrisky, M.: Feature selection using a multilayer perceptron. Network 2(2), 1–14 (1990). http://portal.acm.org.offcampus.lib.washington.edu/citation.cfm?id=1497653.1498412
Google Scholar
Rudolph, G.: Convergence analysis of canonical genetic algorithms. IEEE Trans. Neural Netw. 5(1), 96–101 (1994)
Article Google Scholar
Runyon, R., Haber, A.: Fundamentals of Behavioral Statistics. Addison-Wesley, Reading (1976)
Google Scholar
Saha, S., Bandyopadhyay, S.: A new line symmetry distance and its application to data clustering. J. Comput. Sci. Technol. 24(3), 544–556 (2009)
Article Google Scholar
Saha, S., Bandyopadhyay, S.: A new multiobjective simulated annealing based clustering technique using symmetry. Pattern Recognit. Lett. 30(15), 1392–1403 (2009)
Article Google Scholar
Saha, S., Bandyopadhyay, S.: A new point symmetry based fuzzy genetic clustering technique for automatic evolution of clusters. Inf. Sci. 179(19), 3230–3246 (2009)
Article MATH Google Scholar
Saha, S., Bandyopadhyay, S.: A new symmetry based multiobjective clustering technique for automatic evolution of clusters. Pattern Recognit. 43(3), 738–751 (2010)
Article MATH Google Scholar
Saha, S., Bandyopadhyay, S.: A generalized automatic clustering algorithm in a multiobjective framework. Appl. Soft Comput. 13(1), 89–108 (2013)
Article Google Scholar
Saha, S., Bandyopadhyay, S.: Application of a new symmetry based cluster validity index for satellite image segmentation. IEEE Geosci. Remote Sens. Lett. 5(2), 166–170 (2008)
Article Google Scholar
Saha, S., Bandyopadhyay, S.: Performance evaluation of some symmetry based cluster validity indices. IEEE Trans. Syst. Man Cybern., Part C, Appl. Rev. 39(4), 420–425 (2009)
Article Google Scholar
Saha, S., Bandyopadhyay, S.: A validity index based on connectivity. In: ICAPR, pp. 91–94. IEEE Comput. Soc., Los Alamitos (2009)
Google Scholar
Saha, S., Bandyopadhyay, S.: On principle axis based line symmetry clustering techniques. Memetic Comput. 3(2), 129–144 (2011)
Article Google Scholar
Saha, S., Maulik, U.: Use of symmetry and stability for data clustering. Evol. Intell. 3(3-4), 103–122 (2010)
Article MATH Google Scholar
Sander, J., Ester, M., Kriegel, H.P., Xu, X.: Density-based clustering in spatial databases: The algorithm GDBSCAN and its applications. Data Min. Knowl. Discov. 2(2), 169–194 (1998)
Schaffer, J.: Multiple objective optimization with vector evaluated genetic algorithms. In: Genetic Algorithms and Their Applications: Proceedings of the First International Conference on Genetic Algorithms, pp. 93–100 (1985)
Schott, J.R.: Fault tolerant design using single and multi-criteria genetic algorithms. Master’s thesis, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA (1995)
Selim, S.Z., Ismail, M.A.: K-means type algorithms: A generalized convergence theorem and characterization of local optimality. IEEE Trans. Pattern Anal. Mach. Intell. 6, 81–87 (1984)
Serafini, P.: Simulated annealing for multiple objective optimization problems. In: Proceedings of the Tenth International Conference on Multiple Criteria Decision Making: Expand and Enrich the Domains of Thinking and Application, vol. 1, pp. 283–292. Springer, Berlin (1994)
Sheng, W., Liu, X.: A hybrid algorithm for k-medoid clustering of large data sets. In: Proceedings of IEEE Congress on Evolutionary Computation, pp. 77–82 (2004)
Sheng, W., Swift, S., Zhang, L., Liu, X.: A weighted sum validity function for clustering with a hybrid niching genetic algorithm. IEEE Trans. Syst. Man Cybern., Part B, Cybern. 35(6), 1156–1167 (2005)
Siedlecki, W., Sklansky, J.: A note on genetic algorithms for large-scale feature selection. Pattern Recognit. Lett. 10, 335–347 (1989). http://dl.acm.org/citation.cfm?id=78354.78362
Alves, V.S., Campello, R.J.G.B., Hruschka, E.R.: Towards a fast evolutionary algorithm for clustering. In: Proc. IEEE Congress on Evolutionary Computation, pp. 6240–6247 (2006)
Smith, K.I., Everson, R.M., Fieldsend, J.E.: Dominance measures for multi-objective simulated annealing. In: Proceedings of the 2004 IEEE Congress on Evolutionary Computation (CEC’04), pp. 23–30 (2004)
Smith, K.I., Everson, R.M., Fieldsend, J.E., Murphy, C., Misra, R.: Dominance-based multi-objective simulated annealing. IEEE Trans. Evol. Comput. 12(3), 323–342 (2008)
Sontag, E., Sussman, H.: Image restoration and segmentation using the annealing algorithm. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. CAD-2(4), 215–222 (1983)
Spath, H.: Cluster Analysis Algorithms. Ellis Horwood, Chichester (1989)
Srinivas, N., Deb, K.: Multiobjective optimization using nondominated sorting in genetic algorithms. Evol. Comput. 2(3), 221–248 (1994)
Srinivas, M., Patnaik, L.M.: Adaptive probabilities of crossover and mutation in genetic algorithms. IEEE Trans. Syst. Man Cybern. 24(4), 656–667 (1994)
Staiano, A., Tagliaferri, R., Pedrycz, W.: Improving RBF networks performance in regression tasks by means of a supervised fuzzy clustering. Neurocomputing 69(13-15), 1570–1581 (2006)
Storn, R., Price, K.: Differential evolution – A simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 11(4), 341–359 (1997)
Su, M.C., Chou, C.H.: Application of associative memory in human face detection. In: 1999 International Joint Conference on Neural Networks, pp. 3194–3197 (1999)
Su, M.C., Chou, C.H.: A modified version of the K-means algorithm with a distance based on cluster symmetry. IEEE Trans. Pattern Anal. Mach. Intell. 23(6), 674–680 (2001)
Su, M.C., DeClaris, N., Liu, T.K.: Application of neural networks in cluster analysis. In: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 1997, ‘Computational Cybernetics and Simulation’, Orlando, FL, vol. 1, pp. 1–6 (1997)
Su, M.C., Liu, Y.C.: A new approach to clustering data with arbitrary shapes. Pattern Recognit. 38, 1887–1901 (2005)
Suckling, J., Sigmundsson, T., Greenwood, K., Bullmore, E.: A modified fuzzy clustering algorithm for operator independent brain tissue classification of dual echo MR images. J. Magn. Reson. Imaging 17(7), 1065–1076 (1999)
Suman, B.: Study of self-stopping PDMOSA and performance measure in multiobjective optimization. Comput. Chem. Eng. 29(5), 1131–1147 (2005)
Suman, B.: Multiobjective simulated annealing – A metaheuristic technique for multiobjective optimization of a constrained problem. Found. Comput. Dec. Sci. 27(3), 171–191 (2002)
Suman, B.: Simulated annealing based multiobjective algorithm and their application for system reliability. Eng. Optim. 35(4), 391–416 (2003)
Suman, B.: Study of simulated annealing based multiobjective algorithm for multiobjective optimization of a constrained problem. Comput. Chem. Eng. 28(9), 1849–1871 (2004)
Suman, B., Kumar, P.: A survey of simulated annealing as a tool for single and multiobjective optimization. J. Oper. Res. Soc. 57(10), 1143–1160 (2006)
Suppapitnarm, A., Seffen, K., Parks, G., Clarkson, P.: A simulated annealing algorithm for multiobjective optimization. Eng. Optim. 33(1), 59–85 (2000)
Szu, H.H., Hartley, R.L.: Fast simulated annealing. Phys. Lett. A 122(3–4), 157–162 (1987)
Tan, P.N., Steinbach, M., Kumar, V.: Introduction to Data Mining. Springer, Berlin (2005)
Teknomo, K.: Similarity measurement. http://people.revoledu.com/kardi/tutorial/Similarity/
Theodoridis, S., Koutroumbas, K.: Pattern Recognition, 3rd edn. Academic Press, Orlando (2006)
Tibshirani, R., Walther, G., Hastie, T.: Estimating the number of clusters in a data set via the gap statistic. J. R. Stat. Soc., Ser. B 63(2), 411–423 (2001)
Tou, J.T., Gonzalez, R.C.: Pattern Recognition Principles. Addison-Wesley, Reading (1974)
Toussaint, G.T.: The relative neighborhood graph of a finite planar set. Pattern Recognit. 12, 261–268 (1980)
Toussaint, G.T.: Pattern recognition and geometrical complexity. In: Proc. Fifth International Conf. on Pattern Recognition, Miami Beach, December 1980, pp. 1324–1347 (1980)
Tseng, L., Yang, S.: Genetic algorithms for clustering, feature selection, and classification. In: Proceedings of the IEEE International Conference on Neural Networks, Houston, pp. 1612–1616 (1997)
Tuyttens, D., Teghem, J., El-Sherbeny, N.: A particular multiobjective vehicle routing problem solved by simulated annealing. In: Metaheuristics for Multiobjective Optimization, vol. 535, pp. 133–152 (2003)
Ulungu, E.L., Teghem, J., Fortemps, P., Tuyttens, D.: MOSA method: A tool for solving multiobjective combinatorial decision problems. J. Multi-Criteria Decis. Anal. 8(4), 221–236 (1999)
Varma, S., Asharaf, S., Murty, M.N.: Rough core vector clustering. In: PReMI, pp. 304–310 (2007)
Vijaya, P.A., Murty, M.N., Subramanian, D.K.: Leaders-subleaders: An efficient hierarchical clustering algorithm for large data sets. Pattern Recognit. Lett. 25(4), 505–513 (2004)
Vijaya, P.A., Murty, M.N., Subramanian, D.K.: An efficient hybrid hierarchical agglomerative clustering (HHAC) technique for partitioning large data sets. In: PReMI, pp. 583–588 (2005)
Wang, W., Zhang, Y.: On fuzzy cluster validity indices. Fuzzy Sets Syst. 158(19), 2095–2117 (2007)
Wang, W., Yang, J., Muntz, R.R.: STING: A statistical information grid approach to spatial data mining. In: Proceedings of the 23rd International Conference on Very Large Data Bases, VLDB’97, pp. 186–195. Morgan Kaufmann, San Francisco (1997). http://dl.acm.org/citation.cfm?id=645923.758369
Wang, W., Yang, J., Muntz, R.R.: STING+: An approach to active spatial data mining. In: Proceedings of the 15th IEEE International Conference on Data Engineering, Sydney, Australia, March 1999, pp. 116–125 (1999)
Webb, A.: Statistical Pattern Recognition. Wiley, Chichester (2002)
Wong, C., Chen, C., Su, M.: A novel algorithm for data clustering. Pattern Recognit. 34, 425–442 (2001)
Xie, X.L., Beni, G.: A validity measure for fuzzy clustering. IEEE Trans. Pattern Anal. Mach. Intell. 13(8), 841–847 (1991)
Yan, H.: Fuzzy curve-tracing algorithm. IEEE Trans. Syst. Man Cybern., Part B, Cybern. 31(5), 768–780 (2001)
Yip, A.M., Ding, C., Chan, T.F.: Dynamic cluster formation using level set methods. IEEE Trans. Pattern Anal. Mach. Intell. 28(6), 877–889 (2006)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: An efficient data clustering method for very large databases. In: Proc. of the 1996 ACM SIGMOD International Conference on Management of Data, pp. 103–114 (1996)

Author information

Authors and Affiliations

Machine Intelligence Unit, Indian Statistical Institute, Kolkata, West Bengal, India
Sanghamitra Bandyopadhyay
Dept. of Computer Science, Indian Institute of Technology, Patna, India
Sriparna Saha

About this chapter

Cite this chapter

Bandyopadhyay, S., Saha, S. (2013). Clustering Algorithms. In: Unsupervised Classification. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32451-2_4

DOI: https://doi.org/10.1007/978-3-642-32451-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32450-5
Online ISBN: 978-3-642-32451-2
eBook Packages: Computer Science, Computer Science (R0)