The modern world has witnessed a surge in technological advancements that span various industries. In some sectors, such as search engines, bioinformatics, and pattern recognition, software applications must interpret sheer volumes of data to discover patterns that can provide great value for business analysis, development, and planning. This has emphasized the importance of fields of study such as clustering, a discipline that grew out of data mining and has gained momentum in recent decades. Clustering addresses this very problem: it analyzes large datasets and attempts to uncover data distributions and patterns by means of mostly unsupervised data classification [9]. Example clustering applications include multimedia analysis and retrieval [10], pattern recognition [15], and bioinformatics [5].
This chapter starts by providing an overview of existing clustering approaches. It then defines the key concepts used by the PYRAMID algorithm. It also presents the experiments conducted in Tout et al. [23], as well as further experiments on various datasets from Sheikholeslami et al. [21] that pose different challenges. Finally, it explores the independence of PYRAMID from user-supplied parameters and outlines future research directions.
References
Berkhin, P. (2002). Survey of clustering data mining techniques. Accrue Software. Retrieved February 28, 2005, from http://www.ee.ucr.edu/~barth/EE242/clustering_survey.pdf.
Berry, M.J. and Linoff, G. (1997). Data Mining Techniques: For Marketing, Sales, and Customer Support. New York: John Wiley and Sons.
Davis, L. (1991). Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold.
Deb, K. (2001). Multi-Objective Optimization Using Evolutionary Algorithms. New York: John Wiley and Sons.
Dettling, M. and Bühlmann, P. (2002). Supervised clustering of genes. Genome Biology, 3(12), 39–50.
Dorai, C. and Jain, A.K. (1995). Shape spectra based view grouping for free-form objects. Proceedings of the International Conference on Image Processing, Washington, DC, 3, 340–343.
Ester, M., Kriegel, H.P., Sander, J., and Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, 226–231.
Guha, S., Rastogi, R., and Shim, K. (1998). CURE: An efficient clustering algorithm for large databases. Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, Seattle, WA, 73–84.
Han, J. and Kamber, M. (2001). Data Mining, Concepts and Techniques. San Francisco: Morgan Kaufmann.
Hinneburg, A. and Keim, D.A. (1998). An efficient approach to clustering in large multimedia databases with noise. Proceedings of the Fourth International Conference on Knowledge Discovery in Databases, New York, 58–65.
Jain, A.K., Murty, M., and Flynn, P. (1999). Data clustering: A review. ACM Computing Surveys, 31(3), 264–323.
Karypis, G., Han, S., and Kumar, V. (1999). Chameleon: A hierarchical clustering using dynamic modeling. IEEE Computer: Data Analysis and Mining (Special Issue), 32(8), 68–75.
Kaufman, L. and Rousseeuw, P.J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. New York: John Wiley and Sons, Inc.
Kolatch, E. (2001). Clustering Algorithms for Spatial Databases: A Survey (Technical Report No. CMSC 725). Department of Computer Science, University of Maryland, College Park, 1–22.
Koza, J.R. (1991). Evolving a computer program to generate random numbers using the genetic programming paradigm. Proceedings of the Fourth International Conference on Genetic Algorithms, La Jolla, CA, 37–44.
Ohsawa, Y. and Nagashima, A. (2001). A spatio-temporal geographic information system based on implicit topology description: STIMS. Proceedings of the Third International Society for Photogrammetry and Remote Sensing (ISPRS) Workshop on Dynamic and Multi-Dimensional Geographic Information System, Thailand, 218–223.
Rasmussen, E. (1992). Clustering algorithms. Information Retrieval: Data Structures and Algorithms, 419–442. Upper Saddle River, NJ: Prentice-Hall.
Ripley, B.D. (1996). Pattern Recognition and Neural Networks. Cambridge, MA: Cambridge University Press.
Sarafis, I., Zalzala, A., and Trinder, P. (2002). A genetic rule-based data clustering toolkit. Proceedings of the 2002 World Congress on Evolutionary Computation, Honolulu, 1238–1243.
Sarafis, I., Zalzala, A., and Trinder, P. (2003). Mining comprehensive clustering rules with an evolutionary algorithm. Proceedings of the Genetic and Evolutionary Computation Conference, Chicago, 1–12.
Sheikholeslami, G., Chatterjee, S., and Zhang, A. (1998). WaveCluster: A multi-resolution clustering approach for very large spatial databases. Proceedings of the 24th International Conference on Very Large Data Bases, New York, 428–439.
Solberg, A., Taxt, T., and Jain, A. (1996). A Markov random field model for classification of multisource satellite imagery. IEEE Transactions on Geoscience and Remote Sensing, 34(1), 100–113.
Tout, S., Sverdlik, W., and Sun, J. (2006). Parallel hybrid clustering using genetic programming and multi-objective fitness with density (PYRAMID). Proceedings of the 2006 International Conference on Data Mining (DMIN’06), Las Vegas, NV, 197–203.
Wang, W., Yang, J., and Muntz, R. (1997). STING: A statistical information grid approach to spatial data mining. Proceedings of the 1997 International Conference on Very Large Data Bases, Athens, 186–195.
Zhang, T., Ramakrishnan, R., and Livny, M. (1996). BIRCH: An efficient data clustering method for very large databases. Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Montreal, 103–114.
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Sun, J., Sverdlik, W., Tout, S. (2008). A Hybrid Evolutionary Approach to Cluster Detection. In: Castillo, O., Xu, L., Ao, SI. (eds) Trends in Intelligent Systems and Computer Engineering. Lecture Notes in Electrical Engineering, vol 6. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-74935-8_42
DOI: https://doi.org/10.1007/978-0-387-74935-8_42
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-74934-1
Online ISBN: 978-0-387-74935-8
eBook Packages: Engineering, Engineering (R0)