Abstract
In this paper, a Gödel number-based encoding technique is proposed to encode each object of a dataset before applying any clustering algorithm. The encoding converts each object into a decimal string while maintaining the properties of its features. The results of the standard existing clustering algorithms after applying this encoding are evaluated using benchmark metrics such as the Silhouette Score, the Davies-Bouldin index, the Calinski-Harabasz index, and the Dunn Index. Compared with running the same clustering algorithms on the original dataset, applying the Gödel number-based encoding first gives better performance.
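The abstract does not spell out the encoding itself, so the following is only a minimal sketch of classical Gödel numbering, not the authors' exact scheme. It assumes each object has already been reduced to non-negative integer feature values; the helper names `first_primes`, `godel_encode`, and `godel_decode` are hypothetical and introduced purely for illustration. The point it shows is that a feature vector can be folded into one integer (whose decimal representation is the "decimal string") without losing the individual feature values.

```python
from typing import List, Sequence


def first_primes(n: int) -> List[int]:
    """Return the first n primes by trial division (adequate for few features)."""
    primes: List[int] = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def godel_encode(features: Sequence[int]) -> int:
    """Classical Goedel number: product of p_i ** a_i over features a_1..a_n.

    Assumption: features are non-negative integers (e.g. already discretized).
    """
    if any(a < 0 for a in features):
        raise ValueError("features must be non-negative integers")
    code = 1
    for p, a in zip(first_primes(len(features)), features):
        code *= p ** a
    return code


def godel_decode(code: int, n_features: int) -> List[int]:
    """Recover the feature values by counting how often each prime divides code."""
    if code < 1:
        raise ValueError("a Goedel number is a positive integer")
    features: List[int] = []
    for p in first_primes(n_features):
        exponent = 0
        while code % p == 0:
            code //= p
            exponent += 1
        features.append(exponent)
    return features


if __name__ == "__main__":
    obj = [2, 0, 1]                 # a toy object with three integer features
    code = godel_encode(obj)        # 2**2 * 3**0 * 5**1
    print(str(code))                # decimal string for the object
    print(godel_decode(code, len(obj)))
```

Because prime factorization is unique, decoding returns the original feature values, which is one sense in which such an encoding can preserve feature properties. The number of features must be supplied when decoding, since trailing zero-valued features leave no trace in the product.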
This work is partially supported by Start-up Research Grant (File number: SRG/2022/002098), SERB, Govt. of India.
Acknowledgment
The authors are grateful to Prof. Sukanta Das for his valuable comments.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Narodia Parth, P., Bhattacharjee, K. (2023). Gödel Number Based Encoding Technique for Effective Clustering. In: Maji, P., Huang, T., Pal, N.R., Chaudhury, S., De, R.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2023. Lecture Notes in Computer Science, vol 14301. Springer, Cham. https://doi.org/10.1007/978-3-031-45170-6_6
DOI: https://doi.org/10.1007/978-3-031-45170-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45169-0
Online ISBN: 978-3-031-45170-6
eBook Packages: Computer Science, Computer Science (R0)