Abstract
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a classic density-based clustering algorithm that can identify clusters of arbitrary shape and has been widely used in cluster analysis. Because DBSCAN uses a single global pair of parameters, ε and MinPts, it cannot recover the correct number of clusters on unbalanced data, and the clustering results are unsatisfactory. To solve this problem, this paper proposes LP-DBSCAN, a clustering algorithm that uses local parameters for unbalanced data. The algorithm first divides the data set into multiple data regions using the density peaks clustering (DPC) algorithm; the size and shape of each region depend on the density characteristics of the samples. It then sets appropriate parameters for each region and clusters it locally, and finally merges the data regions. The algorithm is simple and easy to implement. Experimental results show that it overcomes this limitation of DBSCAN and can handle both arbitrarily shaped and unbalanced data. On unbalanced data in particular, its clustering results are clearly better than those of the other algorithms compared.
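The effect of a single global ε on data of uneven density, and the idea of clustering each region with its own parameters, can be illustrated with a small sketch. This is not the authors' LP-DBSCAN implementation. It assumes the region assignment is already given (the paper obtains it with DPC, which is not reproduced here), it picks each region's ε with a hypothetical heuristic (1.5 times the mean nearest-neighbour distance), and it replaces the paper's merge step with simple relabelling. The names `dbscan`, `local_dbscan`, `region_of`, and `scale` are illustrative only.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 means noise."""
    n = len(points)
    labels = [None] * n  # None = not visited yet

    def neighbors(i):
        # A point counts as its own neighbour, as in the usual definition.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:  # only core points expand the cluster
                queue.extend(nb)
    return labels

def local_dbscan(points, region_of, min_pts=3, scale=1.5):
    """Run DBSCAN per region with a region-specific eps (assumed heuristic).

    Assumes every region holds at least two points.
    """
    labels = [-1] * len(points)
    next_id = 0
    for region in sorted(set(region_of)):
        idx = [i for i, r in enumerate(region_of) if r == region]
        pts = [points[i] for i in idx]
        # Hypothetical choice: eps = scale * mean nearest-neighbour distance.
        nn = [min(math.dist(pts[a], pts[b]) for b in range(len(pts)) if b != a)
              for a in range(len(pts))]
        eps = scale * sum(nn) / len(nn)
        local = dbscan(pts, eps, min_pts)
        for i, lab in zip(idx, local):
            if lab != -1:
                labels[i] = next_id + lab  # keep cluster ids globally unique
        next_id += max(local, default=-1) + 1
    return labels

def n_clusters(labels):
    return len(set(labels) - {-1})

# Two dense clusters close together, plus one sparse cluster far away.
dense_a = [(float(x), 0.0) for x in range(0, 10)]
dense_b = [(float(x), 0.0) for x in range(20, 30)]
sparse_c = [(float(x), 0.0) for x in range(100, 200, 20)]
points = dense_a + dense_b + sparse_c
region_of = [0] * 20 + [1] * 5  # assumed to come from a prior partition step

small = dbscan(points, eps=1.5, min_pts=3)   # eps tuned to the dense clusters
large = dbscan(points, eps=25.0, min_pts=3)  # eps tuned to the sparse cluster
local = local_dbscan(points, region_of)

print(n_clusters(small), small.count(-1))
print(n_clusters(large), large.count(-1))
print(n_clusters(local), local.count(-1))
```

On this toy data, the small global ε finds the two dense clusters but marks the whole sparse cluster as noise, while the large global ε recovers the sparse cluster but merges the two dense ones. Only the per-region run reports three clusters. No single ε works here because the sparse cluster needs ε of at least 20, whereas the dense clusters are separated by a gap of only 11.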
Acknowledgements
As time goes by, I have completed most of my postgraduate studies. First, I must thank my mentor for guiding me as I got started, for motivating me to move forward, and for his help with the research ideas of this paper. Second, I thank my colleagues in the lab for helping me revise the paper and for their guidance on the experiments. Finally, I sincerely thank all the experts and professors who took time from their busy schedules to review the manuscript.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Diao, K., Liang, Y., Fan, J. (2018). An Improved DBSCAN Algorithm Using Local Parameters. In: Zhou, ZH., Yang, Q., Gao, Y., Zheng, Y. (eds) Artificial Intelligence. ICAI 2018. Communications in Computer and Information Science, vol 888. Springer, Singapore. https://doi.org/10.1007/978-981-13-2122-1_1
DOI: https://doi.org/10.1007/978-981-13-2122-1_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2121-4
Online ISBN: 978-981-13-2122-1
eBook Packages: Computer Science, Computer Science (R0)