2DBSCAN with Local Outlier Detection
This research designs a new algorithm, based on the existing DBSCAN algorithm, to improve the quality of clustering. DBSCAN categorizes each data object as a core point, a border point, or a noise point, according to the density determined by its input parameters. However, DBSCAN assigns a border point to the same cluster as its core point. This leads to a disadvantage of DBSCAN popularly known as the problem of transitivity. The proposed algorithm, two DBSCAN with local outlier detection (2DBSCAN-LOD), tries to address this problem. The average silhouette width score is used to compare the quality of the clusters formed by the two algorithms. Tests of 2DBSCAN-LOD on different artificial datasets show that the average silhouette width score of its clusters is higher than that of the clusters formed by DBSCAN.
Keywords: DBSCAN, Clustering, Border points, Local outliers, Global outliers
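The two building blocks named in the abstract, DBSCAN's core/border/noise categories and the average silhouette width, can be illustrated with a minimal sketch. It shows only the standard definitions and is not the 2DBSCAN-LOD algorithm, whose details the abstract does not give. The function names `classify_points` and `average_silhouette_width`, the parameter names `eps` and `min_pts`, and the sample data are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' (standard DBSCAN definitions)."""
    # eps-neighbourhood of every point; a point counts as its own neighbour.
    neighbours = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    # A core point has at least min_pts points within distance eps.
    core = {i for i, nb in enumerate(neighbours) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbours):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            # Not dense itself, but within eps of some core point.
            labels.append("border")
        else:
            labels.append("noise")
    return labels


def average_silhouette_width(points, cluster_ids):
    """Mean of s(i) = (b - a) / max(a, b) over all points (Rousseeuw's silhouette)."""
    clusters = {}
    for idx, c in enumerate(cluster_ids):
        clusters.setdefault(c, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")
    scores = []
    for i, c in enumerate(cluster_ids):
        own = [j for j in clusters[c] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: s(i) = 0 by convention
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != c
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: a dense square, one point on its fringe, one far away.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (2.2, 1), (10, 10)]
print(classify_points(sample, eps=1.5, min_pts=4))
```

In this toy example the fringe point at `(2.2, 1)` is a border point: it is not dense itself, yet it lies within `eps` of a core point and so would join that point's cluster under plain DBSCAN. Border points are the case the abstract singles out. The silhouette function lies in [-1, 1], with higher values indicating tighter, better-separated clusters, which is the sense in which the abstract compares the two algorithms.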