2DBSCAN with Local Outlier Detection

  • Urja Pandya
  • Vidhi Mistry
  • Anjana Rathwa
  • Himani Kachroo
  • Anjali Jivani
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1097)

Abstract

This research designs a new algorithm, based on the existing DBSCAN algorithm, to improve the quality of clustering. DBSCAN categorizes each data object as a core point, a border point, or a noise point. These points are classified based on the density determined by the input parameters. However, in DBSCAN a border point is assigned to the same cluster as its core point. This leads to a disadvantage of DBSCAN that is popularly known as the problem of transitivity. The proposed algorithm, two DBSCAN with local outlier detection (2DBSCAN-LOD), tries to address this problem. The average silhouette width score is used to compare the quality of the clusters formed by both algorithms. Tests of 2DBSCAN-LOD on different artificial datasets show that the average silhouette width score of the clusters formed by 2DBSCAN-LOD is higher than that of the clusters formed by DBSCAN.
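The two standard building blocks named in the abstract can be illustrated with a minimal sketch. It is not the 2DBSCAN-LOD algorithm, whose steps the abstract does not describe. It only shows how plain DBSCAN labels points as core, border, or noise, and how an average silhouette width can be computed. The function names `classify_points` and `average_silhouette_width`, the parameter names `eps` and `min_pts`, and the toy data are assumptions made for illustration. Euclidean distance is also assumed, and a point counts as a member of its own neighborhood.

```python
from math import dist  # Euclidean distance, Python 3.8+


def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' (standard DBSCAN definitions)."""
    # eps-neighborhood of every point; the point itself is included (assumption)
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    is_core = [len(n) >= min_pts for n in neighbors]
    labels = []
    for i in range(len(points)):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbors[i]):
            # not dense itself, but within eps of at least one core point
            labels.append("border")
        else:
            labels.append("noise")
    return labels


def average_silhouette_width(points, cluster_ids):
    """Mean of s(i) = (b - a) / max(a, b); needs at least two clusters."""
    clusters = {}
    for idx, c in enumerate(cluster_ids):
        clusters.setdefault(c, []).append(idx)
    scores = []
    for i, c in enumerate(cluster_ids):
        own = [j for j in clusters[c] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: s(i) is taken as 0
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != c
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: a small square, one point hanging off it, one far away
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]
print(classify_points(toy, eps=1.0, min_pts=3))
```

In the toy data, the point `(2, 1)` is not dense enough to be a core point but lies within `eps` of the core point `(1, 1)`, so it is labelled a border point and would inherit that core point's cluster in plain DBSCAN. This is the behaviour the abstract identifies as problematic.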

Keywords

DBSCAN, Clustering, Border points, Local outliers, Global outliers

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Urja Pandya
    • 1
  • Vidhi Mistry
    • 1
  • Anjana Rathwa
    • 1
  • Himani Kachroo
    • 1
  • Anjali Jivani
    • 1
  1. Department of Computer Science and Engineering, The Maharaja Sayajirao University of Baroda, Vadodara, India
