Applied Intelligence, Volume 48, Issue 2, pp 432–444

A clustering algorithm with affine space-based boundary detection

Abstract

Clustering is an important technique in data mining. The algorithm proposed in this paper obtains clusters by first identifying boundary points, in contrast to existing methods that compute core cluster points and then expand outward to the boundary points. To achieve this, an affine space-based boundary detection algorithm is used to divide the data points into cluster boundary points and internal points. A connection matrix is then formed by establishing neighbor relationships between internal and boundary points, and clustering is performed on this matrix. The proposed clustering algorithm accurately detected clusters in datasets with different densities, shapes, and sizes, and it also performed well on high-dimensional datasets.
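
The boundary-first pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the affine space-based boundary test, so a simple stand-in heuristic is used here (a point is treated as a boundary point when its nearest neighbors lie mostly on one side of it). The function names, the parameters `k` and `threshold`, and the way the connection matrix is traversed are all assumptions made for illustration.

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean distance)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def is_boundary(points, i, nbrs, threshold):
    """Stand-in boundary test (NOT the paper's affine space-based test).

    Averages the unit vectors from points[i] to its neighbours. A long
    average vector means the neighbours sit mostly on one side.
    """
    dim = len(points[i])
    mean = [0.0] * dim
    for j in nbrs:
        d = math.dist(points[i], points[j])
        if d == 0.0:
            continue  # duplicate point: contributes no direction
        for a in range(dim):
            mean[a] += (points[j][a] - points[i][a]) / d
    return math.sqrt(sum(m * m for m in mean)) / len(nbrs) > threshold


def boundary_first_clustering(points, k=8, threshold=0.3):
    """Return (labels, boundary_flags); label -1 marks unassigned points."""
    n = len(points)
    nbrs = [knn(points, i, k) for i in range(n)]
    boundary = [is_boundary(points, i, nbrs[i], threshold) for i in range(n)]

    # Connection matrix: link k-NN pairs unless both points are boundary points.
    conn = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in nbrs[i]:
            if not (boundary[i] and boundary[j]):
                conn[i][j] = conn[j][i] = True

    # Grow clusters from internal points; boundary points are labelled
    # when reached but never expanded further.
    labels = [-1] * n
    cluster = 0
    for seed in range(n):
        if boundary[seed] or labels[seed] != -1:
            continue
        labels[seed] = cluster
        stack = [seed]
        while stack:
            u = stack.pop()
            for v in range(n):
                if conn[u][v] and labels[v] == -1:
                    labels[v] = cluster
                    if not boundary[v]:
                        stack.append(v)
        cluster += 1
    return labels, boundary


# Hypothetical toy data: two well-separated 5x5 grids of points.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
points = grid + [(x + 100.0, y) for x, y in grid]
labels, boundary = boundary_first_clustering(points)
```

In this toy setup the outer ring of each grid is flagged as boundary and the inner 3x3 block as internal. Boundary points never propagate a label, so two clusters that touch only along their boundaries are not merged; that design choice is part of this sketch, not a claim about the published method.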

Keywords

Data mining, Clustering algorithm, Boundary detection, Affine space

Notes

Acknowledgements

This work was supported by Basic and Advanced Technology Research Project of Henan Province (Grant No. 152300410191).

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. School of Information Engineering, Zhengzhou University, Zhengzhou, China
