
Distance based Incremental Clustering for Mining Clusters of Arbitrary Shapes

  • Bidyut Kr. Patra
  • Ollikainen Ville
  • Raimo Launonen
  • Sukumar Nandi
  • Korra Sathya Babu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8251)

Abstract

Clustering is recognized as one of the important tasks in data mining, and distance based methods form an important class of clustering techniques. To reduce the computational and storage burden of the classical clustering methods, many distance based hybrid clustering methods have been proposed. However, these methods are not suitable for cluster analysis in a dynamic environment, where the underlying data distribution, and consequently the clustering structure, changes over time. In this paper, we propose a distance based incremental clustering method that can find arbitrary shaped clusters in fast changing dynamic scenarios. The proposed method is based on the recently proposed al-SL method, which can be applied successfully to large static datasets. In the incremental version of al-SL (termed IncrementalSL), we exploit important characteristics of the al-SL method to handle frequent updates of patterns in the given dataset. The IncrementalSL method produces exactly the same clustering results as the al-SL method. To show the effectiveness of IncrementalSL on a dynamically changing database, we experimented with one synthetic dataset and one real world dataset.
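The property claimed above, that an incremental method can reproduce exactly the clustering of its static counterpart, can be illustrated with a much simpler stand-in. The sketch below is not al-SL or IncrementalSL, whose details are not given in this abstract. It assumes plain single-link clustering cut at a fixed distance threshold `h`, supports insertions only, and uses hypothetical names (`IncrementalThresholdSL`, `batch_clusters`). Under those assumptions, inserting points one at a time and merging every cluster within distance `h` of the new point gives the same partition as recomputing the clusters from scratch.

```python
from math import dist


class IncrementalThresholdSL:
    """Illustrative only: single-link clusters at a fixed threshold h, insert-only.

    This is NOT the al-SL / IncrementalSL method of the paper; it is a
    simplified stand-in used to show incremental/batch equivalence.
    """

    def __init__(self, h):
        self.h = h
        self.points = []
        self.parent = []  # union-find forest over point indices

    def _find(self, i):
        # Find the cluster representative, compressing the path as we go.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def insert(self, p):
        # Add the point as its own cluster, then merge it with every
        # existing cluster that has a member within distance h of it.
        idx = len(self.points)
        self.points.append(p)
        self.parent.append(idx)
        for j in range(idx):
            if dist(p, self.points[j]) <= self.h:
                self.parent[self._find(j)] = self._find(idx)
        return idx

    def clusters(self):
        groups = {}
        for i in range(len(self.points)):
            groups.setdefault(self._find(i), []).append(i)
        return sorted(sorted(g) for g in groups.values())


def batch_clusters(points, h):
    """Static reference: connected components of the h-neighbourhood graph."""
    n = len(points)
    seen, out = set(), []
    for s in range(n):
        if s in seen:
            continue
        stack, comp = [s], []
        seen.add(s)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in range(n):
                if v not in seen and dist(points[u], points[v]) <= h:
                    seen.add(v)
                    stack.append(v)
        out.append(sorted(comp))
    return sorted(out)


data = [(0, 0), (1, 0), (5, 5), (2, 0), (5, 6)]
inc = IncrementalThresholdSL(h=1.0)
for point in data:
    inc.insert(point)
print(inc.clusters())             # [[0, 1, 3], [2, 4]]
print(batch_clusters(data, 1.0))  # [[0, 1, 3], [2, 4]]
```

Each insertion here compares the new point with all stored points, and deletions are not supported. The sketch shows only the equivalence between incremental and static results, not the efficiency or the update handling that the paper targets.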

Keywords

Incremental clustering, arbitrary shaped clusters, large datasets

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Bidyut Kr. Patra (1, 3)
  • Ollikainen Ville (1)
  • Raimo Launonen (1)
  • Sukumar Nandi (2)
  • Korra Sathya Babu (3)

  1. VTT Technical Research Centre of Finland, Espoo, Finland
  2. Indian Institute of Technology Guwahati, Guwahati, India
  3. National Institute of Technology Rourkela, Rourkela, Odisha, India
