Fully Dynamic Clustering of Metric Data Sets

  • Stefano Lodi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2405)

Abstract

The goal of cluster analysis [10] is to find homogeneous groups, or clusters, in data. Homogeneity is often made precise by means of a dissimilarity function on objects that takes low values on pairs of objects in the same cluster. Cluster analysis has also been investigated in data mining [5], with an emphasis on efficiency for data sets larger than main memory [4,6,8,9,16]. More recently, the growing importance of multimedia and transactional databases has stimulated interest in metric clustering, i.e. clustering in which the dissimilarity satisfies the triangle inequality.
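
As an illustration of the metric condition mentioned above, the following minimal Python sketch checks by brute force whether a dissimilarity function satisfies the triangle inequality on a small set of objects. It is not taken from the paper: the function names (`satisfies_triangle_inequality`, `euclidean`, `squared_euclidean`), the tolerance `tol`, and the sample points are all assumptions made for the example.

```python
import math
from itertools import permutations


def satisfies_triangle_inequality(objects, dissimilarity, tol=1e-12):
    """Return True if d(x, z) <= d(x, y) + d(y, z) holds for every
    ordered triple of distinct objects (brute force, cubic time)."""
    for x, y, z in permutations(objects, 3):
        if dissimilarity(x, z) > dissimilarity(x, y) + dissimilarity(y, z) + tol:
            return False
    return True


def euclidean(p, q):
    # Euclidean distance is a metric, so the check should succeed.
    return math.dist(p, q)


def squared_euclidean(p, q):
    # Squared Euclidean distance is a dissimilarity but not a metric.
    return math.dist(p, q) ** 2


# Hypothetical sample objects: three collinear points and one off the line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]

euclidean_is_metric = satisfies_triangle_inequality(points, euclidean)
squared_is_metric = satisfies_triangle_inequality(points, squared_euclidean)
```

On these points the Euclidean distance passes the check, while the squared Euclidean distance fails it, since the dissimilarity between (0, 0) and (2, 0) is 4 but the two intermediate steps through (1, 0) sum to 2. A brute-force check like this is only practical for tiny samples and says nothing about data sets larger than main memory.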

Keywords

External Memory, Dynamic Cluster, Dynamic Algorithm, Connectivity Query, Euler Tour
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Yi-Jen Chiang, Michael T. Goodrich, Edward F. Grove, Roberto Tamassia, Darren Erik Vengroff, and Jeffrey Scott Vitter. External-memory graph algorithms. In Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 139–149, San Francisco, California, 22–24 January 1995.
  2. Paolo Ciaccia, Marco Patella, and Pavel Zezula. M-tree: An efficient access method for similarity search in metric spaces. In VLDB’97, Proceedings of 23rd International Conference on Very Large Data Bases, pages 426–435, 1997.
  3. Martin Ester, Hans-Peter Kriegel, Jörg Sander, Michael Wimmer, and Xiaowei Xu. Incremental clustering for mining in a data warehousing environment. In Proc. 24th Int. Conf. Very Large Data Bases (VLDB), pages 323–333, 24–27 August 1998.
  4. Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. A density-based algorithm for discovering clusters in large spatial databases with noise. In Evangelos Simoudis, Jia Wei Han, and Usama Fayyad, editors, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), page 226. AAAI Press, 1996.
  5. Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, and Ramasamy Uthurusamy, editors. Advances in Knowledge Discovery and Data Mining. AAAI Press/MIT Press, March 1996.
  6. Venkatesh Ganti, Raghu Ramakrishnan, Johannes Gehrke, Allison Powell, and James French. Clustering large datasets in arbitrary metric spaces. In Proc. 15th IEEE Conf. Data Engineering (ICDE), 23–26 March 1999.
  7. K. Chidananda Gowda and G. Krishna. Agglomerative clustering using the concept of mutual nearest neighbourhood. Pattern Recognition, 10(2):105–112, 1978.
  8. S. Guha, R. Rastogi, and K. Shim. CURE: An efficient clustering algorithm for large databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD-98), volume 27(2) of ACM SIGMOD Record, pages 73–84, New York, June 1–4 1998. ACM Press.
  9. Sudipto Guha, Rajeev Rastogi, and Kyuseok Shim. ROCK: A robust clustering algorithm for categorical attributes. In Proceedings of the 15th International Conference on Data Engineering, 23–26 March 1999, Sydney, Australia, pages 512–521. IEEE Computer Society, 1999.
  10. John A. Hartigan. Clustering algorithms. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York, 1975.
  11. Jacob Holm, Kristian de Lichtenberg, and Mikkel Thorup. Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity. In Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, pages 79–89, Dallas, Texas, USA, May 23–26 1998. ACM.
  12. R. Jarvis and E. Patrick. Clustering using a similarity measure based on shared near neighbors. IEEE Transactions on Computers, 22(11):1025–1034, November 1973.
  13. Peter Bro Miltersen, Sairam Subramanian, Jeffrey Scott Vitter, and Roberto Tamassia. Complexity models for incremental computation. Theoretical Computer Science, 130(1):203–236, August 1994.
  14. Jeffrey Scott Vitter. External memory algorithms and data structures. In James Abello and Jeffrey Scott Vitter, editors, External Memory Algorithms and Visualization, DIMACS Series in Discrete Mathematics and Theoretical Computer Science. American Mathematical Society Press, Providence, RI, 1999.
  15. M. Anthony Wong and Tom Lane. A kth nearest neighbour clustering procedure. J. R. Statist. Soc. B, 45(3):362–368, 1983.
  16. Tian Zhang, Raghu Ramakrishnan, and Miron Livny. BIRCH: An efficient data clustering method for very large databases. In H. V. Jagadish and Inderpal Singh Mumick, editors, Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, pages 103–114, Montreal, Quebec, Canada, 4–6 June 1996. SIGMOD Record 25(2), June 1996.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Stefano Lodi (1)
  1. Department of Electronics, Computer Science, and Systems, University of Bologna, Bologna, Italy
