Partition-based clustering in object bases: From theory to practice

  • Carsten Gerlhof
  • Alfons Kemper
  • Christoph Kilger
  • Guido Moerkotte
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 730)

Abstract

We classify clustering algorithms into sequence-based techniques, which transform the object net into a linear sequence, and partition-based clustering algorithms. Tsangaris and Naughton [TN91, TN92] have shown that the partition-based techniques are superior. However, their work is based on a single partitioning algorithm, the Kernighan and Lin heuristic, which is not applicable to realistically large object bases because of its high running-time complexity. The contribution of this paper is two-fold. (1) We devise a new class of greedy object graph partitioning algorithms (GGP) whose running-time complexity is moderate while still yielding good-quality results. (2) Our extensive quantitative analysis of all well-known partitioning algorithms indicates that no single algorithm is superior for all object net characteristics. Therefore, we propose an adaptable clustering strategy based on a multi-dimensional grid: the dimensions correspond to particular characteristics of the object base (e.g., number and size of objects, degree of object sharing), and the grid entries indicate the most suitable clustering algorithm for the particular configuration.
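
The abstract does not spell out how GGP works, so the following is only a minimal sketch of one plausible greedy partitioning scheme, assuming a Kruskal-style pass over weighted edges [Kru56]: edges are visited in order of decreasing weight, and the two partitions they connect are merged whenever the merged partition still fits on one page. The function name `greedy_partition`, the `(weight, a, b)` edge format, and the `page_size` parameter are illustrative assumptions, not names taken from the paper.

```python
from typing import Dict, List, Tuple


def greedy_partition(
    sizes: Dict[str, int],
    edges: List[Tuple[int, str, str]],
    page_size: int,
) -> List[List[str]]:
    """Group objects into page-sized partitions (illustrative sketch only).

    sizes     -- object id -> object size (hypothetical input format)
    edges     -- (weight, a, b) triples; a higher weight means a stronger
                 reason to store a and b on the same page
    page_size -- maximum total object size per partition
    """
    # Union-find over objects: every object starts in its own partition.
    parent = {obj: obj for obj in sizes}
    part_size = dict(sizes)

    def find(obj: str) -> str:
        # Walk up to the partition representative, halving the path.
        while parent[obj] != obj:
            parent[obj] = parent[parent[obj]]
            obj = parent[obj]
        return obj

    # Visit the heaviest edges first and merge whenever the result fits.
    for _weight, a, b in sorted(edges, key=lambda e: e[0], reverse=True):
        ra, rb = find(a), find(b)
        if ra != rb and part_size[ra] + part_size[rb] <= page_size:
            parent[rb] = ra
            part_size[ra] += part_size[rb]

    # Collect the members of each partition.
    groups: Dict[str, List[str]] = {}
    for obj in sizes:
        groups.setdefault(find(obj), []).append(obj)
    return sorted(sorted(members) for members in groups.values())


example_sizes = {"a": 40, "b": 40, "c": 40, "d": 40}
example_edges = [(10, "a", "b"), (8, "b", "c"), (5, "c", "d")]
print(greedy_partition(example_sizes, example_edges, page_size=100))
```

In this toy input the heaviest edge joins `a` and `b`; the next edge would push that partition past the page limit and is skipped, so `c` and `d` end up together on a second page. After the initial sort, the work per edge is nearly constant, which is the kind of moderate running time the abstract contrasts with the Kernighan and Lin heuristic.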

Keywords

Cluster Algorithm, Object Size, Object Base, External Cost, Cluster Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. [Ben90]
    V. Benzaken. An evaluation model for clustering strategies in the O2 object-oriented database system. In Proc. of the Int. Conf. on Database Theory (ICDT), pages 126–140. Springer-Verlag, 1990.
  2. [BKKG88]
    J. Banerjee, W. Kim, S. J. Kim, and J. F. Garza. Clustering a DAG for CAD databases. IEEE Trans. Software Eng., 14(11):1684–1699, Nov 1988.
  3. [Bre77]
    M. A. Breuer. A class of min-cut placement algorithms. In Proc. of the Design Automation Conference, pages 284–290, 1977.
  4. [CDRS86]
    M. Carey, D. DeWitt, J. Richardson, and E. Shekita. Object and file management in the EXODUS extensible database system. In Proc. of the Conf. on Very Large Data Bases (VLDB), pages 91–100, Kyoto, Japan, Aug 1986.
  5. [FM82]
    C. Fiduccia and R. M. Mattheyses. A linear-time heuristic for improving network partitions. In Proc. of the Design Automation Conference, pages 175–181, 1982.
  6. [GKKM92a]
    C. Gerlhof, A. Kemper, C. Kilger, and G. Moerkotte. Clustering in object bases. Technical Report 6/92, Fakultät für Informatik, Universität Karlsruhe, D-7500 Karlsruhe, Jun 1992.
  7. [GKKM92b]
    C. Gerlhof, A. Kemper, C. Kilger, and G. Moerkotte. Partition-based clustering in object bases: From theory to practice. Technical Report 92-34, RWTH Aachen, D-5100 Aachen, Dec 1992.
  8. [HK89]
    S. E. Hudson and R. King. Cactis: A self-adaptive, concurrent implementation of an object-oriented database management system. ACM Trans. on Database Systems, 14(3):291–321, Sep 1989.
  9. [HZ87]
    M. Hornick and S. Zdonik. A shared, segmented memory system for an object-oriented database. ACM Trans. Office Inf. Syst., 5(1):70–95, Jan 1987.
  10. [KL70]
    B. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal, 49(2):291–307, Feb 1970.
  11. [Kri84]
    B. Krishnamurthy. An improved min-cut algorithm for partitioning VLSI networks. IEEE Trans. on Comp., 33(5):438–446, 1984.
  12. [Kru56]
    J. B. Kruskal. On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. of the Amer. Math. Soc., 7:48–50, 1956.
  13. [Luk74]
    J. Lukes. Efficient algorithm for the partitioning of trees. IBM Journal of Research and Development, 18:217–224, 1974.
  14. [SM91]
    K. Shahookar and P. Mazumder. VLSI cell placement techniques. ACM Computing Surveys, 23(2):143–220, Jun 1991.
  15. [Sta84]
    J. Stamos. Static grouping of small objects to enhance performance of a paged virtual memory. ACM Trans. Comp. Syst., 2(2):155–180, May 1984.
  16. [TN91]
    M. M. Tsangaris and J. F. Naughton. A stochastic approach for clustering in object bases. In Proc. of the ACM SIGMOD Conf. on Management of Data, pages 12–21, Denver, CO, May 1991.
  17. [TN92]
    M. M. Tsangaris and J. F. Naughton. On the performance of object clustering techniques. In Proc. of the ACM SIGMOD Conf. on Management of Data, pages 144–153, San Diego, CA, Jun 1992.

Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Carsten Gerlhof (1)
  • Alfons Kemper (1)
  • Christoph Kilger (2)
  • Guido Moerkotte (2)
  1. Lehrstuhl für Dialogorientierte Systeme, Fakultät für Mathematik und Informatik, Universität Passau, Passau, Germany
  2. Fakultät für Informatik, Universität Karlsruhe, Karlsruhe, Germany
