A linear programming based heuristic for a hard clustering problem on trees

  • Isabella Lari
  • Maurizio Maravalle
  • Bruno Simeone
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


Clustering problems with relational constraints in which the underlying graph is a tree arise in a variety of applications: hierarchical database paging, communication and distribution network districting, biological taxonomy, and others. They are formulated here as optimal tree partitioning problems. A previous paper showed that their computational complexity depends strongly on the nature of the objective function and, in particular, that minimizing the total within-cluster dissimilarity or the diameter is computationally hard. We propose a heuristic that finds good partitions for the first problem within a reasonable time, even for large instances. The heuristic is based on the solution of a linear program and of a maximum network flow problem, and in every case it yields an explicit estimate of the relative approximation error. With minor variations, a similar approach yields good solutions for the minimum diameter problem.
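To make the problem statement concrete, the following is a minimal sketch of the first objective (total within-cluster dissimilarity) on a tree, solved by brute-force enumeration. It is not the heuristic proposed in the chapter, which relies on a linear program and a network flow computation. The function names, the assumption that a partition into p clusters is obtained by deleting p - 1 tree edges, and the gap formula (heuristic value minus lower bound, divided by the lower bound) are illustrative assumptions, not taken from the text.

```python
from itertools import combinations


def components(n, edges):
    """Connected components of a forest on vertices 0..n-1 (simple union-find)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


def within_cluster_dissimilarity(clusters, d):
    """Sum of d[i][j] over all unordered pairs lying in the same cluster."""
    return sum(d[i][j] for c in clusters for i, j in combinations(c, 2))


def best_tree_partition(n, tree_edges, d, p):
    """Exhaustive search (assumption: a p-partition = deleting p - 1 tree edges).

    Only practical for tiny trees; the problem is computationally hard in general.
    """
    best_value, best_clusters = None, None
    for removed in combinations(tree_edges, p - 1):
        kept = [e for e in tree_edges if e not in removed]
        clusters = components(n, kept)
        value = within_cluster_dissimilarity(clusters, d)
        if best_value is None or value < best_value:
            best_value, best_clusters = value, clusters
    return best_value, best_clusters


def relative_gap(heuristic_value, lower_bound):
    """Hypothetical error estimate: gap between a feasible value and a lower bound."""
    return (heuristic_value - lower_bound) / lower_bound


# Toy instance: a path 0-1-2-3 whose vertices sit at positions 0, 1, 5, 6.
positions = [0, 1, 5, 6]
d = [[abs(a - b) for b in positions] for a in positions]
path_edges = [(0, 1), (1, 2), (2, 3)]
value, clusters = best_tree_partition(4, path_edges, d, p=2)
print(value, clusters)  # 2 [[0, 1], [2, 3]]
```

On this toy path, cutting the middle edge gives two tight clusters, while cutting either end edge leaves a cluster of three vertices with total dissimilarity 10. Any lower bound on the optimum, for example from a linear programming relaxation, can be passed to `relative_gap` together with a heuristic value to bound how far that value is from optimal.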

Key words

Contiguity-constrained clustering 




References

  1. Ahuja, R.K. & Magnanti, T.L. & Orlin, J.B. (1993). Network flows, Prentice Hall, New Jersey.
  2. Delattre, M. & Hansen, P. (1980). Bicriterion cluster analysis, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2, 277–291.
  3. Ferligoj, A. & Batagelj, V. (1982). Clustering with relational constraints, Psychometrika, 47, 413–426.
  4. Garey, M.R. & Johnson, D.S. (1979). Computers and intractability: a guide to the theory of NP-completeness, Freeman, San Francisco.
  5. Hansen, P. & Jaumard, B. & Simeone, B. & Doring, V. (1993). Maximum split clustering under connectivity constraints, Cahier G-93-06, GERAD, Montréal.
  6. Lari, I. & Maravalle, M. & Simeone, B. (1998). Linear Programming based heuristics for two hard clustering problems on trees, Technical Report, Dip. di Statistica, Probabilità e Statistiche Applicate, n. 2, 1998.
  7. Lefkovitch, L.P. (1980). Conditional Clustering, Biometrics, 36, 43–58.
  8. Lovász, L. (1979). Combinatorial Problems and Exercises, North Holland, Amsterdam.
  9. Maravalle, M. & Simeone, B. & Naldini, R. (1997). Clustering on trees, Computational Statistics and Data Analysis, 24, 217–234.
  10. Marcotorchino, J.F. & Michaud, P. (1979). Optimisation en analyse des données, Masson.
  11. Mulvey, J.M. & Crowder, H.P. (1979). Cluster analysis: an application of Lagrangian relaxation, Management Science, 25, 329–340.
  12. Murtagh, F. (1985). A survey of algorithms for contiguity-constrained clustering and related problems, The Computer Journal, 28, 82–88.
  13. Nemhauser, G.L. & Rinnooy Kan, A.H.G. & Todd, M.J. (1989). Handbooks in Operations Research and Management Science, vol. I, Elsevier Science Publishers B.V., Amsterdam.
  14. Nemhauser, G.L. & Wolsey, L.A. (1988). Integer and Combinatorial Optimization, Wiley, New York.
  15. Picard, J.C. (1976). Maximal closure of a graph and applications to combinatorial problems, Management Science, 22, 1268–1272.
  16. Rao, M.R. (1971). Cluster analysis and mathematical programming, Journal of the American Statistical Association, 66, 622–626.
  17. Tardos, E. (1986). A strongly polynomial algorithm to solve combinatorial linear programs, Operations Research, 34, 250–256.

Copyright information

© Springer-Verlag Berlin · Heidelberg 1998

Authors and Affiliations

  • Isabella Lari (1)
  • Maurizio Maravalle (2)
  • Bruno Simeone (1)

  1. Department of Statistics, “La Sapienza” University, Rome, Italy
  2. Department of Systems for Economics, University of L’Aquila, L’Aquila, Italy
