# An Approximation Algorithm for a Problem of Partitioning a Sequence into Clusters with Constraints on Their Cardinalities


## Abstract

We consider the problem of partitioning a finite sequence of points in Euclidean space into a given number of clusters (subsequences) so as to minimize the sum, over all clusters, of the intracluster sums of squared distances from the cluster elements to the cluster centers. The center of one of the desired clusters is assumed to be the origin, while the centers of the other clusters are unknown and are defined as the mean values of the cluster elements. In addition, the elements of the sequence that belong to the clusters with unknown centers must satisfy the following structural constraints: (1) the concatenation of the indices of the elements of these clusters is an increasing sequence, (2) the difference between two consecutive indices is bounded below and above by prescribed constants, and (3) the total number of elements in these clusters is given as part of the input. It is shown that the problem is strongly NP-hard. A 2-approximation algorithm that runs in polynomial time for a fixed number of clusters is proposed for this problem.
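The objective and constraints described above can be illustrated with a minimal sketch. It only evaluates a candidate partition; it is not the 2-approximation algorithm of the paper. All names (`objective`, `is_feasible`, `t_min`, `t_max`, `total`) and the 0-based indexing are assumptions made for illustration. The sketch also assumes that every element not placed in a cluster with an unknown center belongs to the cluster centered at the origin, that `t_min >= 1` (so the gap bound implies increasing indices), and that clusters with unknown centers are nonempty.

```python
from typing import List, Sequence

Point = Sequence[float]


def sq_norm(v: Sequence[float]) -> float:
    """Squared Euclidean norm of a vector."""
    return sum(x * x for x in v)


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a nonempty list of points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def is_feasible(clusters: List[List[int]], t_min: int, t_max: int, total: int) -> bool:
    """Check constraints (1)-(3) for the clusters with unknown centers.

    Assumes t_min >= 1, so the gap bound also forces increasing indices.
    Empty clusters are rejected (an assumption of this sketch).
    """
    if any(len(c) == 0 for c in clusters):
        return False
    flat = [i for c in clusters for i in c]  # concatenation of cluster indices
    if len(flat) != total:  # constraint (3): prescribed total size
        return False
    # constraints (1) and (2): bounded gaps between consecutive indices
    return all(t_min <= b - a <= t_max for a, b in zip(flat, flat[1:]))


def objective(seq: List[Point], clusters: List[List[int]]) -> float:
    """Sum of intracluster squared distances to the cluster centers.

    Clusters in `clusters` use their mean as center; every remaining
    element is charged its squared distance to the origin.
    """
    chosen = set()
    cost = 0.0
    for c in clusters:
        pts = [seq[i] for i in c]
        mu = centroid(pts)
        cost += sum(sq_norm([x - m for x, m in zip(p, mu)]) for p in pts)
        chosen.update(c)
    cost += sum(sq_norm(p) for i, p in enumerate(seq) if i not in chosen)
    return cost


# Hypothetical toy data: five points in the plane, two clusters with unknown centers.
seq = [(1.0, 0.0), (3.0, 0.0), (0.0, 1.0), (10.0, 0.0), (12.0, 0.0)]
clusters = [[0, 1], [3, 4]]
feasible = is_feasible(clusters, t_min=1, t_max=2, total=4)
value = objective(seq, clusters)
```

In this toy instance the element at index 2 is left to the origin-centered cluster, so it contributes its squared norm to the objective, while the two other clusters contribute their squared deviations from their means.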

### Keywords

partitioning, sequence, Euclidean space, minimum sum of squared distances, NP-hardness, approximation algorithm
