Abstract
We consider the problem of partitioning a finite sequence of points in Euclidean space into a given number of clusters (subsequences) so as to minimize the sum, over all clusters, of the intracluster sums of squared distances from the cluster elements to the cluster centers. The center of one of the desired clusters is assumed to be the origin, while the centers of the other clusters are unknown and are defined as the means of the cluster elements. In addition, the elements of the sequence that belong to the clusters with unknown centers must satisfy several structural constraints: (1) the concatenation of the indices of the elements of these clusters is an increasing sequence, (2) the difference between two consecutive indices is bounded from below and above by prescribed constants, and (3) the total number of elements in these clusters is given as part of the input. It is shown that the problem is strongly NP-hard. A 2-approximation algorithm is proposed that runs in polynomial time when the number of clusters is fixed.
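To make the objective and the constraints concrete, the following minimal Python sketch evaluates a candidate partition. It is only an illustration of the problem statement, not the approximation algorithm of the paper. The function names (`objective`, `is_feasible`) and the parameters `t_min`, `t_max`, and `total` are hypothetical, and indices are 0-based here. Points that are not assigned to any cluster with an unknown center are treated as members of the cluster centered at the origin.

```python
from typing import List, Sequence

Point = Sequence[float]


def sq_norm(v: Sequence[float]) -> float:
    """Squared Euclidean norm of a vector."""
    return sum(x * x for x in v)


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    m = len(points)
    return [sum(col) / m for col in zip(*points)]


def is_feasible(clusters: List[List[int]], t_min: int, t_max: int, total: int) -> bool:
    """Check constraints (1)-(3) for the clusters with unknown centers.

    Assumption: `clusters` lists the index sets in the order in which
    they are concatenated; t_min, t_max, total are illustrative names.
    """
    idx = [i for c in clusters for i in c]  # concatenated index sequence
    if len(idx) != total:                   # (3) prescribed total size
        return False
    # (1) strictly increasing and (2) bounded gaps between consecutive indices
    return all(b > a and t_min <= b - a <= t_max for a, b in zip(idx, idx[1:]))


def objective(points: List[Point], clusters: List[List[int]]) -> float:
    """Sum of intracluster squared distances to the cluster centers."""
    used = set()
    cost = 0.0
    for c in clusters:
        if not c:  # an empty cluster contributes nothing
            continue
        members = [points[i] for i in c]
        mu = centroid(members)  # unknown center = mean of the cluster
        cost += sum(sq_norm([x - m for x, m in zip(p, mu)]) for p in members)
        used.update(c)
    # remaining points belong to the cluster centered at the origin
    cost += sum(sq_norm(p) for i, p in enumerate(points) if i not in used)
    return cost


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (0.0, 0.5), (10.0, 0.0), (12.0, 0.0)]
clusters = [[1, 2], [4, 5]]
print(is_feasible(clusters, t_min=1, t_max=2, total=4))
print(objective(points, clusters))
```

In this toy instance each of the two clusters with unknown centers contributes 2 to the cost, and the two remaining points contribute their squared norms, 0 and 0.25.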
Additional information
Original Russian Text © A.V. Kel’manov, L.V. Mikhailova, S.A. Khamidullin, V.I. Khandeev, 2016, published in Trudy Instituta Matematiki i Mekhaniki UrO RAN, 2016, Vol. 22, No. 3.
About this article
Cite this article
Kel’manov, A.V., Mikhailova, L.V., Khamidullin, S.A. et al. An Approximation Algorithm for a Problem of Partitioning a Sequence into Clusters with Constraints on Their Cardinalities. Proc. Steklov Inst. Math. 299 (Suppl 1), 88–96 (2017). https://doi.org/10.1134/S0081543817090115
DOI: https://doi.org/10.1134/S0081543817090115