Algorithms and Hardness Results for Nearest Neighbor Problems in Bicolored Point Sets

  • Sandip Banerjee
  • Sujoy Bhore
  • Rajesh Chitnis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10807)


In the context of computational supervised learning, the primary objective is the classification of data. Specifically, the goal is to provide the system with “training” data and design a method which uses the training data to classify new objects with the correct label. A standard scenario is that the examples are points from a metric space, and “nearby” points should have “similar” labels. In practice, it is desirable to reduce the size of the training set without compromising too much on the ability to correctly label new objects. Such subsets of the training data are called edited sets. Wilfong [SOCG ’91] defined two types of edited subsets: consistent subsets (those which correctly label all objects from the training data) and selective subsets (those which correctly label all new objects the same way as the original training data). This leads to the following two optimization problems:
  • k-MCS-(\(\mathcal X\)) (minimum consistent subset): Given k sets of points \(P_1, P_2, \ldots , P_k\) in a metric space \(\mathcal X\), the goal is to choose subsets of points \(P'_i \subseteq P_i\) for \(i=1,2,\ldots ,k\) such that, for each \(i\in [k]\) and every \(p \in P_i\), the nearest neighbor of p among \(\bigcup _{j=1}^{k} P'_j \) lies in \(P'_i\), while minimizing the quantity \(\sum _{i=1}^k |P'_i|\). (Note that we also enforce the condition \(|P'_i|\ge 1\ \forall \ i\in [k]\).)

  • k-MSS-(\(\mathcal X\)) (minimum selective subset): Given k sets of points \(P_1, P_2, \ldots , P_k\) in a metric space \(\mathcal X\), the goal is to choose subsets of points \(P'_i \subseteq P_i\) for \(i=1,2,\ldots ,k\) such that, for each \(i\in [k]\) and every \(p \in P_i\), the nearest neighbor of p among \(\Big (\bigcup _{j=1, j\ne i}^{k} P_j\Big ) \cup P'_i \) lies in \(P'_i\), while minimizing the quantity \(\sum _{i=1}^k |P'_i|\). (Note that we again enforce the condition \(|P'_i|\ge 1\ \forall \ i\in [k]\).)
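
The two conditions differ only in which points of the other classes a training point is compared against: the chosen subsets for consistency, the full original sets for selectivity. A minimal Python sketch of both checks follows. The function names, the representation of points as coordinate tuples, the use of Euclidean distance, and the rule that a distance tie counts in favor of the point's own class are all assumptions made for illustration; none of them comes from the paper.

```python
from math import dist, inf


def _min_dist(p, points):
    """Smallest Euclidean distance from p to any point in points (inf if empty)."""
    return min((dist(p, q) for q in points), default=inf)


def _valid_choice(classes, chosen):
    """Each chosen[i] (P'_i) must be a non-empty subset of classes[i] (P_i)."""
    return len(classes) == len(chosen) and all(
        len(sub) >= 1 and set(sub) <= set(full)
        for full, sub in zip(classes, chosen)
    )


def is_consistent(classes, chosen):
    """Every p in P_i has its nearest neighbor among all chosen points in P'_i.

    Assumption: a tie between P'_i and another class is resolved in favor of P'_i.
    """
    if not _valid_choice(classes, chosen):
        return False
    k = len(classes)
    for i, P_i in enumerate(classes):
        for p in P_i:
            own = _min_dist(p, chosen[i])
            rival = min((_min_dist(p, chosen[j]) for j in range(k) if j != i),
                        default=inf)
            if own > rival:
                return False
    return True


def is_selective(classes, chosen):
    """Every p in P_i has its nearest neighbor among P'_i and the FULL other
    classes P_j (j != i) in P'_i. Same tie assumption as above.
    """
    if not _valid_choice(classes, chosen):
        return False
    k = len(classes)
    for i, P_i in enumerate(classes):
        for p in P_i:
            own = _min_dist(p, chosen[i])
            rival = min((_min_dist(p, classes[j]) for j in range(k) if j != i),
                        default=inf)
            if own > rival:
                return False
    return True
```

For instance, with classes `[[(0,), (2,)], [(3,), (5,)]]` on the real line and chosen subsets `[[(0,)], [(5,)]]`, the choice passes the consistency check but fails the selectivity check, because the point 2 is closer to the unchosen point 3 of the other class than to its own representative 0.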

While there have been several heuristics proposed for these two problems in the computer vision and machine learning communities, the only theoretical results for these problems (to the best of our knowledge) are due to Wilfong [SOCG ’91], who showed that both 3-MCS-(\(\mathbb {R}^2\)) and 2-MSS-(\(\mathbb {R}^2\)) are NP-complete. We initiate the study of these two problems from a theoretical perspective, and obtain several algorithmic and hardness results.

On the algorithmic side, we first design an \(O(n^2)\) time exact algorithm and an \(O(n\log n)\) time 2-approximation for the 2-MCS-(\(\mathbb {R}\)) problem, i.e., the case where the points are located on the real line. Moreover, we show that the exact algorithm also extends to the case when the points are located on the circumference of a circle. Next, we design an \(O(r^2)\) time online algorithm for the 2-MCS-(\(\mathbb {R}\)) problem, where n is the number of points and \(r<n\) is an integer. Finally, we give a PTAS for the k-MSS-(\(\mathbb {R}^2\)) problem. On the hardness side, we show that both the 2-MCS and 2-MSS problems are NP-complete on graphs. Additionally, the problems are W[2]-hard parameterized by the size k of the solution. For points in the Euclidean plane, we show that the 2-MSS problem is contained in W[1]. Finally, we show a lower bound of \(\varOmega (\sqrt{n})\) bits for the storage of any (randomized) algorithm which solves both 2-MCS-(\(\mathbb {R}\)) and 2-MSS-(\(\mathbb {R}\)).
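
The abstract does not describe how the \(O(n^2)\) exact algorithm works, so the following is not that algorithm. It is an exhaustive-search baseline for 2-MCS on the real line, exponential in n, that may be useful for checking a faster implementation on tiny inputs. The function names, the red/blue naming of the two classes, and the rule that distance ties favor a point's own class are assumptions for illustration.

```python
from itertools import combinations


def is_consistent_1d(red, blue, red_sub, blue_sub):
    """Check that every point is at least as close to its own chosen subset
    as to the other class's chosen subset (ties favor the own class)."""
    for pts, own_sub, other_sub in ((red, red_sub, blue_sub),
                                    (blue, blue_sub, red_sub)):
        for p in pts:
            own = min(abs(p - q) for q in own_sub)
            other = min(abs(p - q) for q in other_sub)
            if own > other:
                return False
    return True


def brute_force_2mcs_line(red, blue):
    """Return a minimum-size consistent pair (red_sub, blue_sub) by trying
    all subset pairs in order of increasing total size. Exponential time."""
    if not red or not blue:
        raise ValueError("both classes must be non-empty")
    n = len(red) + len(blue)
    for total in range(2, n + 1):
        for r in range(1, total):
            b = total - r
            if r > len(red) or b > len(blue):
                continue
            for red_sub in combinations(red, r):
                for blue_sub in combinations(blue, b):
                    if is_consistent_1d(red, blue, red_sub, blue_sub):
                        return list(red_sub), list(blue_sub)
    # Unreachable: the full sets are always consistent under the tie rule.
    raise RuntimeError("no consistent subset found")
```

On two well-separated clusters such as red `[0, 1, 2]` and blue `[5, 6]`, one representative per class suffices, whereas on the alternating instance red `[0, 2]`, blue `[1, 3]` every point has to be kept.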


  1. Lokshtanov, D., Marx, D., Saurabh, S.: Lower bounds based on the exponential time hypothesis. Bull. EATCS 105, 41–72 (2011)
  2. Kushilevitz, E., Nisan, N.: Communication Complexity. Cambridge University Press, Cambridge (1997)
  3. Wilfong, G.T.: Nearest neighbor problems. Int. J. Comput. Geom. Appl. 2(4), 383–416 (1992)
  4. Levinson, S.E.: Structural methods in automated speech recognition. Proc. IEEE 73(11), 1625–1650 (1985)
  5. Hart, P.E.: The condensed nearest neighbor rule. IEEE Trans. Inf. Theory 14(3), 515–516 (1968)
  6. Tappert, C.C., Suen, C.Y., Wakahara, T.: The state of the art in online handwriting recognition. IEEE Trans. Pattern Anal. Mach. Intell. 12(8), 787–808 (1990)
  7. Gates, G.W.: The reduced nearest neighbour rule. IEEE Trans. Inf. Theory 18(3), 431–433 (1972)
  8. Masuyama, S., Ibaraki, T., Hasegawa, T.: The computational complexity of the m-center problems in the plane. IEEE Trans. IECE Jpn. 64(2), 57–64 (1981)
  9. Agarwal, P., Pach, J., Sharir, M.: State of the union (of geometric objects). In: Goodman, J., Pach, J., Pollack, R. (eds.) Surveys in Discrete and Computational Geometry: Twenty Years Later. Contemporary Mathematics, vol. 453, pp. 9–48 (2008)
  10. Mustafa, N.H., Ray, S.: PTAS for geometric hitting set problem. In: Proceedings of the 27th ACM Symposium on Computational Geometry, pp. 17–22 (2009)
  11. Flum, J., Grohe, M.: Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, Heidelberg (2006)
  12. Ritter, G.L., Woodruff, H.B., Lowry, S.R., Isenhour, T.L.: An algorithm for a selective nearest neighbor rule. IEEE Trans. Inf. Theory 21, 665–669 (1975)
  13. Agarwal, P.K., Sharir, M.: Red-blue intersection detection algorithms, with applications to motion planning and collision detection. SIAM J. Comput. 19(2), 297–321 (1990)
  14. Arkin, E.M., Díaz-Báñez, J.M., Hurtado, F., Kumar, P., Mitchell, J.S.B., Palop, B., Pérez-Lantero, P., Saumell, M., Silveira, R.I.: Bichromatic 2-center of pairs of points. Comput. Geom. 48(2), 94–107 (2015)

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Indian Statistical Institute, Kolkata, India
  2. Ben-Gurion University of the Negev, Beersheba, Israel
  3. Department of Computer Science, University of Warwick, Coventry, UK
