Efficient Context-Aware K-Nearest Neighbor Search

  • Mostafa Haghir Chehreghani
  • Morteza Haghir Chehreghani (corresponding author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10772)

Abstract

We develop a context-sensitive, linear-time K-nearest neighbor search method in which the test object and its neighborhood (in the training dataset) are required to share a similar structure, established via bilateral relations. Our approach in particular handles two types of irregularity: (i) the (test) object is an outlier, i.e., it does not belong to any of the existing structures in the (training) dataset, and (ii) the structures (e.g., classes) in the dataset have diverse densities. Instead of aiming to capture the correct underlying structure of the whole dataset, we extract the correct structure in the neighborhood of the test object, which makes our search strategy computationally efficient. We investigate the performance of our method on a variety of real-world datasets and demonstrate its superior performance compared to the alternatives.
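The bilateral-relation idea can be illustrated with a minimal mutual k-nearest-neighbor sketch: a training point counts as a neighbor of the test object only if the test object would, in turn, fall within that point's own k-neighborhood. This is an illustrative simplification under a plain Euclidean metric, not the paper's exact algorithm; the function name `mutual_knn` is a hypothetical choice.

```python
import numpy as np

def mutual_knn(X_train, x_test, k):
    """Return indices of training points that are among the k nearest
    neighbors of x_test AND would count x_test within their own
    k-neighborhood (a bilateral, i.e. mutual, relation)."""
    # Distances from the test object to every training point.
    d = np.linalg.norm(X_train - x_test, axis=1)
    candidates = np.argsort(d)[:k]  # k nearest training points to x_test
    mutual = []
    for i in candidates:
        # Distances from candidate i to all other training points.
        d_i = np.linalg.norm(X_train - X_train[i], axis=1)
        d_i[i] = np.inf  # exclude the candidate itself
        kth = np.partition(d_i, k - 1)[k - 1]  # its k-th nearest distance
        if d[i] <= kth:  # x_test lies within the candidate's k-neighborhood
            mutual.append(int(i))
    return mutual
```

For an outlier test object far from every structure, the bilateral check leaves the neighbor set empty, which matches the first irregularity the abstract describes: the test point's nearest training points do not reciprocate the relation.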

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Mostafa Haghir Chehreghani (1)
  • Morteza Haghir Chehreghani (2, corresponding author)
  1. Telecom ParisTech, Paris, France
  2. Chalmers University of Technology, Gothenburg, Sweden