Permutation-Based Pruning for Approximate K-NN Search

  • Hisham Mohamed
  • Stéphane Marchand-Maillet
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8055)


In this paper, we propose effective indexing and search algorithms for approximate K-NN search, based on an enhanced implementation of the Metric Suffix Array and Permutation-Based Indexing. Our main contribution is a sound, scalable strategy for pruning objects based on the location of the reference objects in the query ordered lists. We study the performance and efficiency of our algorithms on a large-scale dataset of millions of documents. Experimental results show a decrease in computational time while preserving the quality of the results.
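The abstract does not spell out the pruning strategy or the Metric Suffix Array layout, so the following is only a minimal sketch of generic permutation-based indexing as it is commonly described, not the authors' algorithm. Each object is represented by the order in which it "sees" a set of reference objects, and candidates are ranked by how similar their ordering is to the query's. The function names, the Euclidean distance, the Spearman footrule comparison, and the `n_candidates` parameter are all assumptions made for illustration.

```python
import math
import random


def permutation(obj, refs):
    """Return the indices of the reference objects, nearest to obj first."""
    return sorted(range(len(refs)), key=lambda i: math.dist(obj, refs[i]))


def footrule(perm_a, perm_b):
    """Spearman footrule: total displacement of each reference between two orderings."""
    pos_b = {ref: rank for rank, ref in enumerate(perm_b)}
    return sum(abs(rank - pos_b[ref]) for rank, ref in enumerate(perm_a))


def approx_knn(query, data, refs, index, k, n_candidates):
    """Approximate K-NN: filter by permutation similarity, then refine by true distance."""
    q_perm = permutation(query, refs)
    # Rank all objects by how closely their ordering matches the query's ordering.
    ranked = sorted(range(len(data)), key=lambda i: footrule(q_perm, index[i]))
    # Keep only the most promising objects (a fixed cut-off, assumed here).
    candidates = ranked[:n_candidates]
    # Compute true distances only for the surviving candidates.
    candidates.sort(key=lambda i: math.dist(query, data[i]))
    return candidates[:k]


# Hypothetical toy data: 1000 random 2-D points and 8 reference objects.
random.seed(0)
data = [(random.random(), random.random()) for _ in range(1000)]
refs = random.sample(data, 8)
index = [permutation(x, refs) for x in data]  # one ordered list per object

query = (0.5, 0.5)
result = approx_knn(query, data, refs, index, k=5, n_candidates=50)
```

In this sketch, true distances are computed for only `n_candidates` objects instead of the whole collection, which is where the time saving of a filter-and-refine scheme comes from. Setting `n_candidates` to the collection size makes the search exact.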


Keywords: Metric Suffix Array (MSA), Permutation-Based Indexing, Approximate Similarity Search, Large-Scale Multimedia Indexing





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Hisham Mohamed (1)
  • Stéphane Marchand-Maillet (1)
  1. Université de Genève, Geneva, Switzerland
