Locality Preserving Scheme of Text Databases Representative in Distributed Information Retrieval Systems

  • Conference paper
Networked Digital Technologies (NDT 2010)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 88)

Abstract

This paper proposes an efficient and effective "Locality Preserving Mapping" scheme that allows text database representatives to be mapped onto a global information retrieval system, such as a Peer-to-Peer Information Retrieval (P2PIR) system. The approach relies on Locality Sensitive Hashing (LSH) functions and approximate min-wise independent permutations. An experimental evaluation over real data, together with a comparison of the proposed schemes under different parameters, is presented to show their performance advantages.
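As a rough illustration of the min-wise hashing idea the abstract refers to, the sketch below builds a short signature for a set of terms so that similar term sets tend to share signature entries. It is a generic MinHash example, not the authors' scheme: the function names (`term_id`, `make_permutations`, `minhash_signature`, `estimate_resemblance`), the use of random linear maps modulo a prime as stand-ins for approximate min-wise independent permutations, and the sample term sets are all assumptions made for illustration.

```python
import hashlib
import random

# Assumption: a large Mersenne prime as the modulus for the linear maps.
PRIME = (1 << 61) - 1


def term_id(term):
    """Map a term to a stable 32-bit integer (hypothetical helper)."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def make_permutations(n, seed=0):
    """Draw n random linear maps x -> (a*x + b) mod PRIME.

    These only approximate min-wise independent permutations.
    """
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]


def minhash_signature(terms, perms):
    """Return one minimum per linear map over a non-empty collection of terms."""
    ids = [term_id(t) for t in terms]
    return [min((a * x + b) % PRIME for x in ids) for a, b in perms]


def estimate_resemblance(sig_a, sig_b):
    """Fraction of positions where two equal-length signatures agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    perms = make_permutations(128, seed=42)
    db_a = {"retrieval", "peer", "hash", "index", "query", "text"}
    db_b = {"retrieval", "peer", "hash", "index", "image", "video"}
    sig_a = minhash_signature(db_a, perms)
    sig_b = minhash_signature(db_b, perms)
    print("estimated resemblance:", estimate_resemblance(sig_a, sig_b))
    print("exact Jaccard:", len(db_a & db_b) / len(db_a | db_b))
```

In a peer-to-peer setting, signatures of this kind could in principle be used as keys so that similar database representatives land near each other; how the paper actually performs that mapping is not stated in the abstract.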

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hassan, M., Hasan, Y. (2010). Locality Preserving Scheme of Text Databases Representative in Distributed Information Retrieval Systems. In: Zavoral, F., Yaghob, J., Pichappan, P., El-Qawasmeh, E. (eds) Networked Digital Technologies. NDT 2010. Communications in Computer and Information Science, vol 88. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14306-9_17

  • DOI: https://doi.org/10.1007/978-3-642-14306-9_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14305-2

  • Online ISBN: 978-3-642-14306-9

  • eBook Packages: Computer Science, Computer Science (R0)
