Skip to main content

A Fast Indexing Algorithm Optimization with User Behavior Pattern

  • Conference paper
Pervasive Computing and the Networked World (ICPCA/SWS 2012)

Part of the book series: Lecture Notes in Computer Science ((LNCCN,volume 7719))

Abstract

Internet users’ access pattern for objects has been observed to follow Zipf’s law. The preference for network resource is showing strong influence on real-time lookup performance in large-scale distributed systems. In order to guarantee search response rate with limited memory space, we develop a new object indexing and locating algorithm called Bloom filter Arrays based on Zipf’s-distributed user Preference (ZPBA). The algorithm uses a compact data structure to achieve high accuracy in item lookup. We give the theoretical analysis of ZPBA and then conduct experiments with one million item corpus and 100,000 queries to validate our design. Comparison shows that our solution can be 77% more space efficient than traditional bloom filter based index approaches for applications of concentrated user access preference. The algorithm demonstrates practical application potential in fault tolerant large-scale distributed indexing and item lookup.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Chierichetti, F., Kumar, R., Raghavan, P.: Compressed Web Indexes. In: 18th International Conference on World Wide Web, pp. 451–460. Association for Computing Machinery, New York (2009)

    Chapter  Google Scholar 

  2. Sato, I., Nakagawa, H.: Topic Models with Power-law using Pitman-Yor Process. In: 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 673–682. Association for Computing Machinery, New York (2010)

    Chapter  Google Scholar 

  3. Zipf, G.K.: The Psychobiology of Language. Houghton-Mifflin, Boston (1935)

    Google Scholar 

  4. Breslau, L., Pei, C., Li, F., Phillips, G., Shenker, S.: Web Caching and Zipf-like Distributions: Evidence and Implications. In: IEEE Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 126–134. IEEE Press, New York (1999)

    Google Scholar 

  5. Rodriguez, P., Spanner, C., Biersack, E.W.: Analysis of Web Caching Architectures: Hierarchical and Distributed Caching. IEEE/ACM Transactions on Networking 9(4), 404–418 (2001)

    Article  Google Scholar 

  6. Kotera, I., Egawa, R., Takizawa, H., Kobayashi, H.: Modeling of Cache Access Behavior Based on Zipf’s Law. In: 9th Workshop on Memory Performance: Dealing with Applications, Systems and Architecture, pp. 9–15. Association for Computing Machinery, New York (2008)

    Chapter  Google Scholar 

  7. Jelenković, P.R., Kang, X.: Characterizing the Miss Sequence of the LRU Cache. ACM SIGMETRICS Performance Evaluation Review 36(2), 119–121 (2008)

    Article  Google Scholar 

  8. Zorich, V.A., Cooke, R.: Mathematical Analysis. Springer, Berlin (2004)

    MATH  Google Scholar 

  9. Ozsu, M.T., Valduriez, P.: Principles of Distributed Database Systems, 3rd edn. Springer, Berlin (2011)

    Google Scholar 

  10. Stoica, I., Morris, R., Karger, D., Kaashoek, M., Balakrishnan, H.: Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. In: 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 149–160. Association for Computing Machinery, New York (2001)

    Google Scholar 

  11. Ratnasamy, S., Francis, P., Handley, M., Karp, R., Shenker, S.: A Scalable Content-Addressable Network. In: 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 161–172. Association for Computing Machinery, New York (2001)

    Google Scholar 

  12. Weil, S.A., Pollack, K.T., Brandt, S.A., Miller, E.L.: Dynamic Metadata Management for Petabyte-Scale File Systems. In: 2004 ACM/IEEE Conference on Supercomputing, p. 4. IEEE Computer Society, Washington, DC (2004)

    Google Scholar 

  13. Bloom, B.: Space/time Trade-offs in Hash Coding with Allowable Errors. Communications of the ACM 13(7), 422–426 (1970)

    Article  MATH  Google Scholar 

  14. Tarkoma, S.C., Rothenberg, E., Lagerspetz, E.: Theory and Practice of Bloom Filters for Distributed Systems. IEEE Communications Surveys & Tutorials 14(1), 131–155 (2004)

    Article  Google Scholar 

  15. Wang, Z., Luo, T.-J.: Intelligent Video Content Routing in a Direct Access Network. In: 3rd Symposium on Web Society, pp. 147–152. IEEE Press, New York (2011)

    Google Scholar 

  16. Fan, L., Cao, P., Almeida, J., Broder, A.Z.: Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. IEEE/ACM Transactions on Networking 8(3), 281–293 (2000)

    Article  Google Scholar 

  17. Bonomi, F., Mitzenmacher, M., Panigrahy, R., Singh, S., Varghese, G.: An Improved Construction for Counting Bloom Filters. In: Azar, Y., Erlebach, T. (eds.) ESA 2006. LNCS, vol. 4168, pp. 684–695. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  18. Ficara, D., Giordano, S., Procissi, G., Vitucci, F.: Multilayer Compressed Counting Bloom Filters. In: 27th Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 311–315. IEEE Press, New York (2008)

    Google Scholar 

  19. Song, H., Dharmapurikar, S., Turner, J., Lockwood, J.: Fast Hash Table Lookup using Extended Bloom Filter: an Aid to Network Processing. In: 2005 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 181–192. Association for Computing Machinery, New York (2005)

    Google Scholar 

  20. Bonomi, F., Mitzenmacher, M., Panigrah, R., Singh, S., Varghese, G.: Beyond Bloom Filters: from Approximate Membership Checks to Approximate State Machines. In: 2006 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pp. 315–326. Association for Computing Machinery, New York (2006)

    Google Scholar 

  21. Sung, M., Xu, J., Li, J., Li, L.: Large-scale IP Traceback in Highspeed Internet: Practical Techniques and Information-theoretic Foundation. IEEE/ACM Transaction on Networking 16(6), 1253–1266 (2008)

    Article  Google Scholar 

  22. Jokela, P., Zahemszky, A., Esteve, C., Arianfar, S., Nikander, P.: LIPSIN: Line Speed Publish/Subscribe Inter-Networking. In: 2009 ACM SIGCOMM 2009 Conference on Data Communication, pp. 195–206. Association for Computing Machinery, New York (2009)

    Google Scholar 

  23. Zhu, Y., Jiang, H., Wang, J., Xian, F.: HBA: Distributed Metadata Management for Large Cluster-Based Storage Systems. IEEE Transactions on Parallel and Distributed Systems 19(6), 750–763 (2008)

    Article  Google Scholar 

  24. Wang, Z., Luo, T.: Optimizing Hash Function Number for BF-Based Object Locating Algorithm. In: Tan, Y., Shi, Y., Ji, Z. (eds.) ICSI 2012, Part II. LNCS, vol. 7332, pp. 543–552. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  25. Bruck, J., Gao, J., Jiang, A.: Weighted Bloom Filter. In: 2006 IEEE International Symposium on Information Theory, pp. 2304–2308. IEEE Press, New York (2006)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Z., Luo, T., Xu, Y., Cheng, F., Zhang, X., Wang, X. (2013). A Fast Indexing Algorithm Optimization with User Behavior Pattern. In: Zu, Q., Hu, B., Elçi, A. (eds) Pervasive Computing and the Networked World. ICPCA/SWS 2012. Lecture Notes in Computer Science, vol 7719. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37015-1_52

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-37015-1_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37014-4

  • Online ISBN: 978-3-642-37015-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics