Parallel Mining of Frequent Itemsets from Memory-Mapped Files

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 654)

Abstract

Digitization across many fields is making data grow by leaps and bounds. Mining such large volumes of data raises two major issues. The first is the sheer size of the data, which can be addressed with parallel algorithms, since serial algorithms may take a very long time or may fail to complete. The second is I/O overhead, which can be reduced by memory mapping the files. This chapter combines parallelization and memory-mapped files for mining frequent itemsets. Our experiments showed almost 20% more speedup when the parallel frequent itemset mining algorithm used memory mapping than when it used conventional I/O without memory mapping.
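
The combination described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: it assumes a plain-text input with one transaction per line and whitespace-separated items, covers only the first counting pass (support of single items), and uses hypothetical helper names (`chunk_bounds`, `count_items`, `frequent_items`). A thread pool is used to keep the example self-contained; in CPython, threads illustrate the work partitioning but may not yield a real speedup.

```python
import mmap
import os
import tempfile
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(mm, n_chunks):
    """Split the mapping into byte ranges that end on line boundaries."""
    size = len(mm)
    bounds, start = [], 0
    for i in range(1, n_chunks + 1):
        if start >= size:
            break
        end = size if i == n_chunks else max(start, size * i // n_chunks)
        if end < size:
            # Extend the chunk to the end of the current transaction line.
            nl = mm.find(b"\n", end)
            end = size if nl == -1 else nl + 1
        bounds.append((start, end))
        start = end
    return bounds


def count_items(mm, start, end):
    """Count in how many transactions of mm[start:end] each item occurs."""
    counts = Counter()
    for line in mm[start:end].splitlines():
        counts.update(set(line.split()))  # each item counted once per transaction
    return counts


def frequent_items(path, min_support, n_workers=4):
    """Return items whose support count is at least min_support."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        bounds = chunk_bounds(mm, n_workers)
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            partials = pool.map(lambda b: count_items(mm, *b), bounds)
            total = sum(partials, Counter())  # merge per-chunk counts
    return {item.decode(): c for item, c in total.items() if c >= min_support}


def demo():
    """Run the sketch on a tiny, made-up transaction file."""
    data = b"a b c\na c\nb c d\na b c d\n"
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        return frequent_items(tmp.name, min_support=3, n_workers=2)
    finally:
        os.remove(tmp.name)
```

Each worker reads its slice directly from the mapped region rather than issuing its own read calls, and the per-chunk counts are merged at the end. Later passes over larger candidate itemsets would follow the same partition-and-merge pattern.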

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Anuradha, T. (2018). Parallel Mining of Frequent Itemsets from Memory-Mapped Files. In: Aggarwal, V., Bhatnagar, V., Mishra, D. (eds) Big Data Analytics. Advances in Intelligent Systems and Computing, vol 654. Springer, Singapore. https://doi.org/10.1007/978-981-10-6620-7_43

  • DOI: https://doi.org/10.1007/978-981-10-6620-7_43

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6619-1

  • Online ISBN: 978-981-10-6620-7

  • eBook Packages: Engineering, Engineering (R0)