Parallel Mining of Frequent Itemsets from Memory-Mapped Files

Anuradha, T.

doi:10.1007/978-981-10-6620-7_43

Conference paper
First Online: 04 October 2017

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 654)

Abstract

As data in many fields are digitized, data volumes are growing rapidly. Mining such large volumes raises two major issues. The first is the sheer size of the data: serial algorithms may take a very long time or may fail to complete, so parallel algorithms are needed. The second is I/O overhead, which can be reduced by memory-mapping the input files. This chapter combines parallelization and memory mapping of files in mining frequent itemsets. Our experiments showed almost 20% more speedup when our frequent itemset mining algorithm was parallelized with memory mapping than with conventional I/O without memory mapping.
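
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, which is not shown in this preview. It assumes a plain-text transaction file with one transaction per line and items separated by whitespace. The function names (chunk_bounds, count_chunk, frequent_items) are hypothetical. The sketch covers only the first counting pass of an Apriori-style miner (supports of single items), and it uses Python threads purely to show how a mapped file can be partitioned among workers; CPython threads will not give the multi-core speedup reported above.

```python
import mmap
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(mm, n_chunks):
    """Split the mapped file into byte ranges that end on line boundaries."""
    size = len(mm)
    cuts = {0, size}
    for i in range(1, n_chunks):
        # Move each tentative cut forward to just past the next newline,
        # or to the end of the file if no newline follows.
        nl = mm.find(b"\n", i * size // n_chunks)
        cuts.add(size if nl == -1 else nl + 1)
    edges = sorted(cuts)
    return list(zip(edges[:-1], edges[1:]))


def count_chunk(mm, start, end):
    """Count, for one byte range, how many transactions contain each item."""
    counts = Counter()
    # Slicing copies this chunk out of the mapping as a bytes object.
    for line in mm[start:end].splitlines():
        counts.update(set(line.split()))  # set(): count an item once per transaction
    return counts


def frequent_items(path, min_support, n_workers=2):
    """Return {item: support} for items found in at least min_support transactions."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        bounds = chunk_bounds(mm, n_workers)
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            partials = pool.map(lambda b: count_chunk(mm, *b), bounds)
            # Merge the per-chunk counts into global supports.
            total = sum(partials, Counter())
    return {item.decode(): n for item, n in total.items() if n >= min_support}
```

For a file holding the four transactions "a b c", "a b", "a c" and "b d", calling frequent_items(path, 2) returns supports of 3 for a, 3 for b and 2 for c. Each worker reads its own byte range of the same mapping, so the file is never copied whole through explicit read calls.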

Author information

Authors and Affiliations

Department of ECM, KL University, Guntur, India
T. Anuradha

Editor information

Editors and Affiliations

Jagan Institute of Management Studies, New Delhi, Delhi, India
V. B. Aggarwal
Department of Computer Science, University of Delhi, New Delhi, Delhi, India
Vasudha Bhatnagar
Microsoft Innovation Centre, Sri Aurobindo Institute of Technology, Indore, Madhya Pradesh, India
Durgesh Kumar Mishra

About this paper

Cite this paper

Anuradha, T. (2018). Parallel Mining of Frequent Itemsets from Memory-Mapped Files. In: Aggarwal, V., Bhatnagar, V., Mishra, D. (eds) Big Data Analytics. Advances in Intelligent Systems and Computing, vol 654. Springer, Singapore. https://doi.org/10.1007/978-981-10-6620-7_43

DOI: https://doi.org/10.1007/978-981-10-6620-7_43
Published: 04 October 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6619-1
Online ISBN: 978-981-10-6620-7
eBook Packages: Engineering, Engineering (R0)
