DCE-Miner: an association rule mining algorithm for multimedia based on the MapReduce framework

Chengyan Li, Shixiang Feng, Guanglu Sun

doi:10.1007/s11042-019-08361-y

Published: 07 June 2020

Volume 79, pages 16771–16793, (2020)
Multimedia Tools and Applications

Abstract

The amount of multimedia data has grown rapidly because of improvements in data collection and storage technologies. Association rule mining (ARM) is a data mining technique widely used to extract useful information from data warehouses. In real-world big data applications, fast and effective data mining algorithms are becoming increasingly valuable. In this paper, we propose DCE-Miner, a fast association rule mining algorithm with low memory requirements based on the MapReduce framework. In the precomputation phase, we split large datasets into equal-sized smaller ones using a data division method. In the frequent k-itemset mining phase, the mappers read the small datasets and distribute the data to reducers based on the closed-set characteristics associated with each partition. The reducers use bitmaps to accelerate computation and store the possible frequent 2-itemsets to reduce later computation. Extensive experimental results show that on large-scale datasets with up to 40 million transactions, DCE-Miner performs better than current algorithms and is more robust to changes in dataset size and support level.
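
The two ideas named in the abstract, equal-sized data division and bitmap-based support counting for 2-itemsets, can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the function names (`split_equal`, `build_bitmaps`, `frequent_pairs`), the use of Python integers as bitmaps, and the toy image-tag transactions are all assumptions made for illustration, and the MapReduce distribution step and closed-set partitioning are left out entirely.

```python
from itertools import combinations


def split_equal(transactions, n_parts):
    """Divide a list into n_parts chunks whose sizes differ by at most one."""
    size, extra = divmod(len(transactions), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(transactions[start:end])
        start = end
    return chunks


def build_bitmaps(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    bitmaps = {}
    for t, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << t)
    return bitmaps


def frequent_pairs(transactions, min_count):
    """Return {(a, b): support} for item pairs occurring in >= min_count transactions."""
    bitmaps = build_bitmaps(transactions)
    # Only items that are frequent on their own can form a frequent pair.
    frequent_items = sorted(
        item for item, bits in bitmaps.items() if bin(bits).count("1") >= min_count
    )
    result = {}
    for a, b in combinations(frequent_items, 2):
        # A bitwise AND keeps exactly the transactions containing both items.
        support = bin(bitmaps[a] & bitmaps[b]).count("1")
        if support >= min_count:
            result[(a, b)] = support
    return result


# Hypothetical image-tag transactions, used only to exercise the sketch.
tags = [{"sky", "cloud"}, {"sky", "cloud", "sea"}, {"sky", "sea"}, {"cloud"}]
print([len(chunk) for chunk in split_equal(tags, 2)])  # [2, 2]
print(frequent_pairs(tags, 2))  # {('cloud', 'sky'): 2, ('sea', 'sky'): 2}
```

In a distributed setting, counts computed on a single partition are only local, so a real system would still need to combine or verify them globally. The sketch only shows why a bitmap makes the support of a pair a single AND followed by a bit count.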

Acknowledgements

Supported by the Fundamental Research Foundation for Universities of Heilongjiang Province (JMRH2018XM04) and the Natural Science Foundation of Heilongjiang Province of China (LC2018030).

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, 150080, China
LI Chengyan, Shixiang FENG & Guanglu SUN

Corresponding author

Correspondence to Guanglu SUN.

About this article

Cite this article

Li, C., Feng, S. & Sun, G. DCE-Miner: an association rule mining algorithm for multimedia based on the MapReduce framework. Multimed Tools Appl 79, 16771–16793 (2020). https://doi.org/10.1007/s11042-019-08361-y

Received: 09 January 2019
Revised: 22 August 2019
Accepted: 09 October 2019
Published: 07 June 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s11042-019-08361-y
