Analyzing the Effect of Database Dimensionality on Performance of Adaptive Apriori Algorithm

Patil, Shubhangi D.; Deshmukh, Ratnadeep R.; Kirange, Dnyaneshwar K.; Waghmare, Swapnil

doi:10.1007/978-981-13-1423-0_20

Part of the book series: Communications in Computer and Information Science (CCIS, volume 876)

Included in the following conference series:

International Conference on Smart Trends for Information Technology and Computer Communications

Abstract

Mining frequent itemsets from a dataset is one of the most promising areas of data mining, and the Apriori algorithm is one of the most important algorithms for this task. However, Apriori performs poorly in terms of both running time and the number of database scans it requires. This paper therefore proposes an improved version of Apriori that is more efficient than the original algorithm in both respects. It is well known that the size of the database used to define candidates has a great effect on running time and memory requirements. We demonstrate the usefulness of the adaptive Apriori algorithm with respect to the dimensionality of the dataset and present experimental results showing that the proposed algorithm always outperforms Apriori. To evaluate its performance, we tested it on a database of faculty evaluations by students in Turkey.
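
For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the classic level-wise Apriori procedure, not the adaptive variant proposed in the paper (whose details are not given on this page). The function name apriori, the min_count parameter, and the toy transactions are assumptions made for the example. The sketch also counts full passes over the data, since the number of database scans is the cost the abstract highlights.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Classic level-wise Apriori (illustrative baseline, not the paper's method).

    transactions: iterable of item collections.
    min_count: minimum absolute support (number of transactions).
    Returns (dict mapping frozenset -> support count, number of full scans).
    """
    transactions = [frozenset(t) for t in transactions]
    scans = 0

    # Scan 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    scans += 1

    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        if not candidates:
            break

        # One full database scan per level to count the candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        scans += 1

        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1

    return frequent, scans


# Toy data, invented for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
frequent, scans = apriori(transactions, min_count=3)
```

On this toy input the procedure needs one scan per itemset size it considers, which is why the scan count grows with the length of the longest candidate itemset. More attributes (higher dimensionality) also mean more candidates to generate and count at each level.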

Author information

Authors and Affiliations

Department of Information Technology, Government Polytechnic, Jalgaon, India
Shubhangi D. Patil
Department of Computer Science and IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India
Ratnadeep R. Deshmukh & Swapnil Waghmare
Department of Computer Engineering, J. T. Mahajan College of Engineering, Faizpur, Jalgaon, India
Dnyaneshwar K. Kirange

Corresponding author

Correspondence to Shubhangi D. Patil.

Editor information

Editors and Affiliations

SKNCOE, Pune, Maharashtra, India
A.V. Deshpande
Department of Mechanical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
Aynur Unal
Department of Mathematics and Computer Science, Laurentian University, Sudbury, Ontario, Canada
Kalpdrum Passi
Namibia University of Science and Technology, Windhoek, Namibia
Dharm Singh
IT Buzz Limited, Dagenham, United Kingdom
Malaya Nayak
Yudiz Solutions, Ahmedabad, India
Bharat Patel
Sinhgad Group of Institutions, Pune, India
Shafi Pathan

About this paper

Cite this paper

Patil, S.D., Deshmukh, R.R., Kirange, D.K., Waghmare, S. (2018). Analyzing the Effect of Database Dimensionality on Performance of Adaptive Apriori Algorithm. In: Deshpande, A., et al. Smart Trends in Information Technology and Computer Communications. SmartCom 2017. Communications in Computer and Information Science, vol 876. Springer, Singapore. https://doi.org/10.1007/978-981-13-1423-0_20

DOI: https://doi.org/10.1007/978-981-13-1423-0_20
Published: 21 August 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1422-3
Online ISBN: 978-981-13-1423-0
eBook Packages: Computer Science, Computer Science (R0)
