Abstract
Mining frequent itemsets from a dataset is one of the most promising areas of data mining, and the Apriori algorithm is one of the most important algorithms for this task. However, Apriori performs poorly in terms of both the time required and the number of database scans. Hence, this paper proposes an improved version of Apriori that requires less time and fewer database scans than the original algorithm. It is well known that the size of the database used for defining candidates has a great effect on running time and memory requirements. We demonstrate the usefulness of the adaptive Apriori algorithm with respect to the dimensionality of the dataset. We present experimental results showing that the proposed algorithm consistently outperforms Apriori. To evaluate the performance of the proposed algorithm, we tested it on the Turkey student database of faculty evaluations.
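For readers unfamiliar with the baseline, the following is a minimal sketch of the classic level-wise Apriori procedure, written to make the one-scan-per-level cost visible. It is not the adaptive variant proposed in the paper. The function name `apriori`, the `min_support` parameter (taken here as an absolute count), and the returned scan counter are illustrative assumptions rather than anything defined in the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classic level-wise Apriori (baseline sketch, not the paper's adaptive variant).

    Returns (frequent, scans): a dict mapping each frequent itemset (frozenset)
    to its support count, and the number of full passes over the transactions.
    min_support is assumed to be an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]
    scans = 0

    # Pass 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    scans += 1

    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2

    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        if not candidates:
            break

        # One full database scan per level to count candidate support.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        scans += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent, scans
```

Each level of candidate itemsets costs one full pass over the data, so the number of scans grows with the length of the longest candidate itemset. This is the cost that improved variants of Apriori aim to reduce.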
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Patil, S.D., Deshmukh, R.R., Kirange, D.K., Waghmare, S. (2018). Analyzing the Effect of Database Dimensionality on Performance of Adaptive Apriori Algorithm. In: Deshpande, A., et al. Smart Trends in Information Technology and Computer Communications. SmartCom 2017. Communications in Computer and Information Science, vol 876. Springer, Singapore. https://doi.org/10.1007/978-981-13-1423-0_20
DOI: https://doi.org/10.1007/978-981-13-1423-0_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1422-3
Online ISBN: 978-981-13-1423-0
eBook Packages: Computer Science, Computer Science (R0)