Analyzing the Effect of Database Dimensionality on Performance of Adaptive Apriori Algorithm

  • Conference paper
Smart Trends in Information Technology and Computer Communications (SmartCom 2017)

Abstract

Mining frequent itemsets from a dataset is one of the most promising areas of data mining, and Apriori is one of the most important algorithms for the task. However, Apriori performs poorly in terms of both running time and the number of database scans it requires. This paper therefore proposes an improved version of Apriori that is more efficient than the original algorithm on both counts. It is well known that the size of the database used for defining candidates has a great effect on running time and memory requirements. We demonstrate the usefulness of the adaptive Apriori algorithm with respect to the dimensionality of the dataset. We present experimental results showing that the proposed algorithm always outperforms Apriori. To evaluate its performance, we tested it on a database of faculty evaluations by students in Turkey.
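
For readers unfamiliar with the baseline, the following is a minimal sketch of the classic level-wise Apriori procedure, not the adaptive variant proposed in the paper, whose details the abstract does not give. The function name `apriori`, the absolute `min_support` count, and the toy transactions are assumptions made for illustration. The sketch also counts full passes over the data, since the number of database scans is one of the costs the abstract highlights.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Classic Apriori (illustrative sketch).

    Returns ({itemset: support_count}, number_of_database_scans).
    min_support is an absolute transaction count (an assumption here).
    """
    transactions = [frozenset(t) for t in transactions]
    scans = 0

    # Scan 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    scans += 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        if not candidates:
            break

        # One full database scan per level to count the candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        scans += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent, scans


# Toy data (hypothetical): five transactions over three items.
demo_transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
demo_frequent, demo_scans = apriori(demo_transactions, min_support=3)
```

In this sketch each level costs one complete pass over the transactions, and the number of candidates can grow quickly with the number of distinct items, which is why both scan count and dataset dimensionality matter for running time.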

Author information

Corresponding author

Correspondence to Shubhangi D. Patil.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Patil, S.D., Deshmukh, R.R., Kirange, D.K., Waghmare, S. (2018). Analyzing the Effect of Database Dimensionality on Performance of Adaptive Apriori Algorithm. In: Deshpande, A., et al. Smart Trends in Information Technology and Computer Communications. SmartCom 2017. Communications in Computer and Information Science, vol 876. Springer, Singapore. https://doi.org/10.1007/978-981-13-1423-0_20

  • DOI: https://doi.org/10.1007/978-981-13-1423-0_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1422-3

  • Online ISBN: 978-981-13-1423-0

  • eBook Packages: Computer Science (R0)
