Mining Perfectly Rare Itemsets on Big Data: An Approach Based on Apriori-Inverse and MapReduce

  • Conference paper
Intelligent Systems Design and Applications (ISDA 2016)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 557)

Abstract

Association rule mining is one of the most common data mining techniques used to identify and describe interesting relationships between patterns in large quantities of data. Whereas much research has focused on extracting patterns that appear frequently, in order to obtain general information, in some scenarios it is also interesting to extract unexpected phenomena. Rare association rule mining is a recent field that aims to discover sporadic rules, that is, rules with a low frequency of appearance but a high confidence that their items occur together. This field is particularly useful on Big Data, where abnormal behavior is often more interesting than normal behavior. In this sense, our aim is to propose a new algorithm to obtain rare association rules on Big Data using MapReduce by means of Spark and Hadoop. The experimental study includes more than 30 datasets and reveals promising efficiency results when more than 60,000 million instances and file sizes of 500 GBytes are considered.
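To make the notion of a perfectly rare itemset concrete, the following is a minimal single-machine sketch in the spirit of Apriori-Inverse. It is not the algorithm proposed in the paper: it assumes the usual reading that an itemset is perfectly rare when every one of its items has support below a maximum threshold, and it counts support in one process rather than with MapReduce. The function name, the `max_support` and `min_count` parameters, and the toy transactions are all hypothetical.

```python
from itertools import combinations


def perfectly_rare_itemsets(transactions, max_support, min_count=1):
    """Level-wise search over items whose own support is below max_support.

    Returns a dict mapping each itemset (frozenset) to its absolute count.
    Illustrative sketch only; thresholds and names are assumptions.
    """
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def count(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in tx if itemset <= t)

    # Level 1: keep items that are rare, but not so rare that they are noise.
    level = {}
    for item in sorted({i for t in tx for i in t}):
        single = frozenset([item])
        c = count(single)
        if c >= min_count and c / n < max_support:
            level[single] = c

    result = dict(level)
    k = 2
    while level:
        # Join step: merge surviving (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        next_level = {}
        for cand in candidates:
            # Prune step: every (k-1)-subset must itself have survived.
            if all(frozenset(s) in level for s in combinations(cand, k - 1)):
                c = count(cand)
                if c >= min_count:
                    next_level[cand] = c
        result.update(next_level)
        level = next_level
        k += 1
    return result


# Toy data: bread and milk are frequent, caviar and truffle are rare.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "caviar", "truffle"},
    {"bread", "caviar", "truffle"},
    {"milk"},
    {"bread", "milk"},
    {"bread"},
    {"milk", "bread"},
]

rare = perfectly_rare_itemsets(transactions, max_support=0.3, min_count=2)

# Confidence of the rule caviar -> truffle: count(both) / count(caviar).
both = frozenset({"caviar", "truffle"})
confidence = rare[both] / rare[frozenset({"caviar"})]
```

On this toy data the frequent items are discarded at the first level, so only itemsets built from `caviar` and `truffle` remain, and the rule from one to the other has low support but high confidence. Because a superset can never be more frequent than its subsets, the maximum-support check is only needed for single items.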


Acknowledgments

This research was supported by the Spanish Ministry of Economy and Competitiveness, project TIN-2014-55252-P, and by FEDER funds.

Author information

Corresponding author

Correspondence to S. Ventura.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Padillo, F., Luna, J.M., Ventura, S. (2017). Mining Perfectly Rare Itemsets on Big Data: An Approach Based on Apriori-Inverse and MapReduce. In: Madureira, A., Abraham, A., Gamboa, D., Novais, P. (eds) Intelligent Systems Design and Applications. ISDA 2016. Advances in Intelligent Systems and Computing, vol 557. Springer, Cham. https://doi.org/10.1007/978-3-319-53480-0_50

  • DOI: https://doi.org/10.1007/978-3-319-53480-0_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-53479-4

  • Online ISBN: 978-3-319-53480-0

  • eBook Packages: Engineering, Engineering (R0)