MapReduce Example with HBase for Association Rule

  • Conference paper
Future Information Technology

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 276)

Abstract

This paper illustrates how to store and compute association sets of big transaction data using Hadoop and HBase, and then presents experimental results for a MapReduce algorithm that uses HBase to find associations in transaction data. The algorithm is a Market Basket Analysis algorithm, an application of Association Rule mining in Business Intelligence. It sorts the transaction data held in HBase, converts it to a data set of (key, value) pairs, and stores the associated data back in HBase. The algorithm and HBase run on the Amazon EC2 service, deployed using Apache Whirr. The experimental results show that performance improves as more nodes are added, up to a certain number of transactions. However, the system loses control and connection when there is too much I/O, which occurs with more than 3.5 million transactions in HBase.
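
The sort-and-convert step described in the abstract can be illustrated with a minimal in-memory sketch. It is not the authors' implementation and uses no Hadoop or HBase API. It assumes that keys are pairs of items from the same transaction and that values are counts. The function names and sample transactions are hypothetical. Sorting each transaction first means that the same two items always produce the same key, whatever order they were recorded in.

```python
from collections import defaultdict
from itertools import combinations


def map_transaction(items):
    """Map step (assumed design): sort the items, then emit ((a, b), 1) per pair."""
    # Sorting makes ("beer", "cracker") and ("cracker", "beer") the same key.
    # Dropping duplicate items within one transaction is an assumption of this sketch.
    ordered = sorted(set(items))
    for pair in combinations(ordered, 2):
        yield pair, 1


def reduce_counts(pair, counts):
    """Reduce step: sum the counts collected for one item pair."""
    return pair, sum(counts)


def market_basket(transactions):
    """Simulate the shuffle phase by grouping mapped values by key, then reduce."""
    grouped = defaultdict(list)
    for items in transactions:
        for key, value in map_transaction(items):
            grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical sample transactions, for illustration only.
transactions = [
    ["cracker", "icecream", "beer"],
    ["chicken", "pizza", "coke", "bread"],
    ["beer", "cracker"],
]
pair_counts = market_basket(transactions)
```

In a real deployment the grouping would be done by the MapReduce framework and the input and output would be HBase tables; this sketch only shows the shape of the (key, value) data.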

Author information

Corresponding author

Correspondence to Jongwook Woo.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Woo, J., Lee, K. (2014). MapReduce Example with HBase for Association Rule. In: Park, J., Stojmenovic, I., Choi, M., Xhafa, F. (eds) Future Information Technology. Lecture Notes in Electrical Engineering, vol 276. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40861-8_8

  • DOI: https://doi.org/10.1007/978-3-642-40861-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40860-1

  • Online ISBN: 978-3-642-40861-8

  • eBook Packages: Engineering, Engineering (R0)
