Parallel Ordinal Decision Tree Algorithm and Its Implementation in Framework of MapReduce

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 481)

Abstract

Ordinal decision trees (ODTs) can effectively deal with monotonic classification problems. However, it is difficult for existing ordinal decision tree algorithms to learn an ODT from large data sets. To address this problem, this paper presents a parallel processing mechanism in the framework of MapReduce. As in general ordinal decision tree algorithms, rank mutual information (RMI) is used to select the extended attributes. Unlike previous algorithms, however, this paper applies a strategy of attribute parallelization to calculate the RMI. Experiments on large, artificially generated ordered data sets confirm that the proposed algorithm is feasible. The experimental results show that the algorithm is effective and efficient in three respects: speed-up, scale-up and size-up.
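The abstract does not give the RMI formula or the map and reduce functions, so the following is only a minimal sketch of what attribute-parallel RMI computation might look like. It assumes the ascending rank mutual information as it is usually defined in the rank-entropy literature (Hu et al., 2012), RMI(A, D) = -(1/n) Σ_i log( |[x_i]_A| · |[x_i]_D| / (n · |[x_i]_A ∩ [x_i]_D|) ), where [x_i]_A is the set of samples ranked at or above x_i on attribute A. The function names (`rank_mutual_information`, `map_attribute`, `reduce_best`, `select_attribute`) are hypothetical, a thread pool stands in for the MapReduce cluster, and candidate split points are ignored for brevity.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rank_mutual_information(values, labels):
    """Ascending RMI between one attribute column and the ordinal labels.

    Assumed definition (not stated in the abstract): for each sample i,
    compare the set of samples ranked at or above i on the attribute
    with the set ranked at or above i on the label.
    """
    n = len(values)
    total = 0.0
    for i in range(n):
        ge_attr = {j for j in range(n) if values[j] >= values[i]}
        ge_label = {j for j in range(n) if labels[j] >= labels[i]}
        both = ge_attr & ge_label  # always contains i, so never empty
        total += math.log2(len(ge_attr) * len(ge_label) / (n * len(both)))
    return -total / n


def map_attribute(task):
    """Map step (hypothetical): one task per attribute, emits (index, RMI)."""
    index, column, labels = task
    return index, rank_mutual_information(column, labels)


def reduce_best(pairs):
    """Reduce step (hypothetical): keep the attribute with the largest RMI."""
    return max(pairs, key=lambda pair: pair[1])


def select_attribute(rows, labels, workers=4):
    """Score every attribute independently, then pick the best one."""
    columns = list(zip(*rows))  # one tuple of values per attribute
    tasks = [(k, column, labels) for k, column in enumerate(columns)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(map_attribute, tasks))
    return reduce_best(scored), dict(scored)


# Toy data: attribute 0 is monotone with the label, attribute 1 is constant.
rows = [(1, 5), (2, 5), (3, 5), (4, 5)]
labels = [0, 0, 1, 1]
best, scores = select_attribute(rows, labels)
```

On this toy data the monotone attribute scores 0.5 and the constant attribute scores 0, so attribute 0 is selected. Because each attribute is scored independently of the others, the map tasks share no state, which is what makes attribute-level parallelization straightforward.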

Author information

Corresponding author

Correspondence to Junhai Zhai.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, S., Zhai, J., Zhu, H., Wang, X. (2014). Parallel Ordinal Decision Tree Algorithm and Its Implementation in Framework of MapReduce. In: Wang, X., Pedrycz, W., Chan, P., He, Q. (eds) Machine Learning and Cybernetics. ICMLC 2014. Communications in Computer and Information Science, vol 481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45652-1_25

  • DOI: https://doi.org/10.1007/978-3-662-45652-1_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45651-4

  • Online ISBN: 978-3-662-45652-1

  • eBook Packages: Computer Science, Computer Science (R0)
