A unified distributed ELM framework with supervised, semi-supervised and unsupervised big data learning

  • Zhiqiong Wang
  • Luxuan Qu
  • Junchang Xin
  • Hongxu Yang
  • Xiaosong Gao
Regular Research Paper

Abstract

The extreme learning machine (ELM) and its variants have been widely used in many fields because of their good generalization performance and fast learning speed. Although distributed ELM can process large-scale labeled training data, existing approaches cannot process partially labeled or unlabeled training data. Therefore, we propose a new unified distributed ELM with supervised, semi-supervised and unsupervised learning based on the MapReduce framework, called U-DELM. U-DELM overcomes the inability of existing distributed ELM frameworks to process partially labeled and unlabeled training data. We first compare the computation formulas of the supervised, semi-supervised and unsupervised learning methods and find that the majority of the expensive computations are decomposable. Next, we propose the MapReduce-based U-DELM, which extracts three different continued matrix multiplications from the three computation formulas. We then transform each of them into a cumulative sum so that it is suitable for MapReduce. Combinations of the three results are used to solve for the output weights in the three learning methods. Finally, using benchmark and synthetic datasets, we test and verify the efficiency and effectiveness of U-DELM in learning from massive data. The results show that U-DELM achieves unified distributed learning across the supervised, semi-supervised and unsupervised settings.
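The decomposability claim can be illustrated with a minimal sketch for the supervised case only. It assumes the standard regularized ELM solution from the general ELM literature, in which the output weights depend on the data only through the products HᵀH and HᵀT, where H is the hidden-layer output matrix and T the target matrix. Both products are sums over training rows, so each partition can compute a partial sum (map) and the partial sums can be added (reduce). The function names `gram_terms` and `add_terms` and the toy data are hypothetical, and this is not the U-DELM algorithm itself, whose semi-supervised and unsupervised terms are not given in the abstract.

```python
from functools import reduce


def gram_terms(partition):
    """Map step: partial sums H_k^T H_k and H_k^T T_k for one partition.

    `partition` is a list of (h, t) pairs, where h is one hidden-layer
    output row and t is the matching target row (plain lists of numbers).
    """
    n_hidden = len(partition[0][0])
    n_out = len(partition[0][1])
    u = [[0.0] * n_hidden for _ in range(n_hidden)]
    v = [[0.0] * n_out for _ in range(n_hidden)]
    for h, t in partition:
        for i in range(n_hidden):
            for j in range(n_hidden):
                u[i][j] += h[i] * h[j]
            for j in range(n_out):
                v[i][j] += h[i] * t[j]
    return u, v


def add_terms(a, b):
    """Reduce step: element-wise sum of two (U, V) partial results."""
    (ua, va), (ub, vb) = a, b
    u = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ua, ub)]
    v = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(va, vb)]
    return u, v


# Toy data (hypothetical): four rows with 2 hidden nodes and 1 output,
# split into two partitions as a distributed file system might do.
rows = [([1, 2], [1]), ([0, 1], [0]), ([3, 1], [1]), ([2, 2], [0])]
partitions = [rows[:2], rows[2:]]

# Distributed-style result: map over partitions, then reduce by summation.
U, V = reduce(add_terms, map(gram_terms, partitions))

# Single-machine result over all rows, for comparison.
U_full, V_full = gram_terms(rows)
```

Here `U` and `V` equal `U_full` and `V_full`, so the partitioned computation reproduces the centralized one. The remaining step, solving a linear system whose size depends only on the number of hidden nodes, is small enough to run on a single machine.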

Keywords

Distributed ELM; Supervised learning; Semi-supervised learning; Unsupervised learning; MapReduce

Notes

Acknowledgements

This research was partially supported by the National Natural Science Foundation of China under Grant Nos. 61472069, 61402089, and U1401256; the Fundamental Research Funds for the Central Universities under Grant Nos. N161602003, N171607010, N161904001, and N160601001; and the Natural Science Foundation of Liaoning Province under Grant No. 2015020553.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, China
  2. School of Computer Science and Engineering, Key Laboratory of Big Data Management and Analytics (Liaoning Province), Northeastern University, Shenyang, China
