International Journal of Fuzzy Systems

Volume 21, Issue 3, pp 809–822

Hybrid Parallel Linguistic Fuzzy Rules with Canopy MapReduce for Big Data Classification in Cloud

  • V. Vennila
  • A. Rajiv Kannan


With the increasing availability of large amounts of information and the benefits of data processing, big data have gained great significance in recent years. Because the data must be processed at scale, big data applications are run on the MapReduce programming model. However, applying rule-based models to such datasets is not straightforward, and big data are not classified efficiently. To overcome these problems, a parallel linguistic fuzzy rule with canopy MapReduce (LFR-CM) framework is introduced. The LFR-CM framework classifies big data using a canopy MapReduce function for information sharing in the cloud, with higher classification accuracy and lower time consumption. It comprises three steps for efficient classification in a cloud environment. First, it constructs the fuzzy knowledge base (KB) of linguistic fuzzy rules from the big data training set. The second step has three operations: (1) the map function is applied in parallel by every cloud user without transmitting any data to other cloud user nodes; (2) the data are processed through the map function across all additional cloud user nodes; and (3) the reduce function is deployed by each cloud user on the partitioned information. Finally, the data are classified with higher classification accuracy and lower time consumption. The LFR-CM framework is implemented and evaluated on Amazon EC2 cloud big data datasets and compared with other classification systems that use MapReduce in terms of runtime, classification time, classification accuracy and input/output cost. The results of the study indicate that the LFR-CM framework is more efficient than the existing methods.
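The map and reduce operations summarized above can be illustrated with a minimal Python sketch. It is not the authors' LFR-CM implementation: it assumes features normalized to [0, 1], three triangular linguistic labels per feature, and a simplified penalized rule weight computed only from the examples that generated each rule. It omits the canopy step entirely. All function names (`map_phase`, `reduce_phase`, `classify`) and the toy data are hypothetical.

```python
from collections import defaultdict

# Assumed linguistic labels with triangular membership peaks on [0, 1].
LABELS = ("low", "medium", "high")
PEAKS = {"low": 0.0, "medium": 0.5, "high": 1.0}

def membership(label, x):
    """Triangular membership with half-width 0.5 around the label's peak."""
    return max(0.0, 1.0 - abs(x - PEAKS[label]) / 0.5)

def best_label(x):
    """Label with the highest membership for x (first label wins ties)."""
    return max(LABELS, key=lambda lab: membership(lab, x))

def match_degree(antecedent, features):
    """Product of memberships of each feature in its antecedent label."""
    degree = 1.0
    for lab, x in zip(antecedent, features):
        degree *= membership(lab, x)
    return degree

def map_phase(partition):
    """Build candidate rules from one node's local data only.

    Returns antecedent -> {class label: summed matching degree}.
    """
    partial = defaultdict(lambda: defaultdict(float))
    for features, cls in partition:
        antecedent = tuple(best_label(x) for x in features)
        partial[antecedent][cls] += match_degree(antecedent, features)
    return partial

def reduce_phase(partials):
    """Merge per-node candidates into a knowledge base of weighted rules."""
    merged = defaultdict(lambda: defaultdict(float))
    for partial in partials:
        for antecedent, per_class in partial.items():
            for cls, degree in per_class.items():
                merged[antecedent][cls] += degree
    rules = {}
    for antecedent, per_class in merged.items():
        total = sum(per_class.values())
        if total <= 0.0:
            continue  # no supporting evidence for this antecedent
        cls = max(per_class, key=per_class.get)
        # Simplified penalized weight: (own class - other classes) / total.
        weight = (2.0 * per_class[cls] - total) / total
        if weight > 0.0:
            rules[antecedent] = (cls, weight)
    return rules

def classify(rules, features):
    """Single-winner inference: the rule with the largest degree * weight."""
    if not rules:
        return None
    winner = max(rules, key=lambda a: match_degree(a, features) * rules[a][1])
    return rules[winner][0]

# Toy data split across two hypothetical nodes.
node_a = [((0.1, 0.2), "A"), ((0.9, 0.8), "B")]
node_b = [((0.0, 0.1), "A"), ((1.0, 0.9), "B")]
rules = reduce_phase([map_phase(node_a), map_phase(node_b)])
print(rules)
print(classify(rules, (0.15, 0.05)))
```

In this sketch each call to `map_phase` sees only its own partition, mirroring the statement that no data are transmitted between cloud user nodes. Only the compact per-rule sums reach `reduce_phase`.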


Big data, Cloud environment, MapReduce, Linguistic fuzzy rules, Canopy fuzzy MapReduce

List of Symbols


Cloud servers
Cloud users
Fuzzy rules
Antecedent fuzzy set
Class label
Rule weight
Membership function
\(C_{\rm mn}\): Cloud master node
Map function
Mapping threshold factor
Training set
Classification time
Classification accuracy
Data correctly classified
Number of data
Knowledge base
Number of instances
\(C_{\rm wn}\): Cloud worker nodes



Copyright information

© Taiwan Fuzzy Systems Association 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, K.S.R. College of Engineering, Tiruchengode, India
