Random forest for big data classification in the internet of things using optimal features

  • S. K. Lakshmanaprabu
  • K. Shankar
  • M. Ilayaraja
  • Abdul Wahid Nasir
  • V. Vijayakumar
  • Naveen Chilamkurti
Original Article

Abstract

The internet of things (IoT) is a network of things that communicate through advanced communication technologies without human intervention. Effective use of data classification in the IoT to uncover new and hidden insights can enhance the medical field. In this paper, big data analytics for an IoT-based healthcare system is developed using the Random Forest Classifier (RFC) and the MapReduce process. E-health data collected from patients suffering from different diseases are considered for analysis. The optimal attributes are chosen from the database using the Improved Dragonfly Algorithm (IDA) to improve classification. Finally, the RFC is used to classify the e-health data with the help of the optimal features. The implementation results show that the maximum precision of the proposed technique is 94.2%. To verify the effectiveness of the proposed method, different performance measures are analyzed and compared with existing methods.
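
As a minimal sketch of how a MapReduce-style process can produce a precision figure of the kind reported above, the following Python counts true and false positives within each data partition (map) and sums the counts across partitions (reduce). The labels, the partitions, and every function name are hypothetical illustrations and are not taken from the paper's implementation.

```python
from collections import Counter
from functools import reduce


def map_partition(pairs, positive):
    """Map step: count true and false positives within one data partition."""
    counts = Counter()
    for actual, predicted in pairs:
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge the counts produced by two partitions."""
    return left + right


def precision(partitions, positive):
    """Precision = TP / (TP + FP), aggregated over all partitions."""
    mapped = (map_partition(p, positive) for p in partitions)
    total = reduce(reduce_counts, mapped, Counter())
    flagged = total["tp"] + total["fp"]
    return total["tp"] / flagged if flagged else 0.0


# Hypothetical (actual, predicted) label pairs split across two partitions.
partitions = [
    [("sick", "sick"), ("healthy", "sick"), ("sick", "healthy")],
    [("sick", "sick"), ("healthy", "healthy"), ("sick", "sick")],
]
result = precision(partitions, positive="sick")
```

Because the per-partition counts are simply added, the reduce step is associative and the partitions can be processed in any order, which is the property a MapReduce job relies on.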

Keywords

Internet of things; Big data; E-health; Map reduce; Random forest classifier; Dragonfly algorithm; Optimization

List of symbols

\(S{p_i}\)

Separation of \(i\)th individual

\(P\)

Current position

\({P_k}\)

Position of \(k\)th individual

\(N\)

Total number of neighboring individuals in the search space

\({A_{li}}\)

Alignment of \(i\)th neighboring individual

\({V_k}\)

Velocity of \(k\)th individual

\({P^ - }\)

Position of enemy

\({P^+}\)

Position of food source

\(sw\)

Separation weight

\(aw\)

Alignment weight

\(cw\)

Cohesion weight

\(Att\)

Attraction, food factor

\(Dis\)

Distraction, enemy factor

\(w\_CR\)

Inertia weight-crossover rate

\(t\)

Iteration count

\({f_{\text{max} }}\)

Largest fitness value

\({f_p}\)

Larger fitness of the two individuals to be crossed

\({f_{avg}}\)

Average fitness

\(f\)

Mutation individual’s fitness

\({R_1},{R_2}\)

Random values

\(V1,V2\)

Random vectors that indicate the probability

\(F\)

Margin function

\(I(\,)\)

Indicator function

\({\arg _k}I({h_k}(V1))\)

\({h_k}\) is the \(k\)th tree of the RF
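
To show how the symbols above fit together, here is a minimal Python sketch of one position update in the standard dragonfly algorithm and of the random forest margin as defined by Breiman. It is an assumption that the paper follows these standard forms. The improved variant (IDA) and its \(w\_CR\) crossover step are not reproduced here, and the weight names `fw` and `ew`, like all function names, are hypothetical.

```python
from collections import Counter


def dragonfly_step(P, V, neighbors, food, enemy, weights, w):
    """One update for a single dragonfly (standard algorithm, not IDA).

    P, V        -- current position and step (velocity) vectors
    neighbors   -- non-empty list of (P_k, V_k) pairs, one per neighbor
    food, enemy -- positions P+ and P-
    weights     -- dict with sw, aw, cw and hypothetical fw, ew factors
    w           -- inertia weight
    """
    n = len(neighbors)
    dims = range(len(P))
    sep = [-sum(P[d] - pk[d] for pk, _ in neighbors) for d in dims]     # separation
    ali = [sum(vk[d] for _, vk in neighbors) / n for d in dims]         # alignment
    coh = [sum(pk[d] for pk, _ in neighbors) / n - P[d] for d in dims]  # cohesion
    att = [food[d] - P[d] for d in dims]                                # attraction to food
    dis = [enemy[d] + P[d] for d in dims]                               # distraction from enemy
    new_V = [
        weights["sw"] * sep[d] + weights["aw"] * ali[d] + weights["cw"] * coh[d]
        + weights["fw"] * att[d] + weights["ew"] * dis[d] + w * V[d]
        for d in dims
    ]
    new_P = [P[d] + new_V[d] for d in dims]
    return new_P, new_V


def rf_margin(votes, true_label):
    """Share of trees voting for true_label minus the largest share for any other label."""
    n = len(votes)
    counts = Counter(votes)
    other = max((c for lbl, c in counts.items() if lbl != true_label), default=0)
    return (counts[true_label] - other) / n


# Hypothetical two-dimensional example with a single neighbor.
next_P, next_V = dragonfly_step(
    P=[0.0, 0.0], V=[0.1, 0.1],
    neighbors=[([1.0, 1.0], [0.5, 0.5])],
    food=[2.0, 2.0], enemy=[-1.0, -1.0],
    weights={"sw": 0.1, "aw": 0.1, "cw": 0.7, "fw": 1.0, "ew": 1.0},
    w=0.9,
)
margin = rf_margin(["a", "a", "b", "c"], true_label="a")
```

A positive margin means the forest classifies the sample correctly, and a larger value indicates more confidence in the vote.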


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electronics and Instrumentation Engineering, B.S. Abdur Rahman Crescent Institute of Science and Technology, Chennai, India
  2. School of Computing, Kalasalingam Academy of Research and Education, Krishnankoil, India
  3. Electronics and Instrumentation Engineering, Bannari Amman Institute of Technology, Sathyamangalam, India
  4. School of Computing Science and Engineering, Vellore Institute of Technology, Chennai, India
  5. Cyber Security Program Coordinator, Computer Science and IT, La Trobe University, Melbourne, Australia