Big Data Machine Learning Framework for Drug Toxicity Prediction

  • Conference paper
Microservices in Big Data Analytics

Abstract

Human exposure to toxic drug samples adversely affects many lives. Drug molecule data is vast, complex, and mostly unstructured. Big data predictive analytics using machine learning techniques helps in analyzing such data and is currently an active area of research in biological computing. The objective of this paper is to predict the toxicity of drug samples from their features using machine learning. The proposed framework predicts the toxicity of a drug sample, which can help identify the adverse effects it causes, with an accuracy of 91.15% using random forest. The results are further improved by building an ensemble of J48 and random forest, the two best-performing classifiers on the drug data. The ensemble reaches a prediction accuracy of 96.20%; compared with standard machine learning models such as random forest, AdaBoost, and Naive Bayes, it is found to perform much better. As toxicity in the environment increases, this framework can play a significant role in improving quality of life.
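
The abstract does not state how the J48 and random forest predictions are combined. One common option is soft voting, where the class probabilities from each model are averaged and the highest-scoring class is chosen. The following is a minimal sketch under that assumption; the function name `soft_vote`, the class labels, and the probability values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List

Probs = Dict[str, float]  # maps a class label to a predicted probability


def soft_vote(per_model_probs: List[Probs]) -> str:
    """Average class probabilities across models and return the top class."""
    if not per_model_probs:
        raise ValueError("need at least one model's probabilities")
    # Collect every class label that any model reported.
    classes = set().union(*per_model_probs)
    averaged = {
        c: sum(p.get(c, 0.0) for p in per_model_probs) / len(per_model_probs)
        for c in classes
    }
    # Sort class names first so ties resolve deterministically.
    return max(sorted(averaged), key=averaged.get)


# Hypothetical outputs for one drug sample (illustrative numbers only).
tree_probs = {"toxic": 0.40, "non-toxic": 0.60}    # e.g. a J48-style decision tree
forest_probs = {"toxic": 0.85, "non-toxic": 0.15}  # e.g. a random forest
label = soft_vote([tree_probs, forest_probs])
```

Here the averaged probability for "toxic" is 0.625, so the ensemble labels the sample toxic even though the tree alone would not. Other combination rules, such as majority voting or stacking, are equally plausible readings of the abstract.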

Author information

Corresponding author

Correspondence to Sankalp Sharma.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Sharma, S., Hooda, N. (2020). Big Data Machine Learning Framework for Drug Toxicity Prediction. In: Chaudhary, A., Choudhary, C., Gupta, M., Lal, C., Badal, T. (eds) Microservices in Big Data Analytics. Springer, Singapore. https://doi.org/10.1007/978-981-15-0128-9_11

  • DOI: https://doi.org/10.1007/978-981-15-0128-9_11

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0127-2

  • Online ISBN: 978-981-15-0128-9

  • eBook Packages: Computer Science, Computer Science (R0)
