Abstract
Human exposure to toxic drug samples adversely affects many lives. Drug molecule data is vast, complex, and mostly unstructured. Big Data predictive analytics using machine learning techniques helps in analyzing such data and is currently an active area of research in biological computing. The objective of this paper is to predict the toxicity of drug samples from their features using machine learning. The proposed framework predicts the toxicity of a drug sample, which can help in identifying the adverse effects it causes, with an accuracy of 91.15% using random forest. The results are further improved by building an ensemble of J48 and random forest, the two best-performing classifiers on the drug data. The ensemble reaches a prediction accuracy of 96.20%, which is compared against standard machine learning models such as random forest, AdaBoost, and Naive Bayes and found to be considerably higher. As toxicity in the environment increases, such a framework could play a significant role in improving quality of life.
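The abstract does not say how the J48 and random forest outputs are combined. One common option is soft voting, in which the per-class probabilities from each model are averaged and the most probable class is chosen. The sketch below illustrates that idea only. The function names, the probability values, and the labels are all hypothetical and are not taken from the chapter.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models; return the argmax class per sample.

    prob_sets[m][i][c] is model m's probability that sample i belongs to class c.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        avg = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        # Pick the class index with the highest averaged probability.
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical [P(non-toxic), P(toxic)] outputs for four samples.
# Assumption: these stand in for a decision tree and a random forest.
tree_probs = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4], [0.2, 0.8]]
forest_probs = [[0.8, 0.2], [0.3, 0.7], [0.3, 0.7], [0.45, 0.55]]
labels = [0, 1, 1, 1]  # hypothetical ground truth: 1 = toxic

ensemble_pred = soft_vote([tree_probs, forest_probs])
print("ensemble predictions:", ensemble_pred)
print("ensemble accuracy:", accuracy(ensemble_pred, labels))
```

On the third sample the tree alone leans toward non-toxic, but the forest's stronger confidence outweighs it once the probabilities are averaged. This is how an ensemble can correct an individual model's error.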
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sharma, S., Hooda, N. (2020). Big Data Machine Learning Framework for Drug Toxicity Prediction. In: Chaudhary, A., Choudhary, C., Gupta, M., Lal, C., Badal, T. (eds) Microservices in Big Data Analytics. Springer, Singapore. https://doi.org/10.1007/978-981-15-0128-9_11
DOI: https://doi.org/10.1007/978-981-15-0128-9_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0127-2
Online ISBN: 978-981-15-0128-9
eBook Packages: Computer Science, Computer Science (R0)