Abstract
Drug discovery is an important step before drug development: it is the process of identifying and testing a drug before medical use. Drugs are used to cure diseases by interacting with their targets, which are proteins in human cells. Laboratory experiments to discover drugs and their applications consume considerable cost and time. Machine learning has improved drug discovery and the prediction of drug-target interactions, which helps in predicting new drugs and finding new applications for existing drugs. Predicting drug-target interactions starts with studying the nature of drugs and their properties. Most existing datasets cover drugs, targets and their interactions. We compiled our own dataset to include side effects as a drug feature. The dataset contains 400 drugs, 794 targets and 3990 side effects. In this study, a machine-learning model is implemented using three different classifiers: Decision Tree, Random Forest (RF) and K-Nearest Neighbors (K-NN). Drug fingerprints and side effects were used as input features to train the model. Three experiments were conducted: one using fingerprints only, one using side effects only, and one using both. Results showed improved prediction when drug fingerprints and side effects were integrated. K-NN achieved the best results across the three experiments, with an average accuracy of 94.69%.
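The three-experiment setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy drugs, bit vectors and labels are invented for illustration, the helper names (`tanimoto`, `knn_predict`, `build_features`) are hypothetical, and the use of Tanimoto similarity with k = 3 is an assumption, since the abstract does not state the distance metric or neighbor count. The sketch only shows how fingerprint bits and side-effect bits might be concatenated into one feature vector and passed to a K-NN classifier.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training rows most similar to the query."""
    ranked = sorted(
        range(len(train_X)),
        key=lambda i: tanimoto(train_X[i], query),
        reverse=True,
    )
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def build_features(fingerprint, side_effect, use_fp=True, use_se=True):
    """Concatenate the selected feature groups into one binary vector."""
    features = []
    if use_fp:
        features += fingerprint
    if use_se:
        features += side_effect
    return features


# Toy data, invented for illustration: (fingerprint bits, side-effect bits, label).
# Label 1 means the drug interacts with one fixed, hypothetical target.
drugs = [
    ([1, 1, 0, 0], [1, 0, 0], 1),
    ([1, 1, 1, 0], [1, 1, 0], 1),
    ([0, 0, 1, 1], [0, 0, 1], 0),
    ([0, 1, 1, 1], [0, 1, 1], 0),
]
query_fp, query_se = [1, 0, 0, 0], [1, 1, 0]

# One run per experiment: fingerprint only, side effect only, both.
experiments = {
    "fingerprint": (True, False),
    "side_effect": (False, True),
    "both": (True, True),
}
predictions = {}
for name, (use_fp, use_se) in experiments.items():
    train_X = [build_features(fp, se, use_fp, use_se) for fp, se, _ in drugs]
    train_y = [label for _, _, label in drugs]
    query = build_features(query_fp, query_se, use_fp, use_se)
    predictions[name] = knn_predict(train_X, train_y, query, k=3)

print(predictions)
```

In practice each target (or each drug-target pair) would need its own labels, and accuracy would be measured on held-out drugs rather than on a single query.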
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Saad, A., Maghraby, F.A., Omar, Y.M. (2020). Predicting Drug Target Interaction by Integrating Drug Fingerprint and Drug Side Effect Using Machine Learning. In: Hassanien, A., Azar, A., Gaber, T., Bhatnagar, R., F. Tolba, M. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019). AMLTA 2019. Advances in Intelligent Systems and Computing, vol 921. Springer, Cham. https://doi.org/10.1007/978-3-030-14118-9_28
DOI: https://doi.org/10.1007/978-3-030-14118-9_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14117-2
Online ISBN: 978-3-030-14118-9
eBook Packages: Intelligent Technologies and Robotics (R0)