Enzyme Classification on DUD-E Database Using Logistic Regression Ensemble (Lorens)

Kuswanto, Heri; Melasasi, Jainap N.; Ohwada, Hayato

doi:10.1007/978-3-319-66984-7_6

Part of the book series: Studies in Computational Intelligence (SCI, volume 741)

Abstract

Drug discovery has long been a complex, time-consuming and expensive process, which has motivated in silico methods for identifying potential inhibitors. During drug design, compounds are classified on the basis of docking scores. This research aims to predict docking score results using two classification methods: a computationally based method and a standard statistical method. It examines three target enzymes listed in the DUD-E database: aofb, cah2 and hs90a. Each enzyme is associated with a different set of compounds, which are classified as good inhibitors (ligands) or bad inhibitors (decoys). The classification is carried out with binary logistic regression and with a logistic regression ensemble (Lorens). Binary logistic regression yields an accuracy of 90.4% for aofb, 91.7% for cah2 and 94% for hs90a. Lorens yields accuracies of 88.95%, 92.1% and 100% for aofb, cah2 and hs90a, respectively. The paper concludes that the logistic regression ensemble outperforms standard logistic regression for inhibitor classification.
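
The abstract does not spell out how Lorens combines its models, so the following is only a minimal sketch under stated assumptions: the features are randomly split into disjoint subsets, one logistic regression is fitted per subset, predicted probabilities are averaged within an ensemble, and several ensembles vote by majority. The data are synthetic, not DUD-E compounds, and every function name here is hypothetical rather than taken from the chapter.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """P(ligand | x) for weights w = [bias, w_1, ..., w_p]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.2, epochs=100):
    """Binary logistic regression fitted by batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def fit_lorens(X, y, n_partitions=3, n_ensembles=3, seed=0):
    """Assumed scheme: each ensemble shuffles the feature indices, splits
    them into disjoint subsets, and fits one logistic model per subset."""
    rng = random.Random(seed)
    p = len(X[0])
    ensembles = []
    for _ in range(n_ensembles):
        idx = list(range(p))
        rng.shuffle(idx)
        subsets = [idx[k::n_partitions] for k in range(n_partitions)]
        models = []
        for cols in subsets:
            X_sub = [[row[c] for c in cols] for row in X]
            models.append((cols, fit_logistic(X_sub, y)))
        ensembles.append(models)
    return ensembles

def predict_lorens(ensembles, x, threshold=0.5):
    """Average probabilities within each ensemble, then take a majority vote."""
    votes = 0
    for models in ensembles:
        avg = sum(predict_proba(w, [x[c] for c in cols])
                  for cols, w in models) / len(models)
        votes += 1 if avg >= threshold else 0
    return 1 if 2 * votes > len(ensembles) else 0

def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def make_toy_data(n, p, seed):
    """Synthetic stand-in: label 1 = ligand, 0 = decoy; class means are +1 and -1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        mu = 1.0 if label == 1 else -1.0
        X.append([rng.gauss(mu, 1.0) for _ in range(p)])
        y.append(label)
    return X, y

X_train, y_train = make_toy_data(120, 6, seed=1)
X_test, y_test = make_toy_data(80, 6, seed=2)
ensembles = fit_lorens(X_train, y_train)
y_pred = [predict_lorens(ensembles, x) for x in X_test]
acc = accuracy(y_test, y_pred)
print(f"held-out accuracy on toy data: {acc:.3f}")
```

Accuracy is computed here the same way for any classifier, so replacing `predict_lorens` with a single model fitted on all features gives the kind of side-by-side comparison the abstract reports. The toy figures say nothing about the chapter's actual results.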

Acknowledgements

The authors gratefully acknowledge financial support from the Ministry of Research, Technology and Higher Education, Indonesia, through the Research Grant for International Collaboration and Scientific Publication. The authors also thank the First EAI International Conference on Computer Science and Engineering (November 11–12, 2016, Penang, Malaysia) and the anonymous referees.

Author information

Authors and Affiliations

Department of Statistics, Institut Teknologi Sepuluh Nopember, Kampus ITS Sukolilo, 60111, Surabaya, Indonesia
Heri Kuswanto & Jainap N. Melasasi
Department of Industrial Administration, Faculty of Science and Technology, Tokyo University of Science, Chiba, Japan
Hayato Ohwada

Corresponding author

Correspondence to Heri Kuswanto.

Editor information

Editors and Affiliations

Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VŠB-TU Ostrava, Ostrava-Poruba, Czech Republic
Ivan Zelinka
Faculty of Science and Information Technology, Universiti Teknologi PETRONAS, Teronoh, Perak, Malaysia
Pandian Vasant
Ton Duc Thang University, Ho Chi Minh, Vietnam
Vo Hoang Duy
Ton Duc Thang University, Ho Chi Minh, Vietnam
Tran Trong Dao

About this chapter

Cite this chapter

Kuswanto, H., Melasasi, J.N., Ohwada, H. (2018). Enzyme Classification on DUD-E Database Using Logistic Regression Ensemble (Lorens). In: Zelinka, I., Vasant, P., Duy, V., Dao, T. (eds) Innovative Computing, Optimization and Its Applications. Studies in Computational Intelligence, vol 741. Springer, Cham. https://doi.org/10.1007/978-3-319-66984-7_6

DOI: https://doi.org/10.1007/978-3-319-66984-7_6
Published: 22 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66983-0
Online ISBN: 978-3-319-66984-7
eBook Packages: Engineering, Engineering (R0)