
Prediction of Molecular Bioactivity for Drug Design Using a Decision Tree Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2843)

Abstract

A machine learning-based approach to predicting the molecular bioactivity of new drugs is proposed. Two aspects of the task are addressed: feature subset selection and cost-sensitive classification. They are needed to cope with the very large number of features and the unbalanced class distribution in a dataset of drug candidates. We designed a pattern classifier with these capabilities based on information theory and re-sampling techniques. Experimental results demonstrate the feasibility of the proposed approach. In particular, its classification accuracy was higher than that of the winner of the KDD Cup 2001 competition.
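The abstract names two ingredients, information-theoretic feature selection and re-sampling for unbalanced classes, without giving details. The following is a minimal sketch of one common reading of each, not the authors' implementation: features are ranked by mutual information with the class label, and minority-class samples are randomly duplicated until the classes are balanced. The function names (`mutual_information`, `top_k_features`, `oversample_minority`) and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def top_k_features(rows, labels, k):
    """Indices of the k features sharing the most information with the label."""
    n_features = len(rows[0])
    scores = [
        mutual_information([r[j] for r in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (assumed): 3 binary features, 2 active (1) vs 4 inactive (0) compounds.
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
labels = [1, 1, 0, 0, 0, 0]

selected = top_k_features(rows, labels, k=1)
balanced_rows, balanced_labels = oversample_minority(rows, labels)
```

In this toy set, feature 0 copies the label while the other two are independent of it, so the ranking selects feature 0. The balanced sample could then be passed, restricted to the selected columns, to any decision tree learner.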



Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, S., Yang, J., Oh, Kw. (2003). Prediction of Molecular Bioactivity for Drug Design Using a Decision Tree Algorithm. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_32

  • DOI: https://doi.org/10.1007/978-3-540-39644-4_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20293-6

  • Online ISBN: 978-3-540-39644-4

  • eBook Packages: Springer Book Archive
