Cytochrome P450 Classification of Drugs with Support Vector Machines Implementing the Nearest Point Algorithm

  • Conference paper
Knowledge Exploration in Life Science Informatics (KELSI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3303)

Abstract

Cytochrome P450s are an important class of drug-metabolizing enzymes and therefore play a significant role in the drug discovery process. Using a data set compiled from publicly available cytochrome P450 drug interaction data, together with chemoinformatics data calculated from it, we built binary classifiers based on kernel methods, in particular support vector machines implementing the nearest point algorithm. Feature selection serves as a preliminary stage of supervised learning; we examine both supervised and unsupervised selection methods. The classification results for a selected subset of the test set are compared structurally with compounds from the training set.
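The geometric idea behind a nearest point formulation can be illustrated with a small sketch. For linearly separable data, the maximum-margin hyperplane is the perpendicular bisector of the shortest segment joining the convex hulls of the two classes. The code below is not the implementation used in the paper: it is a plain Gilbert-style iteration in pure Python, restricted to the linear, hard-margin case, and the function names (`nearest_points`, `hyperplane`, `predict`) and the toy data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def nearest_points(
    pos: Sequence[Sequence[float]],
    neg: Sequence[Sequence[float]],
    tol: float = 1e-9,
    max_iter: int = 10_000,
) -> Tuple[Vector, Vector]:
    """Approximate the closest pair (u, v) with u in conv(pos), v in conv(neg).

    Gilbert-style iteration on the difference z = u - v (an illustrative
    stand-in, assuming linearly separable classes).
    """
    u = list(pos[0])
    v = list(neg[0])
    for _ in range(max_iter):
        z = [a - b for a, b in zip(u, v)]
        # Support points: the vertices that reduce <z, x - y> the most.
        s_u = min(pos, key=lambda p: dot(z, p))
        s_v = max(neg, key=lambda q: dot(z, q))
        s = [a - b for a, b in zip(s_u, s_v)]
        # Optimality gap; it is zero exactly at the minimum-norm difference.
        gap = dot(z, z) - dot(z, s)
        if gap <= tol:
            break
        # Exact line search on the segment from z to s, clamped to [0, 1].
        d = [a - b for a, b in zip(s, z)]
        t = min(1.0, max(0.0, gap / dot(d, d)))
        u = [a + t * (b - a) for a, b in zip(u, s_u)]
        v = [a + t * (b - a) for a, b in zip(v, s_v)]
    return u, v


def hyperplane(u: Sequence[float], v: Sequence[float]) -> Tuple[Vector, float]:
    """Perpendicular bisector of the segment [v, u] as (w, b), f(x) = <w, x> + b."""
    w = [a - b for a, b in zip(u, v)]
    b = (dot(v, v) - dot(u, u)) / 2.0
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if dot(w, x) + b >= 0 else -1


# Toy data (hypothetical): two separable point clouds in the plane.
positives = [(2.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
negatives = [(0.0, 0.0), (-1.0, 0.0), (0.0, -1.0)]

u, v = nearest_points(positives, negatives)
w, b = hyperplane(u, v)
labels = [predict(w, b, x) for x in positives + negatives]
```

On this toy data the closest pair is (1, 1) on the positive hull and (0, 0) on the negative hull, so the separating line is x1 + x2 = 1. A kernel version would replace every inner product with a kernel evaluation and track the convex-combination coefficients instead of explicit coordinates; that extension is omitted here.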

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kless, A., Eitrich, T. (2004). Cytochrome P450 Classification of Drugs with Support Vector Machines Implementing the Nearest Point Algorithm. In: López, J.A., Benfenati, E., Dubitzky, W. (eds) Knowledge Exploration in Life Science Informatics. KELSI 2004. Lecture Notes in Computer Science, vol 3303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30478-4_17

  • DOI: https://doi.org/10.1007/978-3-540-30478-4_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23927-7

  • Online ISBN: 978-3-540-30478-4

  • eBook Packages: Springer Book Archive
