Linear and Kernel Model Construction Methods for Predicting Drug–Target Interactions in a Chemogenomic Framework

  • Protocol
Computational Chemogenomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1825)

Abstract

Identification of drug–target interactions is a crucial process in drug discovery. In this chapter, we present protocols for recent advancements in machine learning methods for predicting drug–target interactions from heterogeneous biological data in a chemogenomic framework, in which prediction is based on the chemical structure data of drug candidate compounds and translated genomic sequence data of target candidate proteins. Most existing methods are based on either linear modeling or kernel modeling. To illustrate linear modeling, we introduce sparsity-induced binary classifiers and sparse canonical correlation analysis. To illustrate kernel modeling, we introduce pairwise kernel-based support vector machines and kernel-based distance learning. Workflows for using these techniques are presented. We also discuss the characteristics of each method and suggest some directions for future research.
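
To make the pairwise kernel idea mentioned above concrete, the following is a minimal sketch rather than the chapter's own implementation. It assumes one common construction, in which the similarity between two drug–target pairs is the product of a drug kernel and a target kernel. The Tanimoto kernel on substructure sets, the 2-mer spectrum kernel on sequences, the function names, and the toy data are all illustrative assumptions; the descriptors and kernels used in the protocol itself are not shown in this preview.

```python
from collections import Counter
from math import sqrt


def tanimoto_kernel(a, b):
    """Tanimoto similarity between two sets of substructure identifiers."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def spectrum_kernel(s, t, k=2):
    """Normalized k-mer spectrum kernel between two protein sequences."""
    def kmers(seq):
        return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

    cs, ct = kmers(s), kmers(t)
    dot = sum(cs[m] * ct[m] for m in cs)
    norm = sqrt(sum(v * v for v in cs.values()) * sum(v * v for v in ct.values()))
    return dot / norm if norm else 0.0


def pairwise_kernel(pair1, pair2):
    """Product kernel on (drug, target) pairs: K_drug * K_target."""
    (d1, t1), (d2, t2) = pair1, pair2
    return tanimoto_kernel(d1, d2) * spectrum_kernel(t1, t2)


# Hypothetical toy data: drugs as sets of substructure IDs, targets as sequences.
drug_a, drug_b = {1, 2, 3}, {2, 3, 4}
target_x, target_y = "MKVLAA", "MKVLGG"
pairs = [(drug_a, target_x), (drug_b, target_x), (drug_a, target_y)]

# Gram matrix over drug-target pairs; a kernel classifier such as an SVM
# could be trained on this matrix together with interaction labels.
gram = [[pairwise_kernel(p, q) for q in pairs] for p in pairs]
```

Because the Gram matrix is indexed by pairs, its size grows with the number of drugs times the number of targets, which is the usual practical constraint on this kind of model.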

Acknowledgments

This work is supported by JST PRESTO Grant Number JPMJPR15D8.

Author information

Corresponding author

Correspondence to Yoshihiro Yamanishi.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Yamanishi, Y. (2018). Linear and Kernel Model Construction Methods for Predicting Drug–Target Interactions in a Chemogenomic Framework. In: Brown, J. (eds) Computational Chemogenomics. Methods in Molecular Biology, vol 1825. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-8639-2_12

  • DOI: https://doi.org/10.1007/978-1-4939-8639-2_12

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-8638-5

  • Online ISBN: 978-1-4939-8639-2

  • eBook Packages: Springer Protocols
