Fundamental Bioinformatic and Chemoinformatic Data Processing

  • J. B. Brown
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1825)

Abstract

To execute more advanced computational chemogenomic workflows, it is essential to understand the basic data formats and the options for processing them. This chapter explains de facto standards for compound and protein representation and gives procedures for processing them. A walkthrough demonstrates, step by step, how to download a ligand–target database, parse the bioactivity data it contains, automatically retrieve the corresponding chemical structures and protein sequences from the command line, and finally convert those structures and sequences into representative machine-ready formats. A basic protocol for visualizing the parsed database and looking for patterns is also given.
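
As a rough illustration of the parsing and conversion steps named above, the following minimal sketch groups bioactivity records by target and turns a protein sequence into a fixed-length numeric vector. It is not the chapter's own protocol: the tab-separated layout, the column names (`compound_id`, `target_id`, `activity_type`, `value_nM`), the sample records, and both helper functions are assumptions made for this example, and amino acid composition is only one of many possible machine-ready sequence encodings.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical tab-separated bioactivity export. The column names and
# records are illustrative and do not reproduce any real database schema.
SAMPLE_TSV = (
    "compound_id\ttarget_id\tactivity_type\tvalue_nM\n"
    "CPD1\tTGT1\tIC50\t12.5\n"
    "CPD2\tTGT1\tIC50\t4300\n"
    "CPD1\tTGT2\tKi\t87\n"
)

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def parse_bioactivities(handle):
    """Group (compound, activity type, value) records by target identifier."""
    by_target = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        by_target[row["target_id"]].append(
            (row["compound_id"], row["activity_type"], float(row["value_nM"]))
        )
    return dict(by_target)


def composition_vector(sequence):
    """Return the fraction of each standard residue in a protein sequence."""
    counts = Counter(sequence.upper())
    # Only standard residues contribute to the total, so ambiguity codes
    # such as X are ignored rather than diluting the fractions.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


activities = parse_bioactivities(io.StringIO(SAMPLE_TSV))
vector = composition_vector("MKKA")  # toy sequence, not a real protein
```

In practice the `io.StringIO` handle would be replaced by an opened file from the downloaded database, with the column names adjusted to match that file's actual header.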

Key words

Chemical data structure; Protein data structure; Molecular data processing tools; Database retrieval; Compound–protein visualization

Notes

Acknowledgments

The author would like to thank Dr. Christin Rakers of Nagoya University for critical reading and suggestions for improvement of the manuscript.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Life Science Informatics Research Unit, Laboratory of Molecular Biosciences, Kyoto University Graduate School of Medicine, Kyoto, Japan