Predicting Therapeutic Targets with Integration of Heterogeneous Data Sources

  • Yan-Fen Dai
  • Yin-Ying Wang
  • Xing-Ming Zhao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7986)


Drug targets are of great importance for designing new drugs and understanding the molecular mechanisms of drug action. In general, a drug may bind to multiple proteins, some of which are unrelated to treating the disease or even lead to side effects. It is therefore necessary to discriminate the effect-mediating drug targets, i.e. therapeutic targets, from other proteins. Although many computational approaches have been developed to predict drug targets and have achieved partial success, little attention has been paid to predicting therapeutic targets. In this work, we present a new framework to predict therapeutic targets of drugs based on the integration of heterogeneous data sources. In particular, we develop an ensemble classifier, PTEC (Predicting Therapeutic targets with Ensemble Classifier), that can efficiently integrate both drug and protein properties described from distinct perspectives, thereby improving prediction accuracy. The results on benchmark datasets demonstrate that our approach significantly outperforms other popular approaches, implying the effectiveness of the proposed approach. Furthermore, the results indicate that integrating different data sources improves not only the coverage of predicted targets but also the prediction precision. In other words, the distinct data sources complement each other, and integrating them improves prediction accuracy.
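The idea of combining one classifier per data source can be illustrated with a minimal sketch. The abstract does not describe how PTEC builds or combines its base classifiers, so everything below is an assumption made for illustration: the nearest-centroid scorer, the averaging rule, the class names `CentroidScorer` and `SourceEnsemble`, the source names, and the toy feature values are all hypothetical.

```python
from statistics import mean


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidScorer:
    """Toy base classifier for a single data source (hypothetical stand-in)."""

    def fit(self, X, y):
        self.pos = centroid([x for x, label in zip(X, y) if label == 1])
        self.neg = centroid([x for x, label in zip(X, y) if label == 0])
        return self

    def score(self, x):
        # Score in [0, 1]; closer to the positive centroid gives a higher score.
        d_pos, d_neg = sq_dist(x, self.pos), sq_dist(x, self.neg)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total


class SourceEnsemble:
    """Averages the scores of one base classifier per data source (assumed rule)."""

    def __init__(self, source_names):
        self.models = {name: CentroidScorer() for name in source_names}

    def fit(self, views, y):
        for name, model in self.models.items():
            model.fit(views[name], y)
        return self

    def score(self, pair_views):
        # Use only the sources available for this drug-protein pair, so a pair
        # that lacks one kind of data can still be scored.
        scores = [self.models[name].score(x)
                  for name, x in pair_views.items() if name in self.models]
        if not scores:
            raise ValueError("no known data source available for this pair")
        return mean(scores)

    def predict(self, pair_views, threshold=0.5):
        return int(self.score(pair_views) >= threshold)


# Toy training data: two hypothetical sources describing the same four pairs.
views = {
    "drug_chemical": [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]],
    "protein_sequence": [[0.7], [0.9], [0.2], [0.1]],
}
labels = [1, 1, 0, 0]  # 1 = therapeutic target, 0 = other binding protein

ensemble = SourceEnsemble(views.keys()).fit(views, labels)
full = ensemble.score({"drug_chemical": [0.85, 0.85], "protein_sequence": [0.8]})
partial = ensemble.score({"protein_sequence": [0.15]})
```

Averaging over whichever sources are present is one simple way a combined model might cover pairs that a single-source model cannot score; whether PTEC handles missing data this way is not stated in the abstract.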


Therapeutic Target; Anatomic Therapeutic Chemical; Improve Prediction Accuracy; Heterogeneous Data Source; High True Positive Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yan-Fen Dai (1, 2)
  • Yin-Ying Wang (1, 3)
  • Xing-Ming Zhao (4)
  1. Institute of Systems Biology, Shanghai University, Shanghai, China
  2. Department of Mathematics, Shanghai University, Shanghai, China
  3. School of Communication and Information Engineering, Shanghai University, Shanghai, China
  4. School of Electronics and Information Engineering, Tongji University, Shanghai, China
