Prediction of Drug Efficiency by Transferring Gene Expression Data from Cell Lines to Cancer Patients

  • Nicolas Borisov
  • Victor Tkachev
  • Anton Buzdin
  • Ilya Muchnik
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11100)


The paper presents a novel approach to individualized treatment in oncology, based on machine learning that transfers gene expression data obtained on cell lines to individual cancer patients in order to predict drug efficiency. We give a detailed analysis of how to build drug response classifiers, using three experimental data pairs of the form “kind of cancer/chosen drug for treatment” as examples. The main difficulty of the problem was the meager size of the patient training data: it is many hundreds of times smaller than the dimensionality of the original feature space.

The core feature of our transfer technique is to avoid extrapolation in the feature space when predicting the clinical outcome of treatment for a patient from gene expression data for cell lines. We ensure that there is no extrapolation by selecting only those dimensions of the feature space that provide a sufficient number, say M, of cell line points both below and above any point that corresponds to a patient. Additionally, in a manner somewhat similar to the k nearest neighbor (kNN) method, after the feature subspace is selected we take into account only the K cell line points that are closest to the patient’s point in that subspace. By varying K and M over feasible values, we showed that the predictor’s accuracy, measured by AUC, is equal to or higher than 0.7 for all three cancer-like diseases.
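
The two selection steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the strict "below/above" comparisons, and the Euclidean distance are all assumptions made for illustration, since the abstract specifies none of them. The sketch only prepares a patient-specific training set; the classifier that would be trained on it is left out.

```python
import math
from typing import List, Sequence, Tuple


def select_features(cell_lines: Sequence[Sequence[float]],
                    patient: Sequence[float], m: int) -> List[int]:
    """Keep feature j only if at least m cell lines lie strictly below and
    at least m lie strictly above the patient's value for that feature.
    (Strict inequalities are an assumption of this sketch.)"""
    selected = []
    for j, value in enumerate(patient):
        below = sum(1 for row in cell_lines if row[j] < value)
        above = sum(1 for row in cell_lines if row[j] > value)
        if below >= m and above >= m:
            selected.append(j)
    return selected


def nearest_cell_lines(cell_lines: Sequence[Sequence[float]],
                       patient: Sequence[float],
                       features: Sequence[int], k: int) -> List[int]:
    """Indices of the k cell lines closest to the patient in the selected
    subspace. Euclidean distance is assumed here for illustration."""
    def distance(row: Sequence[float]) -> float:
        return math.sqrt(sum((row[j] - patient[j]) ** 2 for j in features))

    order = sorted(range(len(cell_lines)),
                   key=lambda i: distance(cell_lines[i]))
    return order[:k]


def local_training_set(cell_lines: Sequence[Sequence[float]],
                       labels: Sequence[int],
                       patient: Sequence[float],
                       m: int, k: int) -> Tuple[List[int], List[List[float]], List[int]]:
    """Combine both steps: pick interpolation-safe features, then keep the
    k nearest cell lines restricted to those features."""
    features = select_features(cell_lines, patient, m)
    rows = nearest_cell_lines(cell_lines, patient, features, k)
    x_local = [[cell_lines[i][j] for j in features] for i in rows]
    y_local = [labels[i] for i in rows]
    return features, x_local, y_local


# Toy data (hypothetical): 4 cell lines, 3 expression features.
cell_lines = [
    [1.0, 5.0, 0.2],
    [2.0, 6.0, 0.4],
    [3.0, 7.0, 0.1],
    [4.0, 8.0, 0.3],
]
labels = [0, 0, 1, 1]          # hypothetical drug response labels
patient = [2.4, 9.0, 0.25]     # feature 1 lies above every cell line

features, x_local, y_local = local_training_set(cell_lines, labels, patient, m=2, k=2)
print(features, x_local, y_local)
```

In this toy case the second feature is discarded because the patient's value exceeds every cell line value, so any prediction along that dimension would be an extrapolation. A classifier would then be fitted on the reduced, patient-specific training set, and K and M would be varied as the abstract describes.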



This work was supported by the Russian Science Foundation grant 18-15-00061.

Disclosure of Interests

The authors declare no conflicts of interests.

Supplementary material 1 (zip 678 KB)



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Nicolas Borisov (1, 2), corresponding author
  • Victor Tkachev (2)
  • Anton Buzdin (1, 2)
  • Ilya Muchnik (3)
  1. Institute for Personalized Medicine, I.M. Sechenov First Moscow State Medical University, Moscow, Russian Federation
  2. Department of R&D, OmicsWay Corp., Walnut, USA
  3. Rutgers University, Hill Center, Piscataway, USA
