Prediction of Drug Efficiency by Transferring Gene Expression Data from Cell Lines to Cancer Patients

Braverman Readings in Machine Learning. Key Ideas from Inception to Current State

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11100)

Abstract

This paper presents a novel approach to personalized treatment in oncology: machine learning that transfers gene expression data obtained from cell lines to individual cancer patients in order to predict drug efficiency. We give a detailed analysis of how to build drug response classifiers, using three experimental data sets, each pairing a type of cancer with the drug chosen for its treatment. The main difficulty of the problem is the meager size of the patient training data: it is many hundreds of times smaller than the dimensionality of the original feature space.

The core feature of our transfer technique is that it avoids extrapolation in the feature space when predicting the clinical outcome of treatment for a patient from gene expression data for cell lines. We ensure that there is no extrapolation by a special selection of the dimensions of the feature space: we keep only those dimensions that provide a sufficient number, say M, of cell line points both below and above any point that corresponds to a patient. Additionally, in a manner somewhat similar to the k nearest neighbor (kNN) method, after the feature subspace has been selected we take into account only the K cell line points that are closest to the patient’s point in that subspace. By varying K and M over their feasible values, we showed that the predictor’s accuracy, measured as AUC, is equal to or higher than 0.7 for all three cases of cancer-like diseases.
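The selection rule and the neighborhood step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_features` and `nearest_cell_lines`, the use of Euclidean distance, the application of the rule to a single patient, and the toy numbers are all assumptions made for illustration.

```python
from math import dist


def select_features(cell_lines, patient, m):
    """Return indices of features for which at least m cell line values lie
    strictly below and at least m lie strictly above the patient's value."""
    selected = []
    for j, value in enumerate(patient):
        below = sum(1 for row in cell_lines if row[j] < value)
        above = sum(1 for row in cell_lines if row[j] > value)
        if below >= m and above >= m:
            selected.append(j)
    return selected


def nearest_cell_lines(cell_lines, patient, features, k):
    """Return indices of the k cell lines closest to the patient in the
    subspace spanned by `features` (Euclidean distance is an assumption)."""
    target = [patient[j] for j in features]
    distances = [
        (dist([row[j] for j in features], target), i)
        for i, row in enumerate(cell_lines)
    ]
    return [i for _, i in sorted(distances)[:k]]


# Toy data: 5 cell lines x 3 genes (made-up numbers, not from the paper).
cell_lines = [
    [1.0, 5.0, 0.2],
    [2.0, 6.0, 0.4],
    [3.0, 7.0, 0.6],
    [4.0, 8.0, 0.8],
    [5.0, 9.0, 1.0],
]
patient = [3.5, 9.5, 0.5]

features = select_features(cell_lines, patient, m=2)
neighbors = nearest_cell_lines(cell_lines, patient, features, k=3)
print(features, neighbors)  # [0, 2] [2, 3, 1]
```

In this toy case the second gene is rejected because no cell line lies above the patient's value, so a prediction along that dimension would require extrapolation. The three closest cell lines in the remaining two-dimensional subspace would then serve as the local training set for the classifier.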

Notes

  1. In this respect, one can find an analogy between our setting and the currently very popular problem known as “domain adaptation” [5].

References

  1. Vapnik, V., Izmailov, R.: Learning using privileged information: similarity control and knowledge transfer. J. Mach. Learn. Res. 16, 2023–2049 (2015)

  2. Lopez-Paz, D., Bottou, L., Schölkopf, B., Vapnik, V.: Unifying distillation and privileged information. In: ICLR 2016, San Juan, Puerto Rico (2016)

  3. Xu, X., Zhou, J.T., Tsang, I., Qin, Z., Goh, R.S.M., Liu, Y.: Simple and efficient learning using privileged information (2016)

  4. Celik, Z.B., Izmailov, R., McDaniel, P.: Proof and implementation of algorithmic realization of learning using privileged information (LUPI) paradigm: SVM+. Institute of Networking and Security Research (INSR) (2015)

  5. Csurka, G.: Domain Adaptation in Computer Vision Applications. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-58347-1

  6. Artemov, A., et al.: A method for predicting target drug efficiency in cancer based on the analysis of signaling pathway activation. Oncotarget 6, 29347–29356 (2015)

  7. Minsky, M.L., Papert, S.A.: Perceptrons - Expanded Edition: An Introduction to Computational Geometry. MIT Press, Boston (1987)

  8. Blumenschein, G.R., et al.: Comprehensive biomarker analysis and final efficacy results of sorafenib in the BATTLE trial. Clin. Cancer Res. 19, 6967–6975 (2013)

  9. Crossman, L.C., et al.: In chronic myeloid leukemia white cells from cytogenetic responders and non-responders to imatinib have very similar gene expression signatures. Haematologica 90, 459–464 (2005)

  10. Mulligan, G., et al.: Gene expression profiling and correlation with outcome in clinical trials of the proteasome inhibitor bortezomib. Blood 109, 3177–3188 (2007)

  11. Yang, W., et al.: Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 41, D955–D961 (2013)

  12. Robin, X., Turck, N., Hainard, A., Lisacek, F., Sanchez, J.-C., Müller, M.: Bioinformatics for protein biomarker panel classification: what is needed to bring biomarker panels into in vitro diagnostics? Expert Rev. Proteomics 6, 675–689 (2009)

  13. Osuna, E., Freund, R., Girosi, F.: An improved training algorithm for support vector machines, pp. 276–85. IEEE (1997). http://ieeexplore.ieee.org/document/622408/. Accessed 23 May 2017

  14. Bartlett, P., Shawe-Taylor, J.: Generalization performance of support vector machines and other pattern classifiers. In: Advances in Kernel Methods: Support Vector Learning, pp. 43–54 (1999)

  15. Toloşi, L., Lengauer, T.: Classification with correlated features: unreliability of feature ranking and solutions. Bioinformatics 27, 1986–1994 (2011)

  16. Buzdin, A.A., et al.: Oncofinder, a new method for the analysis of intracellular signaling pathway activation using transcriptomic data. Front Genet. 5, 55 (2014)

  17. Buzdin, A.A., Prassolov, V., Zhavoronkov, A.A., Borisov, N.M.: Bioinformatics meets biomedicine: oncofinder, a quantitative approach for interrogating molecular pathways using gene expression data. Methods Mol. Biol. 1613, 53–83 (2017)

  18. Aliper, A.M., et al.: Mathematical justification of expression-based pathway activation scoring (PAS). Methods Mol. Biol. 1613, 31–51 (2017)

  19. Borisov, N., et al.: Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data. Cell Cycle 16(19), 1810–1823 (2017)

  20. Kuzmina, N.B., Borisov, N.M.: Handling complex rule-based models of mitogenic cell signaling (On the example of ERK activation upon EGF stimulation). Int. Proc. Chem. Biol. Env. Eng. 5, 76–82 (2011)

  21. Karlsson, J., et al.: Clear cell sarcoma of the kidney demonstrates an embryonic signature indicative of a primitive nephrogenic origin. Genes Chromosomes Cancer 53, 381–391 (2014)

  22. Kabbout, M., et al.: ETS2 mediated tumor suppressive function and MET oncogene inhibition in human non-small cell lung cancer. Clin. Cancer Res. 19, 3383–3395 (2013)

  23. Yagi, T., et al.: Identification of a gene expression signature associated with pediatric AML prognosis. Blood 102, 1849–1856 (2003)

  24. Hodgson, J.G., et al.: Comparative analyses of gene copy number and mRNA expression in glioblastoma multiforme tumors and xenografts. Neuro-Oncology 11, 477–487 (2009)

  25. Bhasin, M., Yuan, L., Keskin, D.B., Otu, H.H., Libermann, T.A., Oettgen, P.: Bioinformatic identification and characterization of human endothelial cell-restricted genes. BMC Genom. 11, 342 (2010)

  26. Cheng, Y., Prusoff, W.H.: Relationship between the inhibition constant (K1) and the concentration of inhibitor which causes 50 per cent inhibition (I50) of an enzymatic reaction. Biochem. Pharmacol. 22, 3099–3108 (1973)

  27. Altman, N.S.: An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46, 175–185 (1992)

  28. Shabalin, A.A., Tjelmeland, H., Fan, C., Perou, C.M., Nobel, A.B.: Merging two gene-expression studies via cross-platform normalization. Bioinformatics 24, 1154–1160 (2008)

  29. Rudy, J., Valafar, F.: Empirical comparison of cross-platform normalization methods for gene expression data. BMC Bioinform. 12, 467 (2011)

  30. Wang, Q., Liu, X.: Screening of feature genes in distinguishing different types of breast cancer using support vector machine. OncoTargets Ther. 8, 2311–2317 (2015)

  31. Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27 (2011)

Acknowledgements

This work was supported by the Russian Science Foundation grant 18-15-00061.

Author information

Corresponding author

Correspondence to Nicolas Borisov.

Ethics declarations

The authors declare no conflicts of interest.

Electronic supplementary material

Supplementary material 1 (zip 678 KB)

Appendices: Materials and Methods

Transcriptome Profiling for Renal Cancer Samples

The details of the experimental procedure on the Illumina HumanHT-12v4 and CustomArray ECD 4X2K/12K platforms were reported previously [19]. Raw expression data were deposited in the GEO database (http://www.ncbi.nlm.nih.gov/geo/) under accession numbers GSE52519 and GSE65635.

Harmonization of Illumina and Custom Array Expression Profiles for Renal Cancer

To cross-harmonize the results for the Illumina and CustomArray gene expression profiling, all expression profiles were transformed with the XPN method [28] using the R package CONOR [29].

SVM, Binary Tree and Random Forest Machine Learning Procedures

All SVM calculations were performed using the R package ‘e1071’ [30], which employs the C++ library ‘libsvm’ [31]. Calculations with the binary tree [14] and random forest [15] methods were done with the R packages ‘rpart’ and ‘randomForest’, respectively.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Borisov, N., Tkachev, V., Buzdin, A., Muchnik, I. (2018). Prediction of Drug Efficiency by Transferring Gene Expression Data from Cell Lines to Cancer Patients. In: Rozonoer, L., Mirkin, B., Muchnik, I. (eds) Braverman Readings in Machine Learning. Key Ideas from Inception to Current State. Lecture Notes in Computer Science, vol 11100. Springer, Cham. https://doi.org/10.1007/978-3-319-99492-5_9

  • DOI: https://doi.org/10.1007/978-3-319-99492-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99491-8

  • Online ISBN: 978-3-319-99492-5

  • eBook Packages: Computer Science (R0)
