Prediction of Drug Efficiency by Transferring Gene Expression Data from Cell Lines to Cancer Patients
The paper presents a novel approach to individualized treatment in oncology: machine learning that transfers gene expression data obtained on cell lines to individual cancer patients in order to predict drug efficiency. We give a detailed analysis of how to build drug response classifiers, using three experimental data pairs of the form “kind of cancer/drug chosen for treatment” as examples. The main difficulty of the problem was the meager size of the patient training data: it is many hundreds of times smaller than the dimensionality of the original feature space.
The core feature of our transfer technique is that it avoids extrapolation in the feature space when predicting the clinical outcome of treatment for a patient from gene expression data for cell lines. We ensure that there is no extrapolation by selecting only those dimensions of the feature space that provide a sufficient number, say M, of cell line points both below and above every point that corresponds to a patient. Additionally, in a manner somewhat similar to the k-nearest-neighbor (kNN) method, after selecting the feature subspace we take into account only the K cell line points that are closest to the patient’s point in the selected subspace. Varying K and M over feasible values, we showed that the predictor’s accuracy, measured by AUC, is equal to or higher than 0.7 for all three cases of cancer-like diseases.
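The two selection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_features` and `nearest_cell_lines`, the strict inequalities, and the Euclidean distance are assumptions made for illustration, and the classifier that would then be trained on the chosen cell lines is left out.

```python
import numpy as np


def select_features(cell_lines, patients, m):
    """Indices of features with no extrapolation for any patient.

    A feature is kept only if, for every patient, at least `m` cell-line
    values lie strictly below and at least `m` lie strictly above that
    patient's value (strictness is an assumption of this sketch).

    cell_lines: array of shape (n_cell_lines, n_features)
    patients:   array of shape (n_patients, n_features)
    """
    # Broadcast to (n_cell_lines, n_patients, n_features), then count
    # cell lines per (patient, feature) pair.
    below = (cell_lines[:, None, :] < patients[None, :, :]).sum(axis=0)
    above = (cell_lines[:, None, :] > patients[None, :, :]).sum(axis=0)
    ok = (below >= m) & (above >= m)
    return np.flatnonzero(ok.all(axis=0))


def nearest_cell_lines(cell_lines, patient, features, k):
    """Indices of the `k` cell lines closest to one patient.

    Distance is Euclidean in the selected subspace (an assumption);
    ties are broken by original row order.
    """
    diff = cell_lines[:, features] - patient[features]
    dist = np.linalg.norm(diff, axis=1)
    return np.argsort(dist, kind="stable")[:k]
```

In this sketch M controls how strictly a feature must bracket the patient points, and K controls how local the training set is; both would be varied over a grid, as the abstract describes.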
This work was supported by the Russian Science Foundation grant 18-15-00061.
Disclosure of Interests
The authors declare no conflicts of interest.