Abstract
We introduce a disentangled representation for cellular identity that constructs a latent cellular state from a linear combination of condition-specific basis vectors, which is then decoded into gene expression levels. The basis vectors are learned with a deep autoencoder model from single-cell RNA-seq data. Linear arithmetic in the disentangled representation successfully predicts nonlinear gene expression interactions between biological pathways in unobserved treatment conditions. We recover the mean gene expression profiles of unobserved conditions with an average Pearson r = 0.73, outperforming two linear baselines, which achieve average r = 0.43 and r = 0.19. Disentangled representations promise new explanatory power for the interaction of biological pathways and for predicting the effects of unobserved conditions, with applications such as combinatorial therapy and cellular reprogramming. Our work is motivated by recent advances in deep generative models that have enabled synthesis of images and natural language with desired properties from interpolation in a “latent representation” of the data.
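The latent arithmetic described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the basis vectors and decoder weights below are random stand-ins for what the autoencoder would learn, the "observed" profile is simulated, and all names (`basis`, `compose_latent`, `decode`, `pearson_r`) and the unit combination weights are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_latent, n_genes = 8, 50

# Hypothetical condition-specific basis vectors. In the paper these are
# learned by a deep autoencoder; here they are random placeholders.
basis = {
    "baseline": rng.normal(size=n_latent),
    "treatment_A": rng.normal(size=n_latent),
    "treatment_B": rng.normal(size=n_latent),
}

# Stand-in nonlinear decoder: one tanh hidden layer with fixed random weights.
W1 = rng.normal(size=(n_latent, 32))
W2 = rng.normal(size=(32, n_genes))


def compose_latent(conditions, basis):
    """Build a latent state as the sum (unit-weight linear combination,
    an assumption) of the basis vectors of the active conditions."""
    return sum(basis[c] for c in conditions)


def decode(z):
    """Map a latent state to a vector of gene expression levels."""
    return np.tanh(z @ W1) @ W2


def pearson_r(x, y):
    """Pearson correlation between two expression profiles."""
    return float(np.corrcoef(x, y)[0, 1])


# Predict an unobserved combined condition by adding basis vectors.
z_ab = compose_latent(["baseline", "treatment_A", "treatment_B"], basis)
predicted = decode(z_ab)

# Simulated "observed" mean profile (prediction plus noise), used only to
# show how the Pearson r evaluation would be computed.
observed = predicted + rng.normal(scale=0.5, size=n_genes)
r = pearson_r(predicted, observed)
print(f"Pearson r on simulated data: {r:.2f}")
```

Because the decoder is nonlinear, adding vectors in the latent space does not amount to adding expression profiles, which is the property that lets linear latent arithmetic express nonlinear interactions in gene expression.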
Acknowledgements
We acknowledge the members of the Gifford and Sherwood labs for helpful discussion.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Z., Yeo, G.H.T., Sherwood, R., Gifford, D. (2019). Disentangled Representations of Cellular Identity. In: Cowen, L. (ed.) Research in Computational Molecular Biology. RECOMB 2019. Lecture Notes in Computer Science, vol 11467. Springer, Cham. https://doi.org/10.1007/978-3-030-17083-7_16
DOI: https://doi.org/10.1007/978-3-030-17083-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17082-0
Online ISBN: 978-3-030-17083-7
eBook Packages: Computer Science (R0)