Abstract
Advances in single-cell RNA-sequencing (scRNA-seq) have made it possible to profile genome-wide expression in single cells at low cost and high throughput. There is substantial ongoing effort to use scRNA-seq measurements to identify the “cell types” that make up a complex tissue, akin to taxonomizing species in ecology. Cell type classification from scRNA-seq data involves applying computational tools rooted in dimensionality reduction and clustering, together with statistical analysis to identify molecular signatures that are unique to each type. As datasets continue to grow in size and complexity, computational challenges abound, requiring analytical methods to be scalable, flexible, and robust. Moreover, careful attention must be paid to the experimental biases and statistical challenges unique to these measurements in order to avoid artifacts. This chapter introduces these topics in the context of cell-type identification and outlines an instructive, step-by-step example bioinformatic pipeline for researchers entering this field.
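As a rough illustration of the three ingredients named in the abstract (dimensionality reduction, clustering, and identification of type-specific signatures), the following is a minimal sketch on simulated counts. It is not the chapter's pipeline: the function names, the simulated data, the choice of k-means as the clustering step, and the mean-difference marker score are all assumptions made for illustration, and real analyses typically rely on dedicated toolkits and more careful statistics.

```python
import numpy as np

def normalize_log(counts, scale=1e4):
    """Library-size normalize each cell (row), then log-transform."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / totals * scale)

def pca(x, n_components):
    """Project cells onto the top principal components via SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def kmeans(x, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def top_markers(logx, labels, cluster, n=5):
    """Rank genes by mean log-expression inside minus outside a cluster."""
    inside = logx[labels == cluster].mean(axis=0)
    outside = logx[labels != cluster].mean(axis=0)
    return np.argsort(inside - outside)[::-1][:n]

# Hypothetical toy data: 200 cells x 100 genes, two simulated cell types.
rng = np.random.default_rng(0)
rates = np.full((200, 100), 5.0)   # baseline expression for every gene
rates[:100, :10] = 50.0            # genes 0-9 elevated in simulated type A
rates[100:, 10:20] = 50.0          # genes 10-19 elevated in simulated type B
counts = rng.poisson(rates)

logx = normalize_log(counts)              # normalization
embedding = pca(logx, n_components=10)    # dimensionality reduction
labels = kmeans(embedding, k=2)           # clustering
markers = top_markers(logx, labels, cluster=labels[0])  # signature genes

print("cluster sizes:", np.bincount(labels))
print("top marker genes for the cluster containing cell 0:", markers)
```

In this toy setting the two simulated types are well separated, so the clusters should coincide with them; on real data, the number of components, the clustering method, and the marker test all require the care that the abstract calls for.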
Acknowledgments
K. S. would like to acknowledge support from NIH 1K99EY028625-01, the Klarman Cell Observatory, and the laboratory of Dr. Aviv Regev at the Broad Institute. We would like to gratefully acknowledge critical feedback from Drs. Inbal Benhar and Jose Ordovas-Montanes.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Shekhar, K., Menon, V. (2019). Identification of Cell Types from Single-Cell Transcriptomic Data. In: Yuan, GC. (eds) Computational Methods for Single-Cell Data Analysis. Methods in Molecular Biology, vol 1935. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-9057-3_4
DOI: https://doi.org/10.1007/978-1-4939-9057-3_4
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-9056-6
Online ISBN: 978-1-4939-9057-3
eBook Packages: Springer Protocols