Weighted K-Means Clustering with Observation Weight for Single-Cell Epigenomic Data
Recent advances in single-cell technologies have made it possible to profile genomic features at unprecedented resolution. Multiple types of genomic features can now be measured at single-cell resolution, including gene expression, protein binding, methylation, and chromatin accessibility. A major goal in single-cell genomics is to identify and characterize novel cell types, and clustering methods are essential to this goal. The distinct characteristics of single-cell genomic datasets pose challenges for methodology development. In this work, we propose a weighted K-means algorithm. By down-weighting cells with low sequencing depth, we show that the proposed algorithm can improve the detection of rare cell types in single-cell chromatin accessibility data. The weight of noisy cells is tuned adaptively. In addition, we incorporate sparsity constraints into the proposed method to perform clustering and feature selection simultaneously. We also evaluate the proposed methods through simulation studies.
Keywords: Single-cell genomics; Single-cell chromatin accessibility data; Rare cell types; Weighted K-means clustering; Sparse weighted K-means clustering
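The core idea in the abstract, K-means in which low-depth cells count for less, can be illustrated with a minimal sketch. This is not the authors' implementation: the weights here are fixed up front as sequencing depth divided by the maximum depth (a hypothetical choice), whereas the abstract describes weights that are tuned adaptively, and the sparsity constraints are not included. The function name `weighted_kmeans`, the initialisation rule, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def weighted_kmeans(X, weights, k, n_iter=100):
    """Lloyd-style K-means in which each observation (cell) has a weight."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)

    # Illustrative initialisation (an assumption, not from the paper):
    # start at the highest-weight cell, then repeatedly add the cell with
    # the largest weight * squared distance to its nearest chosen center.
    centers = [X[w.argmax()]]
    for _ in range(1, k):
        d2 = ((X[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(axis=2)
        centers.append(X[(w * d2.min(axis=1)).argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assignment step: each cell joins its nearest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # Update step: each center becomes the weighted mean of its cells,
        # so low-weight (low-depth) cells pull the center less.
        new_centers = centers.copy()
        for j in range(k):
            members = labels == j
            if w[members].sum() > 0:
                new_centers[j] = np.average(X[members], axis=0, weights=w[members])

        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return labels, centers


# Synthetic example: 40 "common" cells and 10 "rare" cells in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(40, 5)),
    rng.normal(10.0, 0.5, size=(10, 5)),
])
depth = rng.integers(2_000, 10_001, size=50)  # hypothetical read counts per cell
weights = depth / depth.max()                 # hypothetical fixed weighting

labels, centers = weighted_kmeans(X, weights, k=2)
```

The only change from ordinary K-means is the update step, where the cluster mean is replaced by a weighted mean. Setting all weights to 1 recovers the unweighted algorithm.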