A Nonnegative Robust Linear Model for Deconvolution of Proportions
Estimating the mixing proportions of a mixed sample is a common problem in biomedical studies. Recently, it has been applied to quantify immune cell infiltration in tumor samples. The main methodological challenge is handling the non-Gaussian nature of gene expression data. Although a probabilistic model based on Multinomial or Poisson distributions could address this, such a model often becomes unidentifiable. An alternative is robust regression, because non-Gaussianity often manifests as extremely high or extremely low expression values. In this article, we propose a nonnegative robust linear model (NRLM) approach that yields robust yet interpretable estimates of the mixing proportions. In our simulation study, NRLM robustly recovers the relative abundance of the specified components when a large amount of noise is present. More importantly, our approach accurately estimates the absolute level of the specified components in the presence of unspecified ones. Finally, it shows superior performance when applied to deep deconvolution of blood samples.
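To make the idea of a nonnegative robust fit concrete, the following is a minimal sketch, not the NRLM estimator itself, whose details the abstract does not specify. It assumes a Huber loss with a fixed threshold `delta`, minimised by projected gradient descent under the constraint that all coefficients are nonnegative. The function name `nonneg_huber_fit`, the simulated signature matrix `S`, the proportions `beta_true`, and the outlier pattern are all hypothetical choices made for illustration.

```python
import numpy as np

def nonneg_huber_fit(S, y, delta=0.5, n_iter=2000):
    """Minimise sum_i huber_delta(y_i - S[i] @ beta) subject to beta >= 0.

    Illustrative only: a fixed Huber threshold and plain projected
    gradient descent are assumptions, not the paper's algorithm.
    """
    # Step size 1/L, where L = ||S||_2^2 bounds the curvature of the objective.
    L = np.linalg.norm(S, 2) ** 2
    beta = np.zeros(S.shape[1])
    for _ in range(n_iter):
        resid = y - S @ beta
        # Huber score: residuals beyond +/- delta are clipped,
        # so each outlying gene has a bounded pull on the fit.
        grad = -S.T @ np.clip(resid, -delta, delta)
        # Gradient step followed by projection onto beta >= 0.
        beta = np.maximum(beta - grad / L, 0.0)
    return beta

# Simulated example (hypothetical data): 200 genes, 4 reference cell types.
rng = np.random.default_rng(0)
n_genes, n_types = 200, 4
S = rng.uniform(0.0, 10.0, size=(n_genes, n_types))  # signature matrix
beta_true = np.array([0.5, 0.3, 0.15, 0.05])         # true mixing proportions
y = S @ beta_true + rng.normal(0.0, 0.1, size=n_genes)

# Contaminate 10% of genes: half far too high, half set to zero.
bad = rng.choice(n_genes, size=20, replace=False)
y[bad[:10]] += 50.0
y[bad[10:]] = 0.0

beta_hat = nonneg_huber_fit(S, y)
beta_ols = np.linalg.lstsq(S, y, rcond=None)[0]  # unconstrained least squares

err_robust = np.max(np.abs(beta_hat - beta_true))
err_ols = np.max(np.abs(beta_ols - beta_true))
print("robust nonnegative fit:", np.round(beta_hat, 3), "max error", round(err_robust, 4))
print("ordinary least squares:", np.round(beta_ols, 3), "max error", round(err_ols, 4))
```

The sketch imposes no sum-to-one constraint, so the coefficients are free to sum to less than one. In practice the Huber threshold would typically be tied to a robust estimate of the residual scale rather than fixed in advance.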