A Nonnegative Robust Linear Model for Deconvolution of Proportions

  • Hyonho Chun
  • Hyuna Yang
Part of the ICSA Book Series in Statistics book series (ICSABSS)


Estimating the mixing rates of a sample mixture is a common problem in biomedical studies. Recently, it has been applied to quantify immune cell infiltration in tumor samples. The main methodological challenge is handling the non-Gaussian nature of gene expression data. Although a probabilistic model based on multinomial or Poisson distributions would be one solution, such a model often becomes unidentifiable. An alternative is to use robust regression, because non-Gaussianity often manifests as expression values that are too high or too low. In this article, we propose a non-negative robust linear model (NRLM) approach that yields robust yet interpretable mixing rate estimates. In our simulation study, NRLM performs robustly in finding the relative abundance of specified components when a large amount of noise is present. More importantly, our approach accurately estimates the absolute level of the specified components in the presence of unspecified ones. Finally, it shows superior performance when applied to deep deconvolution of blood samples.
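
To make the idea of combining robust regression with a non-negativity constraint concrete, the following is a minimal sketch, not the authors' NRLM algorithm, whose details are not given in this abstract. It assumes a Huber loss, a MAD scale estimate, and iteratively reweighted non-negative least squares via `scipy.optimize.nnls`; the function names `huber_weights` and `nonneg_robust_fit`, the tuning constant, and the synthetic data are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import nnls


def huber_weights(residuals, scale, c=1.345):
    """Huber IRLS weights: 1 for small residuals, c*scale/|r| for large ones."""
    a = np.abs(residuals) / max(scale, 1e-12)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))


def nonneg_robust_fit(X, y, n_iter=50, tol=1e-8):
    """Hypothetical helper: Huber regression with beta >= 0 via reweighted NNLS."""
    beta, _ = nnls(X, y)  # start from the plain non-negative least-squares fit
    for _ in range(n_iter):
        r = y - X @ beta
        # MAD-based robust scale estimate of the residuals (an assumption here)
        scale = 1.4826 * np.median(np.abs(r - np.median(r)))
        w = np.sqrt(huber_weights(r, scale))
        # Weighted NNLS: scale each row of X and y by the square-root weight
        new_beta, _ = nnls(X * w[:, None], y * w)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Synthetic illustration: 200 genes, 3 cell types, made-up mixing rates.
rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, scale=50.0, size=(200, 3))  # signature matrix
true_beta = np.array([0.5, 0.3, 0.2])
y = X @ true_beta + rng.normal(0.0, 1.0, size=200)
y[:10] += 2000.0  # a few genes with grossly inflated expression values

beta_nnls, _ = nnls(X, y)              # non-robust baseline
beta_robust = nonneg_robust_fit(X, y)  # robust, non-negative estimate
relative = beta_robust / beta_robust.sum()  # relative abundances
```

In this sketch the unnormalized coefficients play the role of absolute levels, while dividing by their sum gives relative abundances. The downweighting of large residuals is what limits the influence of the inflated genes on the estimate.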


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Mathematics and Statistics, Boston University, Boston, USA
  2. IBM Watson Health, Cambridge, USA
