Sankhya B

pp 1–45

A Blockwise Consistency Method for Parameter Estimation of Complex Models

  • Runmin Shi
  • Faming Liang
  • Qifan Song
  • Ye Luo
  • Malay Ghosh


The drastic improvement in data collection and acquisition technologies has enabled scientists to gather vast amounts of data. Growing dataset sizes are typically accompanied by growing complexity, both in the data structures themselves and in the models needed to account for them. Estimating the parameters of such complex models poses a great challenge to current statistical methods. This paper proposes a blockwise consistency approach as a potential solution: it works by iteratively finding consistent estimates for each block of parameters conditional on the current estimates of the parameters in the other blocks. The blockwise consistency approach decomposes a high-dimensional parameter estimation problem into a series of lower-dimensional estimation problems, which often have much simpler structures than the original problem and thus can be solved easily. Moreover, under the framework provided by the blockwise consistency approach, a variety of methods, both Bayesian and frequentist, can be used jointly to achieve a consistent estimator for the original high-dimensional complex model. The approach is illustrated using high-dimensional linear regression with both univariate and multivariate responses. In both problems, the results show that the blockwise consistency approach can provide drastic improvements over existing methods. Extending the blockwise consistency approach to many other complex models is straightforward.
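The iterative scheme described above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: it partitions the coefficients of a linear regression into blocks and cycles through them, re-estimating each block on the residual left by the other blocks. Ordinary least squares is used as a stand-in for whatever consistent blockwise estimator one would apply in practice (e.g., a penalized or Bayesian method per block); the function and variable names are illustrative.

```python
import numpy as np

def blockwise_estimate(X, y, blocks, n_iters=20):
    """Iteratively re-estimate each block of coefficients while holding the
    current estimates of all other blocks fixed (a block-coordinate scheme).
    `blocks` is a list of index arrays partitioning the columns of X."""
    p = X.shape[1]
    beta = np.zeros(p)
    for _ in range(n_iters):
        for idx in blocks:
            # residual after removing the contribution of the other blocks
            others = np.setdiff1d(np.arange(p), idx)
            r = y - X[:, others] @ beta[others]
            # within-block estimate: plain least squares on the residual,
            # standing in for a consistent lower-dimensional estimator
            beta[idx], *_ = np.linalg.lstsq(X[:, idx], r, rcond=None)
    return beta

# toy example: six predictors split into two blocks of three
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
true_beta = np.array([1.0, -2.0, 0.0, 0.5, 0.0, 3.0])
y = X @ true_beta + 0.1 * rng.standard_normal(200)
blocks = [np.arange(0, 3), np.arange(3, 6)]
beta_hat = blockwise_estimate(X, y, blocks)
```

In this well-conditioned toy setting the block iterations converge to the full least-squares solution; the point of the blockwise framework is that each block subproblem can instead be handled by any method that is consistent for that block.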

Keywords and phrases.

Coordinate descent; Gaussian graphical model; Multivariate regression; Precision matrix; Variable selection

AMS (2000) subject classification.

Primary 62F10; Secondary 62P10





Liang’s research was supported in part by the grants DMS-1612924, DMS/NIGMS R01-GM117597, and NIGMS R01-GM126089. The authors thank the editor, associate editor, and two referees for their helpful comments, which have led to significant improvement of this paper.



Copyright information

© Indian Statistical Institute 2019

Authors and Affiliations

  1. Department of Statistics, University of Florida, Gainesville, USA
  2. Department of Statistics, Purdue University, West Lafayette, USA
  3. Faculty of Business and Economics, University of Hong Kong, Hong Kong, China
  4. University of Florida, Gainesville, USA
