The Model Selection Methods for Sparse Biological Networks

  • Conference paper
Artificial Intelligence and Applied Mathematics in Engineering Problems (ICAIAME 2019)

Abstract

Estimating high-dimensional graphical models and choosing the regularization parameter for dependent data remain crucial problems. Several classical methods, such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC), address this problem. More recent proposals include stability selection, the stability approach to regularization selection (StARS), and extensions of AIC and BIC that are better suited to high-dimensional datasets. In this review, we give an overview of these methods and state their consistency properties for the graphical lasso. We then evaluate the performance of these approaches on real datasets. Finally, we present the theoretical background of our proposed model selection criterion, which is based on the KL-divergence and bootstrap computation and is particularly suited to sparse biological networks.
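As a rough illustration of how information criteria of this kind can pick a regularization parameter, the sketch below scores a few candidate graphical lasso fits with AIC, BIC, and the extended BIC, and keeps the fit with the smallest score. It is not the authors' proposed criterion. The helper names, the choice to count only edges as degrees of freedom, and the candidate values (log-likelihoods and edge counts) are assumptions made up for the example.

```python
import math

def aic(loglik, n_edges):
    """AIC: -2 * log-likelihood + 2 * (number of free parameters)."""
    return -2.0 * loglik + 2.0 * n_edges

def bic(loglik, n_edges, n):
    """BIC: -2 * log-likelihood + (number of free parameters) * log(n)."""
    return -2.0 * loglik + n_edges * math.log(n)

def ebic(loglik, n_edges, n, p, gamma=0.5):
    """Extended BIC: BIC plus 4 * gamma * |E| * log(p); gamma = 0 gives BIC."""
    return bic(loglik, n_edges, n) + 4.0 * gamma * n_edges * math.log(p)

def select(candidates, score):
    """Return the candidate fit with the smallest criterion value."""
    return min(candidates, key=score)

# Hypothetical fits along a regularization path (illustrative numbers only):
# a larger penalty lam gives a sparser graph and a lower log-likelihood.
n, p = 50, 100  # assumed sample size and number of nodes
candidates = [
    {"lam": 0.1, "loglik": -1200.0, "edges": 400},
    {"lam": 0.3, "loglik": -1350.0, "edges": 120},
    {"lam": 0.6, "loglik": -1500.0, "edges": 30},
]

best_aic = select(candidates, lambda c: aic(c["loglik"], c["edges"]))
best_bic = select(candidates, lambda c: bic(c["loglik"], c["edges"], n))
best_ebic = select(candidates, lambda c: ebic(c["loglik"], c["edges"], n, p))

print("AIC picks lam =", best_aic["lam"])
print("BIC picks lam =", best_bic["lam"])
print("EBIC picks lam =", best_ebic["lam"])
```

With these made-up inputs, the heavier penalties of BIC and the extended BIC favour a sparser graph than AIC does, which is the usual motivation for preferring them when the number of nodes exceeds the sample size.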



Acknowledgement

The authors thank Ms Gül Bahar Bülbül for her help in preparing the tables.

Author information

Corresponding author

Correspondence to Vilda Purutçuoğlu.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Kaygusuz, M.A., Purutçuoğlu, V. (2020). The Model Selection Methods for Sparse Biological Networks. In: Hemanth, D., Kose, U. (eds) Artificial Intelligence and Applied Mathematics in Engineering Problems. ICAIAME 2019. Lecture Notes on Data Engineering and Communications Technologies, vol 43. Springer, Cham. https://doi.org/10.1007/978-3-030-36178-5_10
