Abstract
A fully Bayesian framework for sparse regression in generalized linear models is introduced. Assuming that a natural group structure exists on the domain of predictor variables, sparsity conditions are applied to these variable groups so that the observations can be explained by simple and interpretable models. We introduce a general family of distributions which imposes a flexible amount of sparsity on variable groups. This model overcomes the problems associated with insufficient sparsity of traditional selection methods in high-dimensional spaces. The fully Bayesian inference mechanism allows us to quantify the uncertainty in the regression coefficient estimates. The general nature of the framework makes it applicable to a wide variety of generalized linear models with minimal modifications. An efficient MCMC algorithm is presented to sample from the posterior. Simulated experiments validate the strength of this new class of sparse regression models. When applied to the problem of splice site prediction on DNA sequence data, the method identifies key interaction terms of sequence positions which help in identifying “true” splice sites.
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Raman, S., Roth, V. (2009). Sparse Bayesian Regression for Grouped Variables in Generalized Linear Models. In: Denzler, J., Notni, G., Süße, H. (eds) Pattern Recognition. DAGM 2009. Lecture Notes in Computer Science, vol 5748. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03798-6_25
DOI: https://doi.org/10.1007/978-3-642-03798-6_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03797-9
Online ISBN: 978-3-642-03798-6
eBook Packages: Computer Science, Computer Science (R0)