Abstract
We propose the co-regularized sparse-group lasso algorithm, a technique that incorporates auxiliary information into the learning task in the form of “groups” and “distances” among the predictors. The algorithm is particularly suited to biological applications in which good predictive performance is required and it is also important to retrieve all relevant predictors, so as to deepen the understanding of the underlying biological process. Our cost function requires related groups of predictors to provide similar contributions to the final response and thus uses the auxiliary information to guide feature selection. We evaluate the proposed method on a synthetic dataset and examine settings in which it is beneficial compared with the standard lasso, elastic net, group lasso and sparse-group lasso techniques. Finally, we make a Python implementation of our algorithm freely available for download at www.learning-machines.com.
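The abstract does not state the cost function itself, so the following is only a rough sketch of what an objective of this kind could look like, not the authors' implementation. It assumes a squared-error loss, the usual lasso and group-lasso penalties, and a co-regularization term that penalizes the squared difference between the contributions of two groups to the response, weighted by a pairwise similarity matrix. The function name, the argument names, the `similarity` matrix and the three `lam_*` weights are all hypothetical.

```python
import numpy as np


def co_regularized_sgl_objective(X, y, beta, groups, similarity,
                                 lam_l1=0.1, lam_group=0.1, lam_co=0.1):
    """Illustrative objective only; the penalty forms are assumptions.

    X          : (n, p) design matrix
    y          : (n,) response
    beta       : (p,) coefficient vector
    groups     : (p,) integer group label of each predictor
    similarity : (G, G) non-negative weights between groups, ordered
                 like np.unique(groups); larger means "more related"
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)
    n = X.shape[0]

    # Squared-error data-fit term.
    residual = y - X @ beta
    loss = 0.5 * (residual @ residual) / n

    # Lasso term: sparsity on individual coefficients.
    l1_penalty = np.sum(np.abs(beta))

    # Group-lasso term: unsquared L2 norm per group, scaled by sqrt(group size).
    group_ids = np.unique(groups)
    group_penalty = 0.0
    for g in group_ids:
        mask = groups == g
        group_penalty += np.sqrt(mask.sum()) * np.linalg.norm(beta[mask])

    # Assumed co-regularization term: related groups should contribute
    # similarly to the prediction, so penalize the weighted squared
    # difference between their per-group contributions X_g @ beta_g.
    contributions = [X[:, groups == g] @ beta[groups == g] for g in group_ids]
    co_penalty = 0.0
    for i in range(len(group_ids)):
        for j in range(i + 1, len(group_ids)):
            diff = contributions[i] - contributions[j]
            co_penalty += similarity[i, j] * (diff @ diff) / n

    return (loss + lam_l1 * l1_penalty
            + lam_group * group_penalty + lam_co * co_penalty)


# Small usage example with two groups of two predictors each.
X_demo = np.eye(4)
y_demo = np.ones(4)
beta_demo = np.array([1.0, 0.0, 1.0, 0.0])
groups_demo = np.array([0, 0, 1, 1])
similarity_demo = np.array([[0.0, 1.0], [1.0, 0.0]])
value = co_regularized_sgl_objective(X_demo, y_demo, beta_demo,
                                     groups_demo, similarity_demo)
```

Under these assumptions, setting `lam_co` to zero recovers an ordinary sparse-group lasso objective, and setting `lam_group` to zero as well leaves the plain lasso, which is how the comparison methods named in the abstract relate to one another.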
Acknowledgments
This work was funded by TNO Early Research Program (ERP) “Making sense of big data”.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Amaral Santos, P.L., Imangaliyev, S., Schutte, K., Levin, E. (2016). Feature Selection via Co-regularized Sparse-Group Lasso. In: Pardalos, P., Conca, P., Giuffrida, G., Nicosia, G. (eds) Machine Learning, Optimization, and Big Data. MOD 2016. Lecture Notes in Computer Science, vol 10122. Springer, Cham. https://doi.org/10.1007/978-3-319-51469-7_10
DOI: https://doi.org/10.1007/978-3-319-51469-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51468-0
Online ISBN: 978-3-319-51469-7
eBook Packages: Computer Science, Computer Science (R0)