Feature Selection via Co-regularized Sparse-Group Lasso

Amaral Santos, Paula L.; Imangaliyev, Sultan; Schutte, Klamer; Levin, Evgeni

doi:10.1007/978-3-319-51469-7_10


Conference paper
First Online: 25 December 2016


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10122)

Abstract

We propose the co-regularized sparse-group lasso algorithm, a technique that incorporates auxiliary information into the learning task in the form of “groups” of predictors and “distances” among them. The algorithm is particularly suitable for a wide range of biological applications that require good predictive performance and where it is also important to retrieve all relevant predictors in order to deepen the understanding of the underlying biological process. Our cost function requires related groups of predictors to provide similar contributions to the final response and thus guides the feature selection process using auxiliary information. We evaluate the proposed method on a synthetic dataset and examine various settings in which it is beneficial compared with the standard lasso, elastic net, group lasso, and sparse-group lasso techniques. Finally, we make a Python implementation of our algorithm freely available for download (available at www.learning-machines.com).
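The abstract does not state the cost function itself, so the following is only a minimal sketch of what such an objective might look like, not the authors' implementation. It assumes a squared-error loss, the sparse-group lasso penalty in the form given by Simon et al. (2013), and a hypothetical co-regularization term that penalizes the squared difference between the partial predictions of groups marked as related. All function names, the `related` weight dictionary, and the parameters `lam`, `alpha`, and `gamma` are illustrative assumptions.

```python
import math


def group_contribution(X, beta, idx):
    """Partial prediction made by the predictors whose column indices are in idx."""
    return [sum(row[j] * beta[j] for j in idx) for row in X]


def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """L1 penalty on all coefficients plus a size-weighted L2 penalty per group."""
    l1 = sum(abs(b) for b in beta)
    group_l2 = sum(
        math.sqrt(len(idx)) * math.sqrt(sum(beta[j] ** 2 for j in idx))
        for idx in groups
    )
    return lam * (alpha * l1 + (1.0 - alpha) * group_l2)


def co_regularization_penalty(X, beta, groups, related):
    """Assumed form: weighted squared disagreement between related groups.

    related maps a pair of group indices (g, h) to a non-negative weight,
    standing in for the "distance" information mentioned in the abstract.
    """
    n = len(X)
    total = 0.0
    for (g, h), weight in related.items():
        cg = group_contribution(X, beta, groups[g])
        ch = group_contribution(X, beta, groups[h])
        total += weight * sum((a - b) ** 2 for a, b in zip(cg, ch))
    return total / n


def objective(X, y, beta, groups, related, lam, alpha, gamma):
    """Squared-error loss + sparse-group lasso penalty + co-regularization."""
    n = len(X)
    pred = group_contribution(X, beta, range(len(beta)))
    loss = sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / (2.0 * n)
    return (
        loss
        + sparse_group_lasso_penalty(beta, groups, lam, alpha)
        + gamma * co_regularization_penalty(X, beta, groups, related)
    )
```

Under these assumptions, setting `gamma` to zero recovers a plain sparse-group lasso objective, while a larger `gamma` pushes related groups toward similar contributions to the response. A proximal method would be one plausible way to minimize such an objective, but this sketch only evaluates it.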



Acknowledgments

This work was funded by the TNO Early Research Program (ERP) “Making sense of big data”.

Author information

Authors and Affiliations

TNO Research, The Hague, The Netherlands
Paula L. Amaral Santos, Sultan Imangaliyev, Klamer Schutte & Evgeni Levin


Corresponding author

Correspondence to Paula L. Amaral Santos.

Editor information

Editors and Affiliations

Department of Industrial and Systems Engineering, University of Florida, Gainesville, Florida, USA
Panos M. Pardalos
Semantic Technology Laboratory, National Research Council (CNR), Catania, Italy
Piero Conca
Dipartimento di Sociologia e Metodi della Ricerca Sociale, Università di Catania, Catania, Italy
Giovanni Giuffrida
Department of Mathematics and Computer Science, University of Catania, Catania, Italy
Giuseppe Nicosia


Cite this paper

Amaral Santos, P.L., Imangaliyev, S., Schutte, K., Levin, E. (2016). Feature Selection via Co-regularized Sparse-Group Lasso. In: Pardalos, P., Conca, P., Giuffrida, G., Nicosia, G. (eds) Machine Learning, Optimization, and Big Data. MOD 2016. Lecture Notes in Computer Science, vol 10122. Springer, Cham. https://doi.org/10.1007/978-3-319-51469-7_10


DOI: https://doi.org/10.1007/978-3-319-51469-7_10
Published: 25 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51468-0
Online ISBN: 978-3-319-51469-7
eBook Packages: Computer Science, Computer Science (R0)
