Abstract
Multiplicative update rules have proven useful in many areas of machine learning. Simple to implement and guaranteed to converge, they account in part for the widespread popularity of algorithms such as nonnegative matrix factorization and Expectation-Maximization. In this paper, we show how to derive multiplicative updates for problems in ℓ1-regularized linear and logistic regression. For ℓ1-regularized linear regression, the updates are derived by reformulating the required optimization as a problem in nonnegative quadratic programming (NQP). The dual of this problem, itself an instance of NQP, can also be solved using multiplicative updates; moreover, the observed duality gap can be used to bound the error of intermediate solutions. For ℓ1-regularized logistic regression, we derive similar updates using an iteratively reweighted least squares approach. We present illustrative experimental results and describe efficient implementations for large-scale problems of interest (e.g., with tens of thousands of examples and over one million features).
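The derivations appear in the body of the paper, but the core NQP multiplicative update is compact enough to sketch here. For the standard form min over v ≥ 0 of ½vᵀAv + bᵀv, the matrix A is split into nonnegative parts A⁺ = max(A, 0) and A⁻ = max(−A, 0), and each coordinate of v is rescaled by a closed-form nonnegative factor. The following is a minimal NumPy sketch under those assumptions; the function name, iteration count, and small denominator guard are illustrative, not from the paper:

```python
import numpy as np

def nqp_multiplicative(A, b, v0, n_iter=2000):
    """Minimize 0.5 * v^T A v + b^T v subject to v >= 0
    using elementwise multiplicative updates."""
    Ap = np.maximum(A, 0.0)   # A+ : nonnegative part of A
    An = np.maximum(-A, 0.0)  # A- : nonnegative part of -A
    v = np.asarray(v0, dtype=float).copy()
    for _ in range(n_iter):
        a = Ap @ v            # (A+ v)_i, elementwise positive for v > 0
        c = An @ v            # (A- v)_i
        # Multiplicative factor is nonnegative, so v stays feasible;
        # its fixed points satisfy the stationarity condition Av + b = 0
        # (or v_i = 0 on the boundary). Guard avoids division by zero.
        v *= (-b + np.sqrt(b * b + 4.0 * a * c)) / (2.0 * a + 1e-12)
    return v
```

At a fixed point with v > 0, the factor equals one exactly when (A⁺v)_i − (A⁻v)_i + b_i = 0, i.e. the gradient of the objective vanishes; coordinates driven toward the boundary simply shrink multiplicatively toward zero, which is how the updates preserve nonnegativity without explicit projection.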
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Sha, F., Park, Y.A., Saul, L.K. (2007). Multiplicative Updates for ℓ1-Regularized Linear and Logistic Regression. In: Berthold, M.R., Shawe-Taylor, J., Lavrač, N. (eds) Advances in Intelligent Data Analysis VII. IDA 2007. Lecture Notes in Computer Science, vol. 4723. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74825-0_2
Print ISBN: 978-3-540-74824-3
Online ISBN: 978-3-540-74825-0