Abstract
Hold-out and cross-validation are among the most useful methods for model selection and performance assessment of machine learning algorithms. In this paper, we present a computationally efficient algorithm for calculating the hold-out performance for sparse regularized least-squares (RLS) in the case where the method has already been trained with the whole training set. The computational complexity of performing the hold-out is O(|H|^3 + |H|^2 n), where |H| is the size of the hold-out set and n is the number of basis vectors. The algorithm can thus be used to calculate various types of cross-validation estimates efficiently. For example, when m is the number of training examples, the complexities of N-fold and leave-one-out cross-validation are O(m^3/N^2 + (m^2 n)/N) and O(mn), respectively. Further, since sparse RLS can be trained in O(mn^2) time for several regularization parameter values in parallel, the fast hold-out algorithm enables efficient selection of the optimal parameter value.
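As an illustration of the stated complexities, the sketch below shows one way such a fast hold-out can be realized for subset-of-regressors RLS. It is a reconstruction from the abstract alone, not the authors' published algorithm: the function names and cached quantities are hypothetical, and the shortcut is derived from the Woodbury matrix identity under the assumption that I - S below is invertible (which holds when the reduced training problem remains positive definite).

```python
import numpy as np

def train_sparse_rls(K_mb, K_bb, y, lam):
    # Hypothetical reconstruction, not the authors' code.
    # Subset-of-regressors RLS: the coefficient vector a minimizes
    # ||K_mb a - y||^2 + lam * a^T K_bb a, giving
    # a = C K_mb^T y with C = (K_mb^T K_mb + lam K_bb)^{-1}.
    # Training costs O(m n^2) for m examples and n basis vectors.
    C = np.linalg.inv(K_mb.T @ K_mb + lam * K_bb)  # n x n
    a = C @ (K_mb.T @ y)                           # n coefficients
    D = K_mb @ C                                   # m x n cache, O(m n^2)
    yhat = K_mb @ a                                # full-data predictions
    return a, D, yhat

def holdout_predictions(K_mb, D, yhat, y, H):
    # Predictions on the hold-out indices H of the model retrained
    # without H, recovered from the cached full-data solution via the
    # Woodbury identity in O(|H|^3 + |H|^2 n) time.
    S = D[H] @ K_mb[H].T                           # |H| x |H|, O(|H|^2 n)
    I = np.eye(len(H))
    return np.linalg.solve(I - S, yhat[H] - S @ y[H])  # O(|H|^3)

# Sanity check against explicit retraining without the hold-out set.
rng = np.random.default_rng(0)
m, n, lam = 50, 10, 1.0
X = rng.standard_normal((m, n))  # toy features; d = n keeps K_bb full rank
B = X[:n]                        # basis vectors = first n training examples
K_mb = X @ B.T                   # linear kernel for simplicity
K_bb = B @ B.T
y = rng.standard_normal(m)
H = np.array([3, 7, 19])

a, D, yhat = train_sparse_rls(K_mb, K_bb, y, lam)
fast = holdout_predictions(K_mb, D, yhat, y, H)

mask = np.ones(m, dtype=bool)
mask[H] = False
a_ref, _, _ = train_sparse_rls(K_mb[mask], K_bb, y[mask], lam)
slow = K_mb[H] @ a_ref
assert np.allclose(fast, slow)   # the shortcut matches explicit retraining
```

Caching D = K_mb C during the O(mn^2) training phase is what brings each subsequent hold-out down to O(|H|^3 + |H|^2 n), which is consistent with the N-fold and leave-one-out bounds quoted in the abstract.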
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pahikkala, T., Suominen, H., Boberg, J., Salakoski, T. (2009). Efficient Hold-Out for Subset of Regressors. In: Kolehmainen, M., Toivanen, P., Beliczynski, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2009. Lecture Notes in Computer Science, vol 5495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04921-7_36
DOI: https://doi.org/10.1007/978-3-642-04921-7_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04920-0
Online ISBN: 978-3-642-04921-7
eBook Packages: Computer Science, Computer Science (R0)