Abstract
Chapter 6 discussed regression models that were intrinsically linear. In this chapter we present regression models that are inherently nonlinear in nature. When using these models, the exact form of the nonlinearity does not need to be known explicitly or specified prior to model training. These models include neural networks (Section 7.1), multivariate adaptive regression splines (Section 7.2), support vector machines (Section 7.3), and K-nearest neighbors (Section 7.4). In the Computing Section (7.5) we demonstrate how to train each of these models in R. Finally, exercises are provided at the end of the chapter to solidify the concepts.
Notes
1. The penalty here is written as the reverse of ridge regression or weight decay in neural networks since it is attached to residuals and not the parameters.
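The contrast in this note can be sketched as follows, assuming it refers to the cost parameter of the support vector machine in Section 7.3 (the preview does not show where the note is anchored). The symbols $\lambda$, $C$, $L_{\epsilon}$, and $\beta_j$ are notation chosen for this sketch and may differ from the chapter's own.

```latex
% Ridge regression / weight decay: the tuning parameter scales the parameter penalty
\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{P} \beta_j^2

% Support vector regression: the tuning parameter scales the residual term
C \sum_{i=1}^{n} L_{\epsilon}(y_i - \hat{y}_i) + \sum_{j=1}^{P} \beta_j^2,
\qquad L_{\epsilon}(r) = \max(0, |r| - \epsilon)
```

Under this reading, a large $C$ plays the role of a small $\lambda$: dividing the second objective by $C$ leaves a parameter penalty weighted by $1/C$, so increasing $C$ lets the model follow the residuals more closely.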
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Kuhn, M., Johnson, K. (2013). Nonlinear Regression Models. In: Applied Predictive Modeling. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6849-3_7
DOI: https://doi.org/10.1007/978-1-4614-6849-3_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6848-6
Online ISBN: 978-1-4614-6849-3
eBook Packages: Mathematics and Statistics (R0)