Abstract
Machine learning is a vast and growing field. In this chapter, we cannot possibly survey the entire area, but we can provide some context and some connections to probability and statistics that should make it easier to think about machine learning and how to apply these methods to real-world problems. The fundamental problem of statistics is basically the same as that of machine learning: given some data, how do we make it actionable? For statistics, the answer is to construct analytic estimators using powerful theory. For machine learning, the answer is algorithmic prediction: given a dataset, what forward-looking inferences can we draw? There is a subtle point in this description: how can we know the future if all we have is data about the past? This is the crux of the matter for machine learning, as we will explore in this chapter.
Notes
1. This assumes that the hypothesis set is big enough to capture the entire training set (which it is for this example). We will discuss this trade-off in greater generality shortly.
2. This is a slight generalization of the classic coupon collector problem (a simulation sketch of the classic problem follows this list).
3. We have also set the random seed to a fixed value to make the figures reproducible in the Jupyter Notebook corresponding to this section.
4. At least up to a rotation of the resulting orthonormal basis.
5. At least up to a rotation of the resulting orthonormal basis.
6. We discussed the geometry of high-dimensional space when we covered the curse of dimensionality in the statistics chapter.
7. Note that these entries are constructed from the data using an estimator of the covariance matrix because we do not have the full probability densities at hand (an estimation sketch follows this list).
8. Note that we are using the init=random keyword argument for this example in order to illustrate this (a usage sketch follows this list).
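As a minimal sketch of the classic coupon collector problem mentioned in note 2 (the classic version, not the generalization used in the chapter), the following Python simulation estimates how many uniform draws are needed before all n distinct coupons have been seen at least once. The function name coupon_collector_draws and the choice n = 10 are illustrative assumptions, not taken from the chapter.

    import numpy as np

    def coupon_collector_draws(n, rng):
        # Draw uniformly from n distinct coupons until every coupon
        # has appeared at least once; return the number of draws.
        seen = set()
        draws = 0
        while len(seen) < n:
            seen.add(int(rng.integers(n)))
            draws += 1
        return draws

    rng = np.random.default_rng(0)  # fixed seed for reproducibility (see note 3)
    n = 10
    trials = [coupon_collector_draws(n, rng) for _ in range(5000)]
    print(np.mean(trials))  # close to n * (1 + 1/2 + ... + 1/n), about 29.3 for n = 10

The empirical mean approaches n times the n-th harmonic number, the classical expected waiting time.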
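For note 7, here is a minimal sketch of estimating a covariance matrix directly from data with numpy's np.cov, since the full probability densities are not available. The particular true covariance and sample size below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)  # fixed seed for reproducibility (see note 3)
    true_cov = np.array([[2.0, 0.8],
                         [0.8, 1.0]])
    # Samples from a zero-mean bivariate Gaussian with the covariance above
    X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=500)

    # Sample covariance estimated from the data (rows are observations)
    est_cov = np.cov(X, rowvar=False)
    print(est_cov)  # entries approximate true_cov, up to sampling error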
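Note 8 mentions an init=random keyword argument. As an assumption (the note does not name the estimator here), the following sketch uses scikit-learn's KMeans, which accepts init='random' to seed the centroids from randomly chosen data points instead of the default 'k-means++' initialization.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)  # fixed seed for reproducibility (see note 3)
    # Two well-separated blobs of two-dimensional points
    X = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
                   rng.normal(3.0, 0.5, size=(100, 2))])

    # init='random' draws the initial centroids at random from the data,
    # rather than using the default 'k-means++' seeding
    km = KMeans(n_clusters=2, init='random', n_init=10, random_state=0).fit(X)
    print(km.cluster_centers_)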
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Unpingco, J. (2019). Machine Learning. In: Python for Probability, Statistics, and Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-18545-9_4
DOI: https://doi.org/10.1007/978-3-030-18545-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18544-2
Online ISBN: 978-3-030-18545-9
eBook Packages: Engineering, Engineering (R0)