Abstract
Machine learning is a vast and growing field. In this chapter, we cannot possibly survey the entire area, but we can provide some context and some connections to probability and statistics that should make it easier to think about machine learning and how to apply these methods to real-world problems. The fundamental problem of statistics is basically the same as that of machine learning: given some data, how do we make it actionable? For statistics, the answer is to construct analytic estimators using powerful theory. For machine learning, the answer is algorithmic prediction: given a dataset, what forward-looking inferences can we draw? There is a subtle point in this description: how can we know the future if all we have is data about the past? This is the crux of the matter for machine learning, as we will explore in this chapter.
Notes
1. This assumes that the hypothesis set is big enough to capture the entire training set (which it is for this example). We will discuss this trade-off in greater generality shortly.
2. This is a slight generalization of the classic coupon collector problem (a simulation sketch of the classic problem follows this list).
3. We have also set the random seed to a fixed value to make the figures reproducible in the Jupyter Notebook corresponding to this section.
4. At least up to a rotation of the resulting orthonormal basis.
5. At least up to a rotation of the resulting orthonormal basis.
6. We discussed the geometry of high-dimensional space when we covered the curse of dimensionality in the statistics chapter.
7. Note that these entries are constructed from the data using an estimator of the covariance matrix because we do not have the full probability densities at hand (an estimation sketch follows this list).
8. Note that we are using the init=random keyword argument for this example in order to illustrate this (a usage sketch follows this list).
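As a minimal sketch of the classic coupon collector problem mentioned in note 2 (the classic version, not the generalization used in the chapter), the following Python simulation estimates how many uniform draws are needed before all n distinct coupons have been seen at least once. The function name coupon_collector_draws and the choice n = 10 are illustrative assumptions, not taken from the chapter.

    import numpy as np

    def coupon_collector_draws(n, rng):
        # Draw uniformly from n distinct coupons until every coupon
        # has appeared at least once; return the number of draws.
        seen = set()
        draws = 0
        while len(seen) < n:
            seen.add(int(rng.integers(n)))
            draws += 1
        return draws

    rng = np.random.default_rng(0)  # fixed seed for reproducibility (see note 3)
    n = 10
    trials = [coupon_collector_draws(n, rng) for _ in range(5000)]
    print(np.mean(trials))  # close to n * (1 + 1/2 + ... + 1/n), about 29.3 for n = 10

The empirical mean approaches n times the n-th harmonic number, the classical expected waiting time.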
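For note 7, here is a minimal sketch of estimating a covariance matrix directly from data with numpy's np.cov, since the full probability densities are not available. The particular true covariance and sample size below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)  # fixed seed for reproducibility (see note 3)
    true_cov = np.array([[2.0, 0.8],
                         [0.8, 1.0]])
    # Samples from a zero-mean bivariate Gaussian with the covariance above
    X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=500)

    # Sample covariance estimated from the data (rows are observations)
    est_cov = np.cov(X, rowvar=False)
    print(est_cov)  # entries approximate true_cov, up to sampling error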
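Note 8 mentions an init=random keyword argument. As an assumption (the note does not name the estimator here), the following sketch uses scikit-learn's KMeans, which accepts init='random' to seed the centroids from randomly chosen data points instead of the default 'k-means++' initialization.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)  # fixed seed for reproducibility (see note 3)
    # Two well-separated blobs of two-dimensional points
    X = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
                   rng.normal(3.0, 0.5, size=(100, 2))])

    # init='random' draws the initial centroids at random from the data,
    # rather than using the default 'k-means++' seeding
    km = KMeans(n_clusters=2, init='random', n_init=10, random_state=0).fit(X)
    print(km.cluster_centers_)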
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Unpingco, J. (2019). Machine Learning. In: Python for Probability, Statistics, and Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-18545-9_4
DOI: https://doi.org/10.1007/978-3-030-18545-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18544-2
Online ISBN: 978-3-030-18545-9
eBook Packages: Engineering, Engineering (R0)