Linear Algebra and Optimization: An Introduction



Machine learning builds mathematical models from data containing multiple attributes (i.e., variables) in order to predict some variables from others. For example, in a cancer prediction application, each data point might contain the results of clinical tests run on a patient, whereas the predicted variable might be a binary diagnosis of cancer. Such models are often expressed as linear or nonlinear relationships between variables. These relationships are discovered in a data-driven manner by maximizing the “agreement” between the models and the observed data; this is an optimization problem.
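As a minimal illustrative sketch of this idea (not drawn from the text itself), consider the simplest case: a linear relationship between the attributes and the predicted variable. Maximizing the agreement between model and data can be phrased as minimizing the squared disagreement, which gradient descent can solve. All names and values below are hypothetical:

```python
import numpy as np

# Hypothetical setup: observations y are (approximately) a linear
# function X @ w_true of the attributes, plus a small amount of noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # 100 data points, 3 attributes
w_true = np.array([2.0, -1.0, 0.5])            # unknown relationship to recover
y = X @ w_true + 0.01 * rng.normal(size=100)   # observed data

# Gradient descent on the loss L(w) = ||X w - y||^2 / n, i.e., minimizing
# the average squared disagreement between the model and the data.
w = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = 2.0 * X.T @ (X @ w - y) / len(y)    # gradient of L at w
    w -= lr * grad                             # step downhill

print(w)  # should be close to w_true
```

The loop recovers a vector close to `w_true`, showing how a data-driven optimization procedure discovers the underlying linear relationship.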



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. IBM T.J. Watson Research Center, Yorktown Heights, USA
