Some Notes on Applied Mathematics for Machine Learning

  • Christopher J. C. Burges
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3176)


This chapter describes Lagrange multipliers and selected topics from matrix analysis, both from a machine learning perspective. The goal is to give a detailed description of several mathematical constructions that are widely used in applied machine learning.


Machine Learn, Lagrange Multiplier, Distance Matrix, Maximum Entropy, Null Space
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Christopher J. C. Burges, Microsoft Research, Redmond, USA
