Semi-supervised Linear Discriminant Analysis Using Moment Constraints

  • Marco Loog
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7081)


A semi-supervised version of Fisher’s linear discriminant analysis is presented. As opposed to virtually all other approaches to semi-supervision, no assumptions on the data distribution are made, apart from the ones explicitly or implicitly present in standard supervised learning. Our approach exploits the fact that the parameters that are to be estimated in linear discriminant analysis fulfill particular relations that link label-dependent with label-independent quantities. In this way, the latter type of parameters, which can be estimated based on unlabeled data, impose constraints on the former and lead to a reduction in variability of the label-dependent estimates. As a result, the performance of our semi-supervised linear discriminant is expected to improve over that of its supervised counterpart and typically does not deteriorate with an increasing amount of unlabeled data.
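The abstract does not spell out which relations are meant. As a minimal sketch, assume they are the standard moment identities of discriminant analysis: the overall mean equals the prior-weighted sum of the class means, and the total covariance equals the within-class plus the between-class covariance. The Python below only checks these identities on a labeled sample; it is not the paper's algorithm, and the function names (`mean`, `scatter`, `moment_relations`) are hypothetical.

```python
import random


def mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, center):
    """Average outer product of (row - center): a biased covariance estimate."""
    d, n = len(center), len(rows)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - c for x, c in zip(r, center)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j] / n
    return s


def moment_relations(X, y):
    """Return label-independent and label-dependent moments of a labeled sample.

    Label-independent: total mean and total covariance (no labels needed).
    Label-dependent: prior-weighted class means, within- and between-class
    covariance (labels needed).
    """
    n = len(X)
    total_mean = mean(X)
    total_cov = scatter(X, total_mean)
    d = len(total_mean)
    weighted_mean = [0.0] * d
    within = [[0.0] * d for _ in range(d)]
    between = [[0.0] * d for _ in range(d)]
    for k in sorted(set(y)):
        Xk = [x for x, label in zip(X, y) if label == k]
        prior = len(Xk) / n          # estimated class prior
        mk = mean(Xk)                # class mean
        Wk = scatter(Xk, mk)         # class covariance
        for i in range(d):
            weighted_mean[i] += prior * mk[i]
            for j in range(d):
                within[i][j] += prior * Wk[i][j]
                between[i][j] += (
                    prior * (mk[i] - total_mean[i]) * (mk[j] - total_mean[j])
                )
    return total_mean, weighted_mean, total_cov, within, between


if __name__ == "__main__":
    random.seed(0)
    X = [[random.gauss(c, 1.0), random.gauss(-c, 1.0)] for c in (0, 1, 2) for _ in range(10)]
    y = [c for c in (0, 1, 2) for _ in range(10)]
    m, wm, T, W, B = moment_relations(X, y)
    print("mean gap:", max(abs(a - b) for a, b in zip(m, wm)))
    print("cov gap: ", max(abs(T[i][j] - W[i][j] - B[i][j]) for i in range(2) for j in range(2)))
```

On a single labeled sample both gaps are zero up to rounding. Once the total mean and total covariance are instead estimated from a larger pool that includes unlabeled data, the labeled-sample estimates generally no longer satisfy the identities; under the assumption above, requiring them to hold again is the kind of constraint the abstract describes.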


Keywords: Linear Discriminant Analysis; Unlabeled Data; Unlabeled Instance; Moment Constraint; Semisupervised Learning
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Marco Loog, Pattern Recognition Laboratory, Delft University of Technology, The Netherlands
