Abstract
Semi-supervised learning has been widely studied in the literature. However, most previous work assumes that the output structure is simple enough to permit the direct use of tractable inference and learning algorithms (e.g., binary labels or linear chains), so these methods cannot be applied to problems with complex structure. In this paper, we propose an approximate semi-supervised learning method that uses piecewise training to estimate the model weights and a dual decomposition approach to solve the inference problem of finding the labels of unlabeled data subject to domain-specific constraints. This allows us to extend semi-supervised learning to general structured prediction problems. As an example, we apply this approach to multi-label classification, modeled as a fully connected pairwise Markov random field. Experimental results on benchmark data show that, despite the approximations, the approach is effective and improves generalization performance over the plain supervised method. In addition, we demonstrate that our inference engine can be applied to other semi-supervised learning frameworks, extending them to problems with complex structure.
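To make the dual decomposition idea concrete, the following is a minimal sketch of MAP inference over binary labels in a fully connected pairwise Markov random field. It is not the authors' implementation: the split into one subproblem per edge, the projected subgradient update, the step-size schedule, and all names (`score`, `brute_force_map`, `dual_decomposition_map`, `unary`, `pair`) are assumptions chosen for illustration, and the domain-specific constraints on unlabeled data mentioned in the abstract are not modeled.

```python
import itertools
import numpy as np


def score(y, unary, pair):
    """Total score of labeling y: node terms plus terms for every pair i < j."""
    n = len(y)
    s = sum(unary[i, y[i]] for i in range(n))
    s += sum(pair[i, j, y[i], y[j]] for i in range(n) for j in range(i + 1, n))
    return float(s)


def brute_force_map(unary, pair):
    """Exact MAP by enumeration; only feasible for a handful of labels."""
    n = unary.shape[0]
    return max(itertools.product([0, 1], repeat=n),
               key=lambda y: score(y, unary, pair))


def dual_decomposition_map(unary, pair, iters=200, step0=1.0):
    """Approximate MAP via edge-wise dual decomposition (illustrative sketch).

    unary: array (n, 2); pair: array (n, n, 2, 2), only entries with i < j used.
    Returns (best labeling found, its score, best upper bound on the MAP score).
    """
    n = unary.shape[0]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    deg = n - 1  # every node touches n - 1 edges in a fully connected graph
    # lam[e, 0] / lam[e, 1]: multipliers for the first / second node of edge e.
    # For each node they sum to zero over its edges, so the dual is an upper bound.
    lam = np.zeros((len(edges), 2, 2))
    best_y, best_primal, best_dual = None, -np.inf, np.inf
    for t in range(iters):
        counts = np.zeros((n, 2))
        picks = np.zeros((len(edges), 2), dtype=int)
        dual = 0.0
        for e, (i, j) in enumerate(edges):
            # Each edge owns an equal share of its endpoints' node scores.
            table = (pair[i, j]
                     + (unary[i] / deg + lam[e, 0])[:, None]
                     + (unary[j] / deg + lam[e, 1])[None, :])
            a, b = np.unravel_index(np.argmax(table), table.shape)
            dual += float(table[a, b])
            picks[e] = (a, b)
            counts[i, a] += 1
            counts[j, b] += 1
        best_dual = min(best_dual, dual)
        # Decode a single labeling by majority vote over the edge copies.
        y = counts.argmax(axis=1)
        primal = score(y, unary, pair)
        if primal > best_primal:
            best_y, best_primal = y, primal
        if np.all(counts.max(axis=1) == deg):
            break  # all copies agree, so the decoded labeling is optimal
        # Projected subgradient step: push each copy toward the node's average.
        step = step0 / np.sqrt(1.0 + t)
        mean = counts / deg
        for e, (i, j) in enumerate(edges):
            a, b = picks[e]
            g_i = -mean[i]
            g_i[a] += 1.0
            g_j = -mean[j]
            g_j[b] += 1.0
            lam[e, 0] -= step * g_i
            lam[e, 1] -= step * g_j
    return best_y, best_primal, best_dual


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5
    unary = rng.normal(size=(n, 2))
    pair = rng.normal(size=(n, n, 2, 2))
    y, primal, dual = dual_decomposition_map(unary, pair)
    exact = score(brute_force_map(unary, pair), unary, pair)
    print("decoded:", y, "primal:", primal, "dual bound:", dual, "exact:", exact)
```

In this sketch the returned dual value always upper-bounds the exact MAP score and the decoded labeling always lower-bounds it, so the gap between the two indicates how far the approximation may be from optimal. Edge-wise subproblems are only one possible decomposition; larger tractable pieces such as trees typically give tighter bounds.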
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chang, K.W., Sundararajan, S., Keerthi, S.S. (2013). Tractable Semi-supervised Learning of Complex Structured Prediction Models. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2013. Lecture Notes in Computer Science, vol 8190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40994-3_12
DOI: https://doi.org/10.1007/978-3-642-40994-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40993-6
Online ISBN: 978-3-642-40994-3
eBook Packages: Computer Science (R0)