Abstract
Conditional random fields are among the state-of-the-art approaches to structured output prediction, and the model has been adopted for various real-world problems. Supervised classification is costly because labeled data are usually expensive to produce. Unlabeled data are relatively cheap, but it is not obvious how to use them. Unlabeled data can be used to estimate the marginal probability of observations, and we exploit this idea in our work.
Introducing unlabeled data, and the probability of observations, into a purely discriminative model is a challenging task.
We consider an extrapolation of a recently proposed semi-supervised criterion to the model of conditional random fields and show its drawbacks. We discuss an alternative use of the marginal probability and propose a pool-based active learning approach based on quota sampling. We carry out experiments on synthetic data as well as on standard natural language data sets, and we show that the proposed quota sampling active learning method is efficient.
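As a rough illustration of the general idea, the sketch below estimates the marginal probability of observations from an unlabeled pool and then draws a fixed quota of candidates from each probability stratum. The abstract does not specify how strata or quotas are defined, so the function names (`estimate_marginals`, `quota_sample`), the rank-based stratification, and the parameters `n_strata` and `quota` are all assumptions made for illustration, not the paper's method.

```python
import random
from collections import Counter


def estimate_marginals(pool):
    """Empirical frequency of each observation in the unlabeled pool."""
    counts = Counter(pool)
    total = len(pool)
    return {x: c / total for x, c in counts.items()}


def quota_sample(pool, n_strata, quota, seed=0):
    """Pick up to `quota` distinct observations from each stratum.

    Strata are contiguous groups of the distinct observations, ranked
    from most to least frequent (an assumed, hypothetical scheme).
    """
    rng = random.Random(seed)
    marginals = estimate_marginals(pool)
    # Rank distinct observations by decreasing estimated probability;
    # ties are broken by the observation itself for determinism.
    ranked = sorted(marginals, key=lambda x: (-marginals[x], x))
    # Ceiling division gives the number of observations per stratum.
    size = -(-len(ranked) // n_strata)
    strata = [ranked[i:i + size] for i in range(0, len(ranked), size)]
    selected = []
    for stratum in strata:
        k = min(quota, len(stratum))
        selected.extend(rng.sample(stratum, k))
    return selected


# Toy pool of observations; in practice these would be sequences.
pool = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]
to_label = quota_sample(pool, n_strata=2, quota=1)
```

In a pool-based active learning loop, the items in `to_label` would be sent to an annotator and added to the labeled set before the model is retrained. Sampling a quota per stratum, rather than uniformly, is one way to keep both frequent and rare observations represented.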
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sokolovska, N. (2011). Aspects of Semi-supervised and Active Learning in Conditional Random Fields. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6913. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23808-6_18
DOI: https://doi.org/10.1007/978-3-642-23808-6_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23807-9
Online ISBN: 978-3-642-23808-6
eBook Packages: Computer Science (R0)