Abstract
Conditional Random Fields (CRFs) have received a great deal of attention in many fields and have achieved good results. In practice, however, the test data frequently come from a different domain than the training data, which degrades the performance of CRFs. This paper presents a novel technique for maximum a posteriori (MAP) adaptation of Conditional Random Field models. A background model trained on data from one domain can be adapted well to a new domain using a small amount of labeled domain-specific data. Experimental results on chunking and capitalization tasks show that this technique can significantly improve performance on out-of-domain data. In the chunking task, the adaptation technique gives a relative improvement of 56.9%. Even with only two in-domain sentences, it achieves a 30.2% relative improvement.
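The abstract does not spell out the form of the prior, so the following is only a minimal sketch of one common way MAP adaptation is formulated for log-linear models: a Gaussian prior centered on the background model's weights, so that the adapted weights minimize the in-domain negative log-likelihood plus a penalty for drifting away from the background. Everything here is an assumption for illustration: the binary logistic model is a stand-in for the CRF likelihood, and the names `neg_log_likelihood`, `map_adapt`, `sigma2`, `lr`, and `steps` are hypothetical, not taken from the paper.

```python
import math

def neg_log_likelihood(weights, data):
    """Return (NLL, gradient) of a binary logistic model.

    Assumption: this simple log-linear model stands in for the CRF
    likelihood, which would be computed with forward-backward instead.
    """
    nll = 0.0
    grad = [0.0] * len(weights)
    for features, label in data:
        score = sum(w * x for w, x in zip(weights, features))
        prob = 1.0 / (1.0 + math.exp(-score))
        nll -= math.log(prob if label == 1 else 1.0 - prob)
        for i, x in enumerate(features):
            grad[i] += (prob - label) * x
    return nll, grad

def map_adapt(background, data, sigma2=1.0, lr=0.1, steps=500):
    """Gradient descent on NLL(w) + ||w - background||^2 / (2 * sigma2).

    The quadratic term is the negative log of a Gaussian prior centered
    on the background weights (an assumed prior, not the paper's stated one).
    """
    weights = list(background)
    for _ in range(steps):
        _, grad = neg_log_likelihood(weights, data)
        for i in range(len(weights)):
            # Prior gradient pulls each weight back toward the background value.
            grad[i] += (weights[i] - background[i]) / sigma2
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

# Hypothetical background weights and two labeled in-domain examples.
background = [2.0, -1.0, 0.5]
in_domain = [([1.0, 0.0, 0.0], 0), ([0.0, 1.0, 0.0], 1)]
adapted = map_adapt(background, in_domain)
```

In this sketch, a feature that never fires in the in-domain data (the third one above) keeps its background weight, while weights for observed features move toward the new domain. A smaller `sigma2` keeps the adapted model closer to the background model, which is one way to think about adapting from very little labeled data.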
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Q., Qiu, X., Huang, X., Wu, L. (2008). Domain Adaptation for Conditional Random Fields. In: Li, H., Liu, T., Ma, WY., Sakai, T., Wong, KF., Zhou, G. (eds) Information Retrieval Technology. AIRS 2008. Lecture Notes in Computer Science, vol 4993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68636-1_19
DOI: https://doi.org/10.1007/978-3-540-68636-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68633-0
Online ISBN: 978-3-540-68636-1
eBook Packages: Computer Science, Computer Science (R0)