Abstract
We tackled the task of SuperSense tagging with the Tanl Tagger, a generic, flexible and customizable sequence labeler developed as part of the Tanl linguistic pipeline. The tagger can be configured to use different classifiers and to extract features according to feature templates expressed as patterns, so it can be adapted to different tagging tasks, including PoS and Named Entity tagging. The tagger operates as a Markov chain, using a statistical classifier to estimate state transitions and dynamic programming to select the best overall sequence of tags. We exploited these customization capabilities to tune the tagger for SuperSense tagging through an extensive process of feature selection. The resulting configuration achieved the best scores in the closed subtask.
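The decoding scheme described in the abstract (a classifier that scores each tag given the previous tag, plus dynamic programming over the whole sentence) can be illustrated with a minimal Viterbi sketch. This is not the Tanl Tagger's code or API: the function names `viterbi_memm` and `toy_local_prob`, the tag set, and the hand-written probabilities are all hypothetical, and the toy scorer merely stands in for a trained statistical classifier.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical tag set: BIO-style labels for a single supersense.
TAGS = ["O", "B-noun.person", "I-noun.person"]


def toy_local_prob(prev_tag: str, tokens: Sequence[str], i: int) -> Dict[str, float]:
    """Stand-in for a trained classifier: P(tag | previous tag, tokens, position).

    The numbers are invented for illustration only.
    """
    if tokens[i][0].isupper():
        if prev_tag in ("B-noun.person", "I-noun.person"):
            # A capitalized token after a person tag likely continues the name.
            return {"I-noun.person": 0.7, "B-noun.person": 0.2, "O": 0.1}
        return {"B-noun.person": 0.7, "O": 0.25, "I-noun.person": 0.05}
    return {"O": 0.9, "B-noun.person": 0.05, "I-noun.person": 0.05}


def viterbi_memm(
    tokens: Sequence[str],
    tags: Sequence[str],
    local_prob: Callable[[str, Sequence[str], int], Dict[str, float]],
    start: str = "<s>",
) -> List[str]:
    """Return the highest-scoring tag sequence for a first-order chain."""
    if not tokens:
        return []

    def logp(dist: Dict[str, float], tag: str) -> float:
        # Floor tiny or missing probabilities so every score stays finite.
        return math.log(max(dist.get(tag, 0.0), 1e-12))

    # best[i][t]: best log-score of any tag path that ends in tag t at position i.
    first = local_prob(start, tokens, 0)
    best = [{t: logp(first, t) for t in tags}]
    back: List[Dict[str, str]] = [{}]

    for i in range(1, len(tokens)):
        best.append({})
        back.append({})
        for prev in tags:
            dist = local_prob(prev, tokens, i)
            for t in tags:
                score = best[i - 1][prev] + logp(dist, t)
                if t not in best[i] or score > best[i][t]:
                    best[i][t] = score
                    back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi_memm(["Maria", "Rossi", "canta"], TAGS, toy_local_prob))
```

Working in log space avoids numerical underflow on long sentences, and the search costs time proportional to the sentence length times the square of the number of tags. In a real tagger the toy scorer would be replaced by a classifier trained on features extracted from the configured templates.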
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Attardi, G., Baronti, L., Dei Rossi, S., Simi, M. (2013). SuperSense Tagging with a Maximum Entropy Markov Model. In: Magnini, B., Cutugno, F., Falcone, M., Pianta, E. (eds) Evaluation of Natural Language and Speech Tools for Italian. EVALITA 2012. Lecture Notes in Computer Science, vol 7689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35828-9_20
DOI: https://doi.org/10.1007/978-3-642-35828-9_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35827-2
Online ISBN: 978-3-642-35828-9
eBook Packages: Computer Science (R0)