SuperSense Tagging with a Maximum Entropy Markov Model

  • Conference paper
Evaluation of Natural Language and Speech Tools for Italian (EVALITA 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7689)

Abstract

We tackled the task of SuperSense tagging with the Tanl Tagger, a generic, flexible and customizable sequence labeler developed as part of the Tanl linguistic pipeline. The tagger can be configured to use different classifiers and to extract features according to feature templates expressed through patterns, so it can be adapted to different tagging tasks, including PoS and Named Entity tagging. It operates as a Markov chain, using a statistical classifier to infer state transitions and dynamic programming to select the best overall sequence of tags. We exploited the tagger's customization capabilities to tune it for SuperSense tagging through an extensive process of feature selection. The resulting configuration achieved the best scores in the closed subtask.
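
The decoding step described in the abstract (a local classifier scoring each tag transition, with dynamic programming picking the best overall tag sequence) can be illustrated with a minimal Viterbi sketch. This is not the Tanl Tagger's implementation: the `LocalModel` interface, the `toy_model` probabilities, and the three-tag inventory are assumptions made up for illustration, and a real maximum entropy classifier would compute the distributions from extracted features.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical interface (not the Tanl API): given the previous tag, the
# sentence, and a position, return P(tag | previous tag, observations).
LocalModel = Callable[[str, Sequence[str], int], Dict[str, float]]


def safe_log(p: float) -> float:
    """Log probability that maps zero to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi_decode(tokens: Sequence[str], tags: Sequence[str],
                   model: LocalModel, start: str = "<s>") -> List[str]:
    """Return the highest-scoring tag sequence under the local model."""
    if not tokens:
        return []
    first = model(start, tokens, 0)
    # best[i][t]: best log score of any tag sequence ending in t at position i
    best = [{t: safe_log(first.get(t, 0.0)) for t in tags}]
    back: List[Dict[str, str]] = []  # back[i - 1][t]: best previous tag for t at i
    for i in range(1, len(tokens)):
        # One classifier call per candidate previous tag.
        dists = {p: model(p, tokens, i) for p in tags}
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + safe_log(dists[p].get(t, 0.0)))
            scores[t] = best[-1][prev] + safe_log(dists[prev].get(t, 0.0))
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


# Illustrative tag set and hand-written probabilities (assumptions, not
# values from the paper).
TAGS = ["O", "B-noun.person", "I-noun.person"]


def toy_model(prev: str, tokens: Sequence[str], i: int) -> Dict[str, float]:
    if tokens[i][0].isupper():
        if prev in ("B-noun.person", "I-noun.person"):
            return {"I-noun.person": 0.7, "B-noun.person": 0.1, "O": 0.2}
        return {"B-noun.person": 0.7, "I-noun.person": 0.0, "O": 0.3}
    return {"O": 0.9, "B-noun.person": 0.05, "I-noun.person": 0.05}


decoded = viterbi_decode(["Mario", "Rossi", "dorme"], TAGS, toy_model)
```

Working in log space avoids numeric underflow on long sentences, and conditioning each distribution on the previous tag is what lets the decoder prefer a continuation tag after a begin tag.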

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Attardi, G., Baronti, L., Dei Rossi, S., Simi, M. (2013). SuperSense Tagging with a Maximum Entropy Markov Model. In: Magnini, B., Cutugno, F., Falcone, M., Pianta, E. (eds) Evaluation of Natural Language and Speech Tools for Italian. EVALITA 2012. Lecture Notes in Computer Science, vol 7689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35828-9_20

  • DOI: https://doi.org/10.1007/978-3-642-35828-9_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35827-2

  • Online ISBN: 978-3-642-35828-9

  • eBook Packages: Computer Science, Computer Science (R0)
