Improving the RACAI Neural Network MSD Tagger

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 383)

Abstract

Part-of-speech (POS) tagging is a key process for many natural language processing tasks, in which each word of a sentence is assigned a uniquely interpretable label (called a POS tag). Many methodologies have been proposed for this task, such as Hidden Markov Models, Conditional Random Fields and Maximum Entropy classifiers. Such methods are primarily intended for English which, in comparison to highly inflectional languages, has a relatively small tagset inventory. One of the well-known methods for large tagset labeling (where the labels are referred to as morpho-syntactic descriptors, or MSDs) is Tiered Tagging (Tufiş, 1999; Tufiş and Dragomirescu, 2004). It exploits a reduced set of tags, from which context-irrelevant features (e.g. gender information) that can be deduced through the inflectional analysis of the word form are stripped. In our previous work we presented an alternative to Tiered Tagging, in which we performed multi-class classification with a feed-forward neural network. Our methodology has the advantage that it does not require the extensive linguistic knowledge implied by the previously mentioned approach. We extend our work by testing our tool on Czech and by successfully experimenting with a genetic algorithm designed to find a better network topology.
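The tag-reduction idea behind Tiered Tagging, as summarized in the abstract, can be illustrated with a small sketch. This is not the authors' implementation: the MSD strings, the position of the gender attribute, the toy lexicon, and the function names `reduce_msd` and `recover_msd` are all assumptions made for illustration only.

```python
from typing import Set

# Toy illustration of the tiered-tagging idea. All tags, the attribute
# position and the lexicon below are hypothetical, not taken from the paper.

GENDER_POS = 2  # assumed index of the gender attribute in a noun MSD


def reduce_msd(msd: str) -> str:
    """Strip the (assumed) gender attribute from a noun MSD."""
    if msd.startswith("N") and len(msd) > GENDER_POS:
        return msd[:GENDER_POS] + msd[GENDER_POS + 1:]
    return msd  # non-noun tags are left unchanged in this sketch


# Hypothetical lexicon: word form -> full MSDs the form can carry.
LEXICON = {
    "casa": {"Ncfsry"},
    "casei": {"Ncfsoy"},
    "lup": {"Ncmsrn"},
}


def recover_msd(word: str, reduced: str) -> Set[str]:
    """Return the full MSDs of `word` that are compatible with a reduced tag."""
    return {m for m in LEXICON.get(word, set()) if reduce_msd(m) == reduced}
```

In this sketch a tagger would only need to predict the reduced tag; the stripped attribute is then looked up from the word form, which is the part that requires lexical and linguistic resources.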

References

  1. Berger, A.L., Pietra, V.J.D., Pietra, S.A.D.: A maximum entropy approach to natural language processing. Computational Linguistics 22(1), 39–71 (1996)

  2. Boros, T., Ion, R., Tufiş, D.: Large tagset labeling using Feed Forward Neural Networks. Case study on Romanian Language. Accepted for publication in ACL, Sofia, Bulgaria (2013)

  3. Brants, T.: TnT: a statistical part-of-speech tagger. In: Proceedings of the Sixth Conference on Applied Natural Language Processing, pp. 224–231. Association for Computational Linguistics (2000)

  4. Calzolari, N., Monachini, M. (eds.): Common Specifications and Notation for Lexicon Encoding and Preliminary Proposal for the Tagsets. MULTEXT Report (March 1995)

  5. Ceausu, A.: Maximum entropy tiered tagging. In: Proceedings of the 11th ESSLLI Student Session, pp. 173–179 (2006)

  6. Erjavec, T., Monachini, M. (eds.): Specifications and Notation for Lexicon Encoding. Deliverable D1.1 F. Multext-East Project COP-106 (1997)

  7. Fischer, M.M., Leung, Y.: A genetic-algorithms based evolutionary computational neural network for modelling spatial interaction data. The Annals of Regional Science 32(3), 437–458 (1998)

  8. Fiszelew, A., Britos, P., Ochoa, A., Merlino, H., Fernández, E., García-Martínez, R.: Finding optimal neural network architecture using genetic algorithms. Adv. Comput. Sci. Eng. Res. Comput. Sci. 27 (2007)

  9. Lafferty, J., McCallum, A., Pereira, F.C.: Conditional random fields: Probabilistic models for segmenting and labeling sequence data (2001)

  10. Marques, N.C., Lopes, G.P.: A neural network approach to part-of-speech tagging. In: Proceedings of the 2nd Meeting for Computational Processing of Spoken and Written Portuguese, pp. 21–22 (1996)

  11. Ratnaparkhi, A.: A maximum entropy model for part-of-speech tagging. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, vol. 1, pp. 133–142 (1996)

  12. Samuelsson, C.: Morphological tagging based entirely on Bayesian inference. In: 9th Nordic Conference on Computational Linguistics (June 1993)

  13. Schmid, H.: Part-of-speech tagging with neural networks. In: Proceedings of the 15th Conference on Computational Linguistics, vol. 1, pp. 172–176. Association for Computational Linguistics (August 1994)

  14. Schaffer, J.D., Whitley, D., Eshelman, L.J.: Combinations of genetic algorithms and neural networks: A survey of the state of the art. In: International Workshop on Combinations of Genetic Algorithms and Neural Networks, COGANN 1992, pp. 1–37. IEEE (June 1992)

  15. Tufiş, D., Barbu, A.M., Pătraşcu, V., Rotariu, G., Popescu, C.: Corpora and Corpus-Based Morpho-Lexical Processing. In: Recent Advances in Romanian Language Technology, pp. 35–56. Romanian Academy Publishing House (1997) ISBN 973-27-0626-0

  16. Tufiş, D.: Tiered tagging and combined language models classifiers. In: Matoušek, V., Mautner, P., Ocelíková, J., Sojka, P. (eds.) TSD 1999. LNCS (LNAI), vol. 1692, pp. 28–33. Springer, Heidelberg (1999)

  17. Tufiş, D., Dragomirescu, L.: Tiered tagging revisited. In: Proceedings of the 4th LREC Conference (2004)

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Boroş, T., Dumitrescu, S.D. (2013). Improving the RACAI Neural Network MSD Tagger. In: Iliadis, L., Papadopoulos, H., Jayne, C. (eds) Engineering Applications of Neural Networks. EANN 2013. Communications in Computer and Information Science, vol 383. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41013-0_5

  • DOI: https://doi.org/10.1007/978-3-642-41013-0_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41012-3

  • Online ISBN: 978-3-642-41013-0

  • eBook Packages: Computer Science, Computer Science (R0)
