Language Resources and Evaluation, Volume 49, Issue 1, pp 215–226

A novel method for performance evaluation of text chunking

  • Suchismita Maiti
  • Utpal Garain
  • Arnab Dhar
  • Sankar De
Project Notes


Evaluation of text chunking is revisited. The proposed method analyzes the errors made by a chunker and formulates an evaluation strategy that brings out the strengths and weaknesses of a chunker better than the existing precision, recall and F-score based methods or their variants do. A tree-matching algorithm of linear time complexity is designed, analyzed, and illustrated with examples. The correctness of the algorithm is checked using a chunker and a set of test sentences.
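For context, the precision, recall and F-score baseline that the abstract contrasts against can be sketched as below. This is a minimal illustration of exact-match chunk scoring over BIO tags in the style of the CoNLL-2000 shared task, not the tree-matching method proposed in the paper. The function names `extract_chunks` and `chunk_prf`, the lenient handling of a stray `I-` tag, and the example tag sequences are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# A chunk is (type, start index, end index exclusive). Hypothetical representation.
Chunk = Tuple[str, int, int]


def extract_chunks(tags: List[str]) -> Set[Chunk]:
    """Collect chunks from a BIO tag sequence such as ["B-NP", "I-NP", "O"]."""
    chunks: Set[Chunk] = set()
    start, ctype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes the last chunk
        if tag == "O":
            prefix, label = "O", None
        else:
            prefix, _, label = tag.partition("-")
        # Close the open chunk unless this tag continues it (I- with the same type).
        if ctype is not None and (prefix != "I" or label != ctype):
            chunks.add((ctype, start, i))
            start, ctype = None, None
        # Open a new chunk on B-, or on a stray I- (lenient choice, an assumption).
        if prefix == "B" or (prefix == "I" and ctype is None):
            start, ctype = i, label
    return chunks


def chunk_prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1: a chunk counts only if type and span agree."""
    g, p = extract_chunks(gold), extract_chunks(pred)
    correct = len(g & p)
    precision = correct / len(p) if p else 0.0
    recall = correct / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: the predicted tags split the last noun phrase in two.
gold_tags = ["B-NP", "I-NP", "B-VP", "B-NP", "I-NP", "O"]
pred_tags = ["B-NP", "I-NP", "B-VP", "B-NP", "B-NP", "O"]
precision, recall, f1 = chunk_prf(gold_tags, pred_tags)
print(precision, recall, f1)
```

In this example one boundary mistake costs both precision and recall, and the scores alone do not say what kind of error occurred. That loss of detail is the sort of limitation an error-analysis based evaluation aims to address.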


Keywords: Text chunking; Performance evaluation; CoNLL 2000; PARSEVAL; Tree matching; Indic languages



The authors sincerely thank the anonymous reviewers of this paper. We also express our gratitude to one of the reviewers, who appreciated our work and pointed out the need to revisit chunking in the context of analyzing noisy text (SMS, tweets, blogs, email, etc.).



Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Suchismita Maiti (1)
  • Utpal Garain (2)
  • Arnab Dhar (3)
  • Sankar De (4)

  1. Department of Information Technology, National Institute of Technology (NIT), Durgapur, India
  2. Indian Statistical Institute, Kolkata, India
  3. Department of Computer Science and Engineering, Indian Institute of Technology (IIT), Kharagpur, India
  4. Gupta College of Technological Sciences, Asansol, India
