Questionnaire Free Text Summarisation Using Hierarchical Classification

  • Conference paper
Research and Development in Intelligent Systems XXIX (SGAI 2012)

Abstract

This paper investigates the summarisation of the free text element of questionnaire data using hierarchical text classification. The approach assumes that text summarisation can be achieved by classification: several class labels are associated with each document, and together these labels constitute the summary. Hierarchical classification offers the advantage that different levels of classification can be used, so that the summary can be customised according to the branch of the tree in which the current document is located. The approach is evaluated using free text from questionnaires used in the SAVSNET (Small Animal Veterinary Surveillance Network) project. The results demonstrate the viability of using hierarchical classification to generate free text summaries.
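
The idea of building a summary from the labels collected along one branch of a class hierarchy can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` structure, the `keyword_classifier` helper, the keyword lists, and the label names are all hypothetical, and a simple keyword counter stands in for whatever trained classifier would be used at each node in practice.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One node of the class hierarchy (hypothetical structure).

    `classify` assigns a label to a document; `children` maps a label
    to the node that refines it at the next level, if any.
    """
    classify: Callable[[str], str]
    children: Dict[str, "Node"] = field(default_factory=dict)


def keyword_classifier(keywords: Dict[str, List[str]], default: str) -> Callable[[str], str]:
    """Toy stand-in for a trained classifier (an assumption of this sketch).

    Returns the label whose keywords occur most often in the text,
    or `default` when no keyword occurs at all.
    """
    def classify(text: str) -> str:
        # Whitespace tokenisation only; punctuation is not stripped.
        words = text.lower().split()
        scores = {
            label: sum(words.count(k) for k in kws)
            for label, kws in keywords.items()
        }
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else default
    return classify


def summarise(text: str, root: Node) -> List[str]:
    """Walk down the hierarchy, collecting one label per level.

    The list of labels along the chosen branch is the summary.
    """
    labels: List[str] = []
    node = root
    while node is not None:
        label = node.classify(text)
        labels.append(label)
        # Descend only if this label is refined further.
        node = node.children.get(label)
    return labels


# Hypothetical two-level hierarchy with invented labels and keywords.
gastro = Node(
    classify=keyword_classifier(
        {"vomiting": ["vomiting"], "diarrhoea": ["diarrhoea"]},
        default="unspecified",
    )
)
root = Node(
    classify=keyword_classifier(
        {
            "gastrointestinal": ["vomiting", "diarrhoea"],
            "respiratory": ["coughing", "sneezing"],
        },
        default="other",
    ),
    children={"gastrointestinal": gastro},
)

if __name__ == "__main__":
    print(summarise("dog presented with vomiting after eating", root))
    print(summarise("cat coughing and sneezing", root))
```

Because each node only has to discriminate between its own children, the labels collected at deeper levels depend on the branch chosen higher up, which is how the summary comes to be customised per branch.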

Author information

Corresponding author

Correspondence to Matias Garcia-Constantino.

Copyright information

© 2012 Springer-Verlag London

About this paper

Cite this paper

Garcia-Constantino, M., Coenen, F., Noble, PJ., Radford, A. (2012). Questionnaire Free Text Summarisation Using Hierarchical Classification. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_3

  • DOI: https://doi.org/10.1007/978-1-4471-4739-8_3

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4738-1

  • Online ISBN: 978-1-4471-4739-8

  • eBook Packages: Computer Science (R0)