Graph Based Feature Augmentation for Short and Sparse Text Classification

  • Guodong Long
  • Jing Jiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8346)


Classifying short texts, such as snippets, search queries, micro-blogs and product reviews, is challenging mainly because short texts carry insufficient word co-occurrence information and have a very sparse document-term representation. To address this problem, we propose a novel multi-view classification method that combines the original document-term representation with a new graph-based feature representation. Our method uses all documents to construct a neighbour graph based on shared co-occurring words. Multi-Dimensional Scaling (MDS) is then applied to extract a low-dimensional feature representation from the graph, which is used to augment the original text features for learning. Experiments on several benchmark datasets show that the proposed multi-view classifier, trained on the augmented feature representation, obtains a significant performance gain over the baseline methods.
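The pipeline outlined in the abstract (neighbour graph from shared words, MDS embedding, feature augmentation) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the edge weighting, the MDS variant, or how the two views are combined, so the Jaccard-based distance, the classical (eigendecomposition) MDS, and the plain concatenation below are all assumptions, as are the function names `bag_of_words`, `neighbour_distances`, `classical_mds` and `augment`.

```python
import numpy as np


def bag_of_words(docs):
    """Build a binary document-term matrix and its vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d.lower().split():
            X[i, index[w]] = 1.0
    return X, vocab


def neighbour_distances(X):
    """Pairwise document distances derived from shared words.

    Assumption: edge strength is the Jaccard overlap of the two word sets,
    and distance is 1 - overlap. The paper may weight the graph differently.
    """
    shared = X @ X.T                          # shared-word counts per pair
    sizes = X.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - shared
    sim = np.divide(shared, union, out=np.zeros_like(shared), where=union > 0)
    D = 1.0 - sim
    np.fill_diagonal(D, 0.0)                  # a document is at distance 0 from itself
    return D


def classical_mds(D, k):
    """Classical MDS: embed n points in k dimensions from a distance matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)            # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:k]          # indices of the k largest
    lam = np.clip(vals[top], 0.0, None)       # drop negative eigenvalues
    return vecs[:, top] * np.sqrt(lam)


def augment(X, k=2):
    """Concatenate the original features with the k graph-derived features."""
    Z = classical_mds(neighbour_distances(X), k)
    return np.hstack([X, Z])


docs = [
    "cheap flights paris",
    "paris hotel deals",
    "python list sort",
    "sort python dict",
]
X, vocab = bag_of_words(docs)
A = augment(X, k=2)   # one row per document: term features, then 2 MDS features
```

Any standard classifier can then be trained on the rows of `A`. Concatenation is the simplest way to use both views; a multi-view learner that treats the term features and the graph features separately would be closer to the method's stated framing.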


Short Text, Text Classification, Graph Based Method, Multi-view Learning, Multi-Dimensional Scaling





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Guodong Long
    • 1
  • Jing Jiang
    • 1
  1. Centre for Quantum Computation & Intelligent Systems, University of Technology, Sydney, Australia
