
Identifying intentions in forum posts with cross-domain data

  • Tu Minh Phuong (Email author)
  • Le Cong Linh
  • Ngo Xuan Bach

Abstract

In this paper, we present a method to identify forum posts that express user intentions in online discussion forums. The results of this task, for example detected buying intentions, can be exploited for targeted advertising or other marketing tasks. Our method uses labeled data from other domains to help the learning task in the target domain, combining the data statistics within a Naive Bayes (NB) framework. Because data distributions vary from domain to domain, the contributions of the different data sources must be adjusted when constructing the learning model in order to achieve accurate results. We propose to adjust the parameters of the NB classifier by optimizing an objective, which is equivalent to maximizing the between-class separation, using stochastic gradient descent. Experimental results show that our method outperforms several competitive baselines on a benchmark dataset consisting of forum posts from four domains: Cellphone, Electronics, Camera, and TV. In addition, we explore the possibility of combining the NB posteriors computed during the optimization process with another classifier, namely Support Vector Machines (SVMs). Experimental results show that the optimized NB class posteriors are useful when used as features for SVMs in cross-domain settings.
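
The idea of combining data statistics from several domains in an NB model can be illustrated with a minimal sketch. This is not the method of the paper: the function names, the toy posts, and the single hand-set `source_weight` are assumptions made for illustration, and the paper instead tunes the NB parameters with stochastic gradient descent. The sketch only shows how down-weighted counts from a source domain can be pooled with target-domain counts before computing a class posterior.

```python
import math
from collections import Counter


def train_weighted_nb(target_docs, source_docs, source_weight=0.5, alpha=1.0):
    """Fit a binary multinomial NB on pooled counts.

    Each doc is a (tokens, label) pair with label 1 = intention post.
    Source-domain counts are scaled by source_weight (an assumed, fixed
    value here; it is not learned in this sketch).
    """
    counts = {0: Counter(), 1: Counter()}
    class_totals = {0: 0.0, 1: 0.0}
    vocab = set()
    for docs, weight in ((target_docs, 1.0), (source_docs, source_weight)):
        for tokens, label in docs:
            class_totals[label] += weight
            for tok in tokens:
                counts[label][tok] += weight
                vocab.add(tok)

    n_docs = sum(class_totals.values())
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in (0, 1):
        # Laplace-smoothed class prior and word likelihoods.
        model["log_prior"][c] = math.log(
            (class_totals[c] + alpha) / (n_docs + 2 * alpha)
        )
        denom = sum(counts[c].values()) + alpha * len(vocab)
        model["log_lik"][c] = {
            w: math.log((counts[c][w] + alpha) / denom) for w in vocab
        }
    return model


def posterior_intent(model, tokens):
    """Return P(intention | tokens); out-of-vocabulary tokens are skipped."""
    scores = {}
    for c in (0, 1):
        score = model["log_prior"][c]
        for tok in tokens:
            if tok in model["vocab"]:
                score += model["log_lik"][c][tok]
        scores[c] = score
    # Normalize in a numerically stable way.
    top = max(scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in scores.items()}
    return exp_scores[1] / (exp_scores[0] + exp_scores[1])


# Toy data (hypothetical): the target domain is small, the source domain helps.
target = [(["want", "buy", "phone"], 1), (["phone", "broke"], 0)]
source = [(["plan", "buy", "camera"], 1), (["camera", "review"], 0)]
model = train_weighted_nb(target, source, source_weight=0.5)
print(posterior_intent(model, ["buy", "tv"]))
```

A posterior of this kind could also be passed to a second classifier as an additional feature, which is the general direction the abstract describes for SVMs.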

Keywords

Cross-domain learning, Domain adaptation, Online forums, Intention detection, Stochastic gradient descent

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Tu Minh Phuong (1), Email author
  • Le Cong Linh (2)
  • Ngo Xuan Bach (1)

  1. Department of Computer Science and Machine Learning and Applications Lab, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam
  2. FPT Software Research Lab, Hanoi, Vietnam
