Joint Text Mining with Heterogeneous Data



In Web and social media networks, text documents are often associated with nodes. For example, the Web can be viewed as a graph in which each node contains a Web page and connects to other nodes via hyperlinks. Similarly, a social network is a friendship graph of user-to-user linkages in which each node contains the textual posting activity of the user.
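As a rough illustration of the structure described above, the following minimal sketch represents a small text-attributed graph in Python. The names `TextNode`, `build_graph`, and the toy pages are assumptions made for this example and do not come from the chapter; the same layout would apply to a social network if nodes were users and the text were their postings.

```python
from dataclasses import dataclass, field


@dataclass
class TextNode:
    """A node that carries text, e.g. a Web page or a user's postings."""
    node_id: str
    text: str
    out_links: set = field(default_factory=set)  # ids of nodes this node links to


def build_graph(docs, links):
    """Build a text-attributed graph from {id: text} and (src, dst) link pairs."""
    graph = {node_id: TextNode(node_id, text) for node_id, text in docs.items()}
    for src, dst in links:
        graph[src].out_links.add(dst)
    return graph


# Toy data, invented purely for illustration.
docs = {
    "page_a": "text mining with heterogeneous data",
    "page_b": "link prediction in social networks",
    "page_c": "topic models for text",
}
links = [("page_a", "page_b"), ("page_a", "page_c"), ("page_c", "page_a")]

graph = build_graph(docs, links)
neighbors_of_a = sorted(graph["page_a"].out_links)
```

Keeping the text and the outgoing links on the same node object reflects the point of the paragraph: both kinds of data are available to any method that visits a node.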



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. IBM T. J. Watson Research Center, Yorktown Heights, USA
