POLAR: Attention-Based CNN for One-Shot Personalized Article Recommendation

  • Zhengxiao Du
  • Jie Tang
  • Yuhui Ding
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11052)

Abstract

In this paper, we propose POLAR, an attention-based CNN combined with one-shot learning for personalized article recommendation. Given a query, POLAR uses an attention-based CNN to estimate the relevance score between the query and related articles. The attention mechanism significantly improves relevance estimation: on AMiner, for example, it yields a +5.0% improvement in NDCG@3. A further challenge in personalized article recommendation is collecting statistically sufficient training data for the recommendation model. POLAR incorporates a one-shot learning function into the recommendation model, which brings further significant gains: on AMiner, with an average of only 1.6 feedback instances, POLAR achieves a 2.7% improvement in NDCG@3. We evaluate POLAR on three datasets: AMiner, Patent, and RARD. Experimental results demonstrate the effectiveness of the proposed model. We have also deployed POLAR in AMiner as its article recommendation engine, which further confirms the model's effectiveness. Data related to this paper are available at: https://doi.org/10.6084/m9.figshare.7297319.
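
The improvements above are reported in NDCG@3. As a reminder of what that metric measures, here is a minimal sketch in Python. It assumes the common linear-gain form of DCG with a log2 rank discount. The abstract does not say which NDCG variant the authors use, and the function names are illustrative, not taken from the paper.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k items, in ranked order.

    Assumes linear gain (rel) and a log2(rank + 1) discount with 1-based ranks.
    """
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 3) -> float:
    """DCG@k normalized by the DCG@k of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    # With no relevant item at all the ratio is undefined; return 0.0 by convention.
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` holds the ground-truth relevance labels of the recommended articles in the order the model ranked them. A list whose only relevant article sits in third place scores lower than one that ranks it first.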

Keywords

Personalized recommendation; Term weighting; CNN; One-shot learning

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science and Technology, Tsinghua University, Beijing, China
