An LDA Topic Model Adaptation for Context-Based Image Retrieval

  • Hatem Aouadi
  • Mouna Torjmen Khemakhem
  • Maher Ben Jemaa
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 239)


In context-based image retrieval, the textual information surrounding an image plays a central role in ranking the returned results. Although this technique outperforms content-based approaches, it may fail when the query keywords do not match the textual content of many documents containing relevant images. In addition, users are usually not experts and provide ambiguous queries that lead to heterogeneous results. To solve these problems, researchers try to re-rank the initial results using other techniques such as query expansion and concept-based retrieval. In this paper, we propose to use the LDA topic model to re-rank results and thereby improve retrieval precision. We apply this model at two levels: a global level, represented by the whole document containing the image, and a local level, represented by the paragraph containing the image (considered as textual information specific to the image). Results show that re-ranking with the LDA model applied at the local level significantly improves on the standard text retrieval approach.
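
The abstract does not say how the topic model's output is combined with the initial ranking, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that topic distributions for the query and for each text unit (the whole document for the global level, the paragraph for the local level) have already been inferred by some LDA implementation, that initial text scores lie in [0, 1], and that the two signals are mixed by linear interpolation with cosine similarity. The names `rerank`, `unit_topics` and `weight` are hypothetical.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rerank(initial, query_topics, unit_topics, weight=0.5):
    """Re-rank (image_id, text_score) pairs with a topic-similarity score.

    initial      -- list of (image_id, score); scores assumed to be in [0, 1]
    query_topics -- topic distribution inferred for the query (assumed given)
    unit_topics  -- image_id -> topic distribution of the chosen text unit:
                    whole document (global) or paragraph (local)
    weight       -- hypothetical interpolation weight; 0 keeps the initial order
    """
    rescored = [
        (img, (1 - weight) * score + weight * cosine(query_topics, unit_topics[img]))
        for img, score in initial
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy data (made up for illustration): two images, two topics.
initial = [("img_a", 0.9), ("img_b", 0.8)]
query_topics = [0.9, 0.1]
paragraph_topics = {"img_a": [0.1, 0.9], "img_b": [0.8, 0.2]}

reranked = rerank(initial, query_topics, paragraph_topics)
print([img for img, _ in reranked])
```

In this toy case the paragraph around `img_b` is topically closer to the query, so it moves ahead of `img_a` despite its lower initial text score. Passing document-level distributions as `unit_topics` would give the global-level variant.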


Image retrieval, Topic model, Re-ranking, LDA



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Hatem Aouadi (1)
  • Mouna Torjmen Khemakhem (1)
  • Maher Ben Jemaa (1)
  1. ReDCAD Laboratory, National School of Engineers of Sfax, University of Sfax, Sfax, Tunisia
