Toward Optimized Multimodal Concept Indexing

  • Chapter
Transactions on Computational Collective Intelligence XXVI

Part of the book series: Lecture Notes in Computer Science (TCCI, volume 10190)

Abstract

Information retrieval on the (social) web is moving from a purely term-frequency-based approach toward methods that incorporate conceptual multimodal features at the semantic level. In this paper, we present an approach for semantic keyword search, with a particular focus on optimizing it to scale to real-world-sized collections in the social media domain. We also present a faceted indexing framework and architecture that relates content to semantic concepts so that it can be indexed and searched semantically. We study the use of textual concepts in a social media domain and observe a significant improvement from using a concept-based solution for keyword search. We address time complexity, a critical issue for concept-based methods, by focusing on optimizations that enable larger, more realistic applications.
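
The contrast between term-based and concept-based matching can be illustrated with a minimal sketch. It is not the chapter's method: the vectors in `TOY_VECTORS` are invented for illustration, and the names `term_overlap` and `concept_score` are hypothetical. The sketch assumes that each word has a dense vector and that a query term may match the most similar document term by cosine similarity.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
TOY_VECTORS = {
    "beach": [0.9, 0.1, 0.0],
    "seaside": [0.8, 0.2, 0.1],
    "coast": [0.85, 0.15, 0.05],
    "car": [0.0, 0.9, 0.3],
    "engine": [0.1, 0.8, 0.4],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def term_overlap(query, doc):
    """Fraction of query terms that appear verbatim in the document."""
    if not query:
        return 0.0
    doc_terms = set(doc)
    return sum(t in doc_terms for t in query) / len(query)


def concept_score(query, doc, vectors=TOY_VECTORS):
    """Average, over query terms, of the best cosine similarity to any document term."""
    if not query:
        return 0.0
    total = 0.0
    for q in query:
        if q not in vectors:
            continue  # unknown query terms contribute nothing
        total += max(
            (cosine(vectors[q], vectors[d]) for d in doc if d in vectors),
            default=0.0,
        )
    return total / len(query)


query = ["beach"]
doc_a = ["seaside", "coast"]
doc_b = ["car", "engine"]

# Exact term matching cannot separate the two documents.
print(term_overlap(query, doc_a), term_overlap(query, doc_b))
# The similarity-based score ranks doc_a above doc_b.
print(concept_score(query, doc_a), concept_score(query, doc_b))
```

In this sketch every query term is compared with every document term, so the cost per document grows with the product of query length and document length. That pairwise comparison is one way to see why time complexity becomes an issue for similarity-based scoring on large collections.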

Notes

  1. http://en.wikipedia.org/wiki/List_of_social_networking_websites

  2. http://lucene.apache.org/core

  3. https://code.google.com/p/word2vec/

  4. https://code.google.com/p/semanticvectors/

  5. http://scikit-learn.org/stable/

Author information

Corresponding author

Correspondence to Navid Rekabsaz.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Rekabsaz, N., Bierig, R., Lupu, M., Hanbury, A. (2017). Toward Optimized Multimodal Concept Indexing. In: Nguyen, N., Kowalczyk, R., Pinto, A., Cardoso, J. (eds) Transactions on Computational Collective Intelligence XXVI. Lecture Notes in Computer Science, vol 10190. Springer, Cham. https://doi.org/10.1007/978-3-319-59268-8_7

  • DOI: https://doi.org/10.1007/978-3-319-59268-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59267-1

  • Online ISBN: 978-3-319-59268-8

  • eBook Packages: Computer Science (R0)
