Combining Visual and Textual Systems within the Context of User Feedback

Kaliciak, Leszek; Song, Dawei; Wiratunga, Nirmalie; Pan, Jeff

doi:10.1007/978-3-642-35725-1_41

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7732)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

Experiments have shown that combining textual and visual representations can improve retrieval performance ([20], [23]). This is because the textual and visual feature spaces often represent complementary yet correlated aspects of the same image, and thus form a composite system.

In this paper, we present a model for combining visual and textual sub-systems within the user feedback context. The model is inspired by measurement in quantum mechanics (QM) and by the tensor product of co-occurrence (density) matrices, which in QM represents the density matrix of a composite system. It provides a sound and natural framework for seamlessly integrating multiple feature spaces by treating them as a composite system, as well as a new way of measuring the relevance of an image with respect to a context. The proposed approach takes into account both intra-feature relationships (via co-occurrence matrices) and inter-feature relationships (via the tensor operator) between feature dimensions. It is also computationally cheap and scalable to large data collections. We test our approach on the ImageCLEF2007photo data collection and present interesting findings.
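
As a rough illustration of the idea in the abstract, the sketch below builds a co-occurrence (density) matrix for each feature space from user-feedback images, combines them with a tensor (Kronecker) product, and scores a candidate image against the composite matrix. It is not the authors' implementation. The function names, the toy vectors, the unit-trace normalisation, and the scoring rule (the quadratic form of the composite matrix with the image's normalised composite vector) are all assumptions made for this example.

```python
import numpy as np


def density_matrix(feedback_vectors):
    """Co-occurrence matrix of the feedback vectors, scaled to unit trace.

    Assumption: one row per feedback image, one column per feature dimension.
    """
    X = np.asarray(feedback_vectors, dtype=float)
    cooc = X.T @ X  # sum of outer products x x^T over the feedback images
    return cooc / np.trace(cooc)


def unit(vector):
    """Return the vector scaled to unit Euclidean length."""
    v = np.asarray(vector, dtype=float)
    return v / np.linalg.norm(v)


def relevance_explicit(rho_v, rho_t, visual, textual):
    """Score an image against the explicitly built composite matrix."""
    rho = np.kron(rho_v, rho_t)                 # composite density matrix
    psi = np.kron(unit(visual), unit(textual))  # composite image vector
    return float(psi @ rho @ psi)


def relevance_factored(rho_v, rho_t, visual, textual):
    """Same score computed per sub-system, without forming the tensor product."""
    v, t = unit(visual), unit(textual)
    return float((v @ rho_v @ v) * (t @ rho_t @ t))


# Hypothetical toy data: two feedback images, 3 visual and 2 textual dimensions.
visual_feedback = [[1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
textual_feedback = [[1.0, 1.0], [0.0, 2.0]]

rho_v = density_matrix(visual_feedback)
rho_t = density_matrix(textual_feedback)

candidate_visual = [1.0, 1.0, 0.0]
candidate_textual = [0.0, 1.0]

score_explicit = relevance_explicit(rho_v, rho_t, candidate_visual, candidate_textual)
score_factored = relevance_factored(rho_v, rho_t, candidate_visual, candidate_textual)
print(score_explicit, score_factored)
```

Under these assumptions the two scores agree, because the quadratic form of a Kronecker product with a product vector factorises into one quadratic form per sub-system. This is one plausible reading of why such a model can stay cheap: the composite matrix never has to be materialised.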

References

1. Zhao, R., Grosky, W.I.: Narrowing the semantic gap-improved text-based web document retrieval using visual features. IEEE Transactions on Multimedia 4, 189–200 (2002)
2. Ferecatu, M., Sahbi, H.: TELECOM ParisTech at Image Clef photo 2008: Bi-modal text and image retrieval with diversity enhancement. In: Working Notes of CLEF (2008)
3. Martinez-Fernandes, J.L., Serrano, A.G., Villena-Roman, J., Saenz, V.D.M., Tortosa, S.G., Castagnone, M., Alonso, J.: MIRACLE at ImageCLEF 2004. In: Working Notes of CLEF (2004)
4. Yanai, K.: Generic image classification using visual knowledge on the web. In: Proceedings of the 11th ACM International Conference on Multimedia, pp. 167–176 (2003)
5. Tjondronegoro, D., Zhang, J., Gu, J., Nguyen, A., Geva, S.: Integrating Text Retrieval and Image Retrieval in XML Document Searching. In: Fuhr, N., Lalmas, M., Malik, S., Kazai, G. (eds.) INEX 2005. LNCS, vol. 3977, pp. 511–524. Springer, Heidelberg (2006)
6. Maillot, N., Chevallet, J.P., Valea, V., Lim, J.H.: IPAL Inter-media pseudo-relevance feedback approach to ImageCLEF 2006 photo retrieval. In: CLEF Working Notes (2006)
7. Rahman, M.M., Bhattacharya, P., Desai, B.C.: A unified image retrieval framework on local visual and semantic concept-based feature spaces. J. Visual Communication and Image Representation 20(7), 450–462 (2009)
8. Simpson, M., Rahaman, M.M.: Text and content-based approaches to image retrieval for the ImageCLEF 2009 medical retrieval track. In: Working Notes for the CLEF 2009 Workshop (2009)
9. Wang, J., Song, D., Kaliciak, L.: Tensor product of correlated text and visual features: a quantum theory inspired image retrieval framework. In: AAAI-Fall 2010 Symposium on Quantum Information for Cognitive, Social, and Semantic Processes, pp. 109–116 (2010)
10. Mensink, T., Csurka, G., Perronnin, F.: LEAR and XRCE’s participation to visual concept detection task - ImageCLEF 2010. In: Proceedings of the 14th Annual ACM International Conference on Multimedia, pp. 77–80 (2006)
11. Mensink, T., Verbeek, J., Csurkay, G.: Weighted transmedia relevance feedback for image retrieval and auto-annotation. Technical Report Number 0415 (2011)
12. Clinchant, S., Ah-Pine, J., Csurka, G.: Semantic combination of textual and visual information in multimedia retrieval. In: ACM International Conference on Multimedia Retrieval, ICMR (2011)
13. Depeursinge, A., Muller, H.: Fusion techniques for combining textual and visual information retrieval. In: ImageCLEF. The Springer International Series on Information Retrieval, vol. 32, pp. 95–114 (2010)
14. Chang, Y.-C., Chen, H.-H.: Increasing Precision and Diversity in Photo Retrieval by Result Fusion. In: Peters, C., Deselaers, T., Ferro, N., Gonzalo, J., Jones, G.J.F., Kurimo, M., Mandl, T., Peñas, A., Petras, V. (eds.) CLEF 2008. LNCS, vol. 5706, pp. 612–619. Springer, Heidelberg (2009)
15. Combining systems: the tensor product and partial trace, http://www.quantum.umb.edu/Jacobs/QMT/QMT-AppendixA.pdf
16. Li, Y., Cunningham, H.: Geometric and quantum methods for information retrieval. SIGIR Forum 42(2), 22–32 (2008)
17. van Rijsbergen, C.J.: The geometry of information retrieval. Cambridge University Press (2004)
18. Bruza, P.D., Kitto, K., Nelson, D., McEvoy, C.L.: Entangling words and meaning. In: Proceedings of the 2nd Quantum Interaction Symposium, pp. 118–124 (2008)
19. Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)
20. Grubinger, M., Clough, P., Hanbury, A., Muller, H.: Overview of the ImageCLEF 2007 photographic retrieval task. In: Working Notes of the 2007 CLEF Workshop (2007)
21. Kaliciak, L., Song, D., Wiratunga, N., Pan, J.: Novel local features with hybrid sampling technique for image retrieval. In: Proceedings of Conference on Information and Knowledge Management (CIKM), pp. 1557–1560 (2010)
22. Nowak, E., Jurie, F., Triggs, B.: Sampling Strategies for Bag-of-Features Image Classification. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3954, pp. 490–503. Springer, Heidelberg (2006)
23. ImageCLEF website, http://www.imageclef.org
24. Grubinger, M., Clough, P., Hanbury, A., Müller, H.: Overview of the ImageCLEFphoto 2007 Photographic Retrieval Task. In: Peters, C., Jijkoun, V., Mandl, T., Müller, H., Oard, D.W., Peñas, A., Petras, V., Santos, D. (eds.) CLEF 2007. LNCS, vol. 5152, pp. 433–444. Springer, Heidelberg (2008)
25. Chen, Z., Liu, W., Zhang, F., Li, M.J., Zhang, H.J.: Web mining for web image retrieval. Journal of the American Society for Information Science and Technology 52(10), 831–839 (2001)

Author information

Authors and Affiliations

The Robert Gordon University, Aberdeen, UK
Leszek Kaliciak & Nirmalie Wiratunga
The Open University, Milton Keynes, UK
Dawei Song
Aberdeen University, Aberdeen, UK
Jeff Pan

Editor information

Editors and Affiliations

Microsoft Research Asia, 5 Danling Street, 100080, Beijing, China
Shipeng Li & Tao Mei
School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward, K1N 6N5, Ottawa, ON, Canada
Abdulmotaleb El Saddik
School of Computer and Information, Hefei University of Technology, Road Tunxi 193#, 230009, Hefei, Anhui, China
Meng Wang & Richang Hong
Department of Information Engineering and Computer Science, University of Trento, Via Sommarive 14, 38100, Trento, Italy
Nicu Sebe
Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, 117583, Singapore, Singapore
Shuicheng Yan
School of Computing, CLARITY: Centre for Sensor Web Technologies, Dublin City University, Glasnevin, Dublin 9, Ireland
Cathal Gurrin

Cite this paper

Kaliciak, L., Song, D., Wiratunga, N., Pan, J. (2013). Combining Visual and Textual Systems within the Context of User Feedback. In: Li, S., et al. Advances in Multimedia Modeling. MMM 2013. Lecture Notes in Computer Science, vol 7732. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35725-1_41

DOI: https://doi.org/10.1007/978-3-642-35725-1_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35724-4
Online ISBN: 978-3-642-35725-1
eBook Packages: Computer Science (R0)
