Abstract
In this chapter, we provide the tools and methodology for comparing the effectiveness of two or more multimedia retrieval systems in a meaningful way. Several aspects of a multimedia retrieval system can be evaluated without consulting its potential users or customers, such as the query processing time (measured, for instance, in milliseconds per query) or the query throughput (measured, for instance, as the number of queries per second). Here, however, we focus on the aspects of a system that influence the effectiveness of the retrieved results. To measure the effectiveness of search results, one must at some point consult the potential users of the system. After all, what are the correct results for the query “black jaguar”: cars or cats? Ultimately, the user has to decide.
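The two user-independent measures mentioned above can be obtained with nothing more than a timer. The following is a minimal sketch, not taken from the chapter: `measure_efficiency` and `toy_search` are hypothetical names, and the three-document collection is invented purely for illustration.

```python
import time
from typing import Callable, Iterable, List, Tuple


def measure_efficiency(search: Callable[[str], List[str]],
                       queries: Iterable[str]) -> Tuple[float, float]:
    """Run the queries one after another and return
    (mean milliseconds per query, queries per second)."""
    queries = list(queries)
    if not queries:
        raise ValueError("need at least one query")
    start = time.perf_counter()
    for query in queries:
        search(query)
    elapsed = time.perf_counter() - start
    ms_per_query = 1000.0 * elapsed / len(queries)
    # Guard against a zero reading on very coarse timers.
    queries_per_second = len(queries) / elapsed if elapsed > 0 else float("inf")
    return ms_per_query, queries_per_second


def toy_search(query: str) -> List[str]:
    """Hypothetical stand-in for a retrieval system: return every
    document that contains all query terms."""
    collection = ["black jaguar car", "black jaguar cat", "red car"]
    terms = query.split()
    return [doc for doc in collection if all(t in doc.split() for t in terms)]


if __name__ == "__main__":
    ms, qps = measure_efficiency(toy_search, ["black jaguar"] * 1000)
    print(f"{ms:.4f} ms/query, {qps:.0f} queries/s")
    print(toy_search("black jaguar"))
```

Note that `toy_search("black jaguar")` returns both the car and the cat document. The timer can say how fast the system answered, but not which of the two answers is correct; that judgement has to come from a user.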
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hiemstra, D., Kraaij, W. (2007). Evaluation of Multimedia Retrieval Systems. In: Blanken, H.M., Blok, H.E., Feng, L., de Vries, A.P. (eds) Multimedia Retrieval. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72895-5_13
DOI: https://doi.org/10.1007/978-3-540-72895-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72894-8
Online ISBN: 978-3-540-72895-5
eBook Packages: Computer Science (R0)