Simulated evaluation of faceted browsing based on feature selection

Multimedia Tools and Applications

Abstract

In this paper we explore the limitations of facet-based browsing, which uses the sub-needs of an information need to query and organise the search process in video retrieval. The underlying assumption is that employing this approach in interactive video retrieval with textual and visual features enhances search effectiveness. We explore the performance bounds of a faceted system by carrying out a simulated user evaluation on TRECVid data sets and on the logs of a prior user experiment with the system. We first present a methodology for reducing the dimensionality of the features by selecting the most important ones. We then discuss the simulated evaluation strategies we employed and the effect of using both textual and visual features. Facets created by users are simulated by clustering video shots using textual and visual features. The experimental results demonstrate that the faceted browser can potentially improve search effectiveness.
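As a rough illustration of the two steps the abstract outlines (selecting a subset of features, then clustering shots to stand in for user-created facets), the following minimal sketch may help. It is not the method used in the paper: the variance-based selection criterion, the plain k-means clustering, the function names, and the toy shot vectors are all assumptions made for illustration.

```python
import random
from statistics import pvariance


def select_top_features(vectors, k):
    """Return the indices of the k highest-variance features.

    Variance is only a stand-in criterion (an assumption); the paper's
    actual selection measure is not given in the abstract.
    """
    n_features = len(vectors[0])
    variances = [pvariance([v[j] for v in vectors]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def project(vectors, indices):
    """Keep only the selected feature dimensions of each vector."""
    return [[v[j] for j in indices] for v in vectors]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain k-means; each resulting cluster plays the role of one simulated facet."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach every shot to its nearest centroid.
        labels = [
            min(range(n_clusters), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical shot feature vectors: dimensions 0 and 2 vary between shots,
# dimensions 1 and 3 are nearly constant and carry little information.
shots = [
    [0.90, 0.50, 0.10, 0.20],
    [0.80, 0.51, 0.20, 0.21],
    [0.85, 0.49, 0.15, 0.20],
    [0.10, 0.50, 0.90, 0.19],
    [0.20, 0.52, 0.80, 0.20],
    [0.15, 0.50, 0.85, 0.21],
]

selected = select_top_features(shots, k=2)
labels = kmeans(project(shots, selected), n_clusters=2)
print("selected feature indices:", selected)
print("simulated facet per shot:", labels)
```

In this toy setting the reduced vectors keep only the dimensions that distinguish the two groups of shots, so the clustering operates on a smaller feature space without losing the grouping.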

Notes

  1. www.youtube.com

  2. www.dailymotion.com

  3. TRECVid is a large-scale evaluation campaign addressing research problems related to video data.

  4. http://www.chiariglione.org/mpeg/

Acknowledgements

This research was supported by the European Commission under contract FP6-027122-SALERO. The third author was supported by the JCCM under project (PCI08-0048-8577), MEC under project (TIN2007-67418-C03-01) and FEDER funds. This paper reflects the views of the authors, not necessarily those of the community.

Author information

Corresponding author

Correspondence to Frank Hopfgartner.

About this article

Cite this article

Hopfgartner, F., Urruty, T., Lopez, P.B. et al. Simulated evaluation of faceted browsing based on feature selection. Multimed Tools Appl 47, 631–662 (2010). https://doi.org/10.1007/s11042-009-0340-6

  • DOI: https://doi.org/10.1007/s11042-009-0340-6
