Semistructured Data Search Evaluation

Chapter in: Bridging Between Information Retrieval and Databases (PROMISE 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8173)


Abstract

Semistructured data is of increasing importance in many application domains, and one of its core use cases is representing documents. Consequently, effectively retrieving information from semistructured documents is an important problem that has seen work from both the information retrieval (IR) and databases (DB) communities. Comparing the large number of retrieval models and systems is a non-trivial task for which established benchmark initiatives such as TREC, with their focus on unstructured documents, are not appropriate. This chapter gives an overview of semistructured data in general and of the INEX initiative for the evaluation of XML retrieval, focusing on its most prominent track, the Ad Hoc Search Track.
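To make the setting concrete, the following sketch shows what element-level (focused) XML retrieval means in the simplest terms: given a semistructured document, a system returns the elements relevant to a query rather than the whole document. The toy document, the query, and the all-terms-match scoring rule are illustrative assumptions, not the methodology of any INEX system.

```python
# Minimal sketch of element-level retrieval from a semistructured (XML)
# document: return the tags of elements whose directly contained text
# mentions every query term. A toy matching rule, for illustration only.
import xml.etree.ElementTree as ET

DOC = """
<article>
  <title>XML retrieval</title>
  <sec>
    <p>Focused retrieval returns elements, not whole documents.</p>
    <p>Benchmarks such as INEX evaluate XML retrieval systems.</p>
  </sec>
</article>
"""

def matching_elements(xml_text, terms):
    """Tags of elements whose own text contains all query terms
    (case-insensitive); descendants are inspected separately."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        text = (elem.text or "").lower()
        if all(t.lower() in text for t in terms):
            hits.append(elem.tag)
    return hits

print(matching_elements(DOC, ["xml", "retrieval"]))
```

Note that the query matches both a `<title>` and a `<p>` element; deciding which granularity a system should return, and how to reward it, is exactly the evaluation question the INEX Ad Hoc Track addresses.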




Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Schenkel, R. (2014). Semistructured Data Search Evaluation. In: Ferro, N. (ed.) Bridging Between Information Retrieval and Databases. PROMISE 2013. Lecture Notes in Computer Science, vol. 8173. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54798-0_7


  • DOI: https://doi.org/10.1007/978-3-642-54798-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54797-3

  • Online ISBN: 978-3-642-54798-0

