Measuring Effectiveness in the TREC Legal Track

Chapter in Current Challenges in Patent Information Retrieval

Part of the book series: The Information Retrieval Series (volume 37)


Abstract

In this chapter, we report our experiences from attempting to measure the effectiveness of large electronic discovery (e-Discovery) result sets in the Text Retrieval Conference (TREC) Legal Track campaigns of 2006–2011. For effectiveness measures, we have focused on recall, precision, and F1. We state the estimators that we have used for these measures, and we outline both the rank-based and set-based approaches to sampling that we have taken. We share our experiences with the sampling error in the resulting estimates for absolute effectiveness on individual topics, relative effectiveness on individual topics, mean effectiveness across topics, and relative effectiveness across topics. Finally, we discuss our experiences with assessor error, which we have found has often had a larger impact than sampling error.
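As a rough illustration of the set-based sampling the abstract refers to, the sketch below computes standard stratified point estimates of recall, precision, and F1 from sampled relevance judgments. It is written in Python; the strata, sample counts, and judgments are hypothetical, and the chapter's own estimators may differ in detail.

    # Illustrative sketch only: standard stratified point estimators for
    # recall, precision and F1. All numbers below are hypothetical.

    def stratified_estimate(strata):
        """Estimate recall, precision and F1 from per-stratum samples.

        Each stratum is a dict with:
          size      -- total documents in the stratum
          sampled   -- documents drawn from the stratum and judged
          relevant  -- sampled documents judged relevant
          retrieved -- True if the stratum lies inside the result set
        """
        rel_retrieved = 0.0  # estimated relevant documents in the result set
        rel_total = 0.0      # estimated relevant documents in the collection
        retrieved = 0        # total size of the result set
        for s in strata:
            # Scale the sample proportion up to the stratum population.
            est_rel = s["size"] * s["relevant"] / s["sampled"]
            rel_total += est_rel
            if s["retrieved"]:
                rel_retrieved += est_rel
                retrieved += s["size"]
        recall = rel_retrieved / rel_total if rel_total else 0.0
        precision = rel_retrieved / retrieved if retrieved else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        return recall, precision, f1

    # Hypothetical example: one stratum inside the result set, one outside.
    strata = [
        {"size": 20000, "sampled": 500, "relevant": 300, "retrieved": True},
        {"size": 680000, "sampled": 500, "relevant": 5, "retrieved": False},
    ]
    print("recall=%.3f precision=%.3f F1=%.3f" % stratified_estimate(strata))

With deep strata and small samples, the variance of such estimates can be large, which is the sampling-error concern the abstract raises.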




Author information

Correspondence to Stephen Tomlinson.


Copyright information

© 2017 Springer-Verlag GmbH Germany

About this chapter

Cite this chapter

Tomlinson, S., Hedin, B. (2017). Measuring Effectiveness in the TREC Legal Track. In: Lupu, M., Mayer, K., Kando, N., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 37. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53817-3_6

  • DOI: https://doi.org/10.1007/978-3-662-53817-3_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-53816-6

  • Online ISBN: 978-3-662-53817-3

  • eBook Packages: Computer Science (R0)
