
Evaluating sentence-level relevance feedback for high-recall information retrieval

  • Haotian Zhang
  • Gordon V. Cormack
  • Maura R. Grossman
  • Mark D. Smucker

Abstract

This study uses a novel simulation framework to evaluate whether the time and effort necessary to achieve high recall using active learning is reduced by presenting the reviewer with isolated sentences, rather than full documents, for relevance feedback. Under the weak assumption that reviewing an entire document requires more time and effort than reviewing a single sentence, the simulation results indicate that using isolated sentences for relevance feedback can yield comparable accuracy and higher efficiency, relative to the state-of-the-art baseline model implementation (BMI) of the AutoTAR continuous active learning (“CAL”) method employed in the TREC 2015 and 2016 Total Recall Tracks.
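The evaluated workflow is an AutoTAR-style continuous active learning loop in which the simulated reviewer judges only the single highest-scoring sentence of each suggested document. The sketch below is a minimal illustration of that idea under stated assumptions: the toy corpus, topic string, logistic-regression scorer, and best_sentence helper are illustrative stand-ins, not the authors' BMI implementation.

```python
# Minimal sketch (not the authors' BMI code) of sentence-level relevance feedback
# inside an AutoTAR-style continuous active learning loop. Corpus, topic, and
# helper names are assumptions for illustration only.
import scipy.sparse as sp
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

corpus = [  # (document text, true relevance) pairs stand in for a TREC-style collection
    ("the quarterly budget report shows rising legal costs. staffing is flat", 1),
    ("minutes of the marketing meeting. slides about the new logo", 0),
    ("legal memo discussing discovery obligations. estimated review costs", 1),
    ("cafeteria menu for next week. soup and salad daily", 0),
    ("outside counsel invoice for document review services. hours billed", 1),
    ("holiday schedule and office closure dates. parking reminders", 0),
]
topic = "legal costs of document review and discovery"  # stands in for the topic statement

docs, truth = zip(*corpus)
vec = TfidfVectorizer()
X = vec.fit_transform(list(docs) + [topic])
X_docs, x_topic = X[: len(docs)], X[len(docs):]

def best_sentence(text, model):
    """Show the reviewer only the sentence the current model scores highest."""
    sents = [s.strip() for s in text.split(".") if s.strip()]
    scores = model.predict_proba(vec.transform(sents))[:, 1]
    return sents[int(scores.argmax())]

judged, labels = [], []
unjudged = list(range(len(docs)))
found = 0

while unjudged:
    # AutoTAR-style training set: the topic text as a pseudo-relevant seed, all
    # judged documents, and the remaining unjudged documents as provisional negatives.
    train_X = sp.vstack([x_topic, X_docs[judged + unjudged]])
    train_y = [1] + labels + [0] * len(unjudged)
    model = LogisticRegression(max_iter=1000).fit(train_X, train_y)

    # Present the top-ranked unjudged document via its highest-scoring sentence.
    scores = model.predict_proba(X_docs[unjudged])[:, 1]
    top = unjudged[int(scores.argmax())]
    sentence_shown = best_sentence(docs[top], model)

    judgment = truth[top]  # simulated reviewer judges the sentence like the document
    judged.append(top); labels.append(judgment); unjudged.remove(top)
    found += judgment
    print(f"reviewed: {sentence_shown!r} -> relevant={judgment}, recall={found}/{sum(truth)}")
```

In this sketch the reviewer's effort per judgment is one sentence rather than one document, which is the efficiency lever the abstract describes; the paper's simulation measures how that trade-off affects the effort needed to reach high recall.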

Keywords

Continuous active learning · CAL · Technology-assisted review · TAR · Total Recall · Relevance feedback

Notes

Acknowledgements

Funding was provided by the Natural Sciences and Engineering Research Council of Canada (Grant Nos. CRDPJ 468812-14, RGPIN-2017-04239, RGPIN-2014-03642).

Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
  2. Department of Management Sciences, University of Waterloo, Waterloo, Canada