A Comparison of Evaluation Metrics for Document Filtering

  • Conference paper
Multilingual and Multimodal Information Access Evaluation (CLEF 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6941)

Abstract

Although document filtering is simple to define, a wide range of evaluation measures has been proposed in the literature, all of which have been subject to criticism. We present a unified, comparative view of the strengths and weaknesses of the proposed measures based on two formal constraints (which should be satisfied by any suitable evaluation measure) and various properties (which help differentiate measures according to their behaviour). We conclude that (i) some smoothing process is necessary to satisfy the basic constraints; and (ii) metrics can be grouped into three families, each satisfying one of three formal properties, which are mutually exclusive, i.e. no metric can satisfy all three properties simultaneously.
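As a rough illustration of the smoothing point, the sketch below shows one familiar case in which an unsmoothed filtering measure has no defined value. It is not taken from the paper: the choice of the F-measure, the helper names (`precision`, `recall`, `f1`, `smoothed_f1`) and the additive constant `k` are all assumptions made for this example, and the paper's own constraints and smoothing scheme may differ.

```python
from typing import Optional


def precision(tp: int, fp: int) -> Optional[float]:
    """Fraction of accepted documents that are relevant; None if nothing was accepted."""
    accepted = tp + fp
    return tp / accepted if accepted else None


def recall(tp: int, fn: int) -> Optional[float]:
    """Fraction of relevant documents that were accepted; None if none are relevant."""
    relevant = tp + fn
    return tp / relevant if relevant else None


def f1(tp: int, fp: int, fn: int) -> Optional[float]:
    """Unsmoothed F-measure; None whenever a component or the harmonic mean is undefined."""
    p, r = precision(tp, fp), recall(tp, fn)
    if p is None or r is None or p + r == 0:
        return None
    return 2 * p * r / (p + r)


def smoothed_f1(tp: int, fp: int, fn: int, k: float = 1.0) -> float:
    """F-measure after adding k to each count (hypothetical additive smoothing)."""
    tp_s, fp_s, fn_s = tp + k, fp + k, fn + k
    p = tp_s / (tp_s + fp_s)
    r = tp_s / (tp_s + fn_s)
    return 2 * p * r / (p + r)


# A filter that rejects every document: 5 relevant documents, none accepted.
print(f1(tp=0, fp=0, fn=5))           # undefined without smoothing
print(smoothed_f1(tp=0, fp=0, fn=5))  # defined once the counts are smoothed
```

With additive smoothing every count is strictly positive, so the measure is defined for any system output, including a filter that accepts nothing. Whether a smoothed measure then satisfies the paper's two formal constraints is a separate question that this sketch does not address.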

This research was partially supported by the Spanish Ministry of Science and Innovation (Holopedia Project, TIN2010-21128-C02) and the Regional Government of Madrid and the European Social Fund under MA2VICMR (S2009/TIC-1542).

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amigó, E., Gonzalo, J., Verdejo, F. (2011). A Comparison of Evaluation Metrics for Document Filtering. In: Forner, P., Gonzalo, J., Kekäläinen, J., Lalmas, M., de Rijke, M. (eds) Multilingual and Multimodal Information Access Evaluation. CLEF 2011. Lecture Notes in Computer Science, vol 6941. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23708-9_6

  • DOI: https://doi.org/10.1007/978-3-642-23708-9_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23707-2

  • Online ISBN: 978-3-642-23708-9

  • eBook Packages: Computer Science, Computer Science (R0)
