Testing and Best Practices

  • Michael K. Bergman


When we process information to identify relations or extract entities, to type or classify them, or to fill out their attributes, we need to gauge how well our algorithms work. KM differs in a couple of ways from traditional scientific hypothesis testing. The problems we deal with in information retrieval (IR), natural language processing or understanding (NLP), and machine learning (ML) are all statistical classification problems, specifically binary classification. The most common method for scoring the ‘accuracy’ of these classifications uses statistical tests based on two distinctions: positive or negative, and true or false. We discuss a variety of statistical tests that use the four possible results of these distinctions (e.g., false positive). Testing scripts range from standard unit tests applied against platform tools to scripts that check coherency and consistency across the knowledge structure, create reference standards for machine learning, or inform improvements. We offer best practices learned from client deployments in areas such as data treatment and dataset management, creating and using knowledge structures, and testing, analysis, and documentation. Modularity in knowledge graphs, consistent attention to UTF-8 encoding in data structures, emphasis on ‘semi-automatic’ approaches, and the use of literate programming and notebooks to record tests and procedures are just a few examples where the line blurs between standard and best practices. Finding ways to identify and agree upon shared vocabularies and understandings is a central task of modeling the domain, and it involves practices in collaboration, naming, and use of these knowledge structures.
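As a minimal sketch of how the four possible results feed the usual scores, the Python below tallies true and false positives and negatives from paired binary labels and derives precision, recall, F1, and accuracy. The names (ConfusionCounts, tally, scores) and the sample labels are hypothetical illustrations, not taken from the chapter, and returning 0.0 for an undefined ratio is an assumed convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """The four possible results of a binary classification test."""
    tp: int  # true positives: predicted positive, actually positive
    fp: int  # false positives: predicted positive, actually negative
    tn: int  # true negatives: predicted negative, actually negative
    fn: int  # false negatives: predicted negative, actually positive


def tally(gold, predicted):
    """Count the four outcomes for paired binary labels (True = positive)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        if p and g:
            tp += 1
        elif p and not g:
            fp += 1
        elif not p and not g:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Assumed convention: an undefined ratio (zero denominator) scores 0.0.
    return numerator / denominator if denominator else 0.0


def scores(c):
    """Derive common scores from the four counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Hypothetical labels: gold is the reference standard, predicted is the tool output.
    gold = [True, True, True, False, False, False, False, True]
    predicted = [True, True, False, True, False, False, False, True]
    counts = tally(gold, predicted)
    print(counts)
    print(scores(counts))
```

Here the gold list plays the role of a reference (gold) standard, so the same tally can be rerun whenever the extraction or classification method changes.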


Keywords: Testing; Best practices; Statistical tests; Gold standards



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Michael K. Bergman, Cognonto Corporation, Coralville, USA
