Abstract
When we process information to identify relations or extract entities, to type or classify them, or to fill out their attributes, we need to gauge how well our algorithms work. Knowledge management (KM) differs in a few respects from traditional scientific hypothesis testing. The problems we deal with in information retrieval (IR), natural language understanding or processing (NLP), and machine learning (ML) are all statistical classification problems, specifically binary classification. The most common scoring methods for gauging the ‘accuracy’ of these classifiers rest on two dimensions: positive or negative, and true or false. We discuss a variety of statistical tests built on the four possible results from these two dimensions (e.g., false positive). Testing scripts range from standard unit tests applied against platform tools to ones that check coherency and consistency across the knowledge structure, create reference standards for machine learning, or inform improvements. We offer best practices learned from client deployments in areas such as data treatment and dataset management; creating and using knowledge structures; and testing, analysis, and documentation. Modularity in knowledge graphs, consistent attention to UTF-8 encoding in data structures, emphasis on ‘semi-automatic’ approaches, and use of literate programming and notebooks to record tests and procedures are just a few of the examples where lines blur between standard and best practices. Finding ways to identify and agree upon shared vocabularies and understandings is a central task of modeling the domain, and it involves practices in collaboration, naming, and use of these knowledge structures.
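As an illustrative sketch only (not code from the chapter), the four possible results named in the abstract — true positive, false positive, true negative, false negative — yield the standard classification scores; the function name, counts, and guard clauses here are my own assumptions:

```python
def classification_scores(tp, fp, tn, fn):
    """Derive common scores from the four confusion-matrix counts."""
    total = tp + fp + tn + fn
    # Accuracy: share of all decisions that were correct.
    accuracy = (tp + tn) / total
    # Precision: share of positive calls that were truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of true positives the classifier actually found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for a binary entity-extraction test run.
scores = classification_scores(tp=80, fp=20, tn=890, fn=10)
```

Note that with heavily imbalanced data (many true negatives), accuracy can look high while precision and recall remain informative, which is why F-measure-style scores are the usual choice for extraction tasks.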
Notes
- 1.
Some material in this chapter was drawn from the author’s prior articles at the AI3:::Adaptive Information blog: “Listening to the Enterprise: Total Open Solutions, Part 1” (May 2010); “Using Wikis as Pre-Packaged Knowledge Bases” (Jul 2010); “A Reference Guide to Ontology Best Practices” (Sep 2010); “The Conditional Costs of Free” (Feb 2012); “Why Clojure?” (Dec 2014); “A Primer on Knowledge Statistics” (May 2015); “Literate Programming for an Open World” (Jun 2016); “Gold Standards in Enterprise Knowledge Projects” (Jul 2016).
- 2.
The Open Semantic Framework wiki is a contributor to content in this chapter, particularly “NLP and Knowledge Statistics” (http://wiki.opensemanticframework.org/index.php/NLP_and_Knowledge_Statistics) and “Ontology Best Practices” (http://wiki.opensemanticframework.org/index.php/Ontology_Best_Practices).
- 3.
I refer here to statistical classification; clearly, language meanings are not binary but nuanced.
- 4.
- 5.
- 6.
A vocabulary of linking predicates would capture the variety and degrees to which individuals, instances, classes, and concepts are similar or related to objects in other datasets. This purpose differs from that of, say, voiD (Vocabulary of Interlinked Datasets), whose purpose is to provide descriptive metadata about the nature of particular datasets.
- 7.
As another commentary on the importance of definitions, see http://ontologyblog.blogspot.com/2010/09/physician-decries-lack-of-definitions.html.
- 8.
The Protégé manual [7] is also a source of good tips, especially with regard to naming conventions and the use of the editor.
- 9.
References
G. Hripcsak, A.S. Rothschild, Agreement, the F-measure, and reliability in information retrieval. J. Am. Med. Inform. Assoc. 12, 296–298 (2005)
E. Miltsakaki, R. Prasad, A.K. Joshi, B.L. Webber, The Penn Discourse Treebank (2004)
P.V. Ogren, G.K. Savova, C.G. Chute, Constructing evaluation Corpora for automated clinical named entity recognition, in Medinfo 2007: Proceedings of the 12th World Congress on Health (Medical) Informatics; Building Sustainable Health Systems (IOS Press, Amsterdam, 2007), pp. 2325
V. Stoyanov, C. Cardie, Topic identification for fine-grained opinion analysis, in Proceedings of the 22nd International Conference on Computational Linguistics-Volume (2008), pp. 817–824
K. Dellschaft, S. Staab, On how to perform a gold standard based evaluation of ontology learning, in The Semantic Web-ISWC (Springer, Berlin, Heidelberg, 2006), pp. 228–241
KBART Phase II Working Group, KBART: Knowledge Bases and Related Tools Recommended Practice (NISO, Baltimore, MD, 2014)
M. Horridge, S. Jupp, G. Moulton, A. Rector, R. Stevens, C. Wroe, A Practical Guide to Building OWL Ontologies Using Protégé and CO-ODE Tools (University of Manchester, Manchester, 2007)
E.P.B. Simperl, C. Tempich, Ontology engineering: a reality check, in On the Move to Meaningful Internet Systems (Springer, New York, 2006), pp. 836–854
F. Giasson, Exploding the Domain (Frederick Giasson, 2008)
D. Sculley, G. Holt, D. Golovin, E. Davydov, T. Phillips, D. Ebner, V. Chaudhary, M. Young, J.-F. Crespo, D. Dennison, Hidden technical debt in machine learning systems, in Advances in Neural Information Processing Systems (2015), pp. 2503–2511
K. Jalan, How to Improve Machine Learning Performance? Lessons from Andrew Ng. https://www.kdnuggets.com/2017/12/improve-machine-learning-performance-lessons-andrew-ng.html
D.E. Knuth, Literate programming. Comput. J. 27, 97–111 (1984)
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Bergman, M.K. (2018). Testing and Best Practices. In: A Knowledge Representation Practionary. Springer, Cham. https://doi.org/10.1007/978-3-319-98092-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98091-1
Online ISBN: 978-3-319-98092-8
eBook Packages: Computer Science; Computer Science (R0)