Abstract
Acquisition of knowledge from data is the quintessential task of machine learning. The knowledge we extract this way might not be suitable for immediate use, so one or more data postprocessing methods may need to be applied as well. Data postprocessing includes the integration, filtering, evaluation, and explanation of acquired knowledge. Nomograms, graphical devices for approximate calculations of functions, are a useful tool for visualising and comparing prediction models. It is well known that any generalised additive model can be represented by a quasi-nomogram, that is, a nomogram that requires the reader to perform some summation by hand. Nomograms of this type are widely used, especially in medical prognostics. Existing methods for constructing such a nomogram were developed for specific types of prediction models and therefore assume that the structure of the model is known. In this chapter we extend our previous work on a general method for explaining arbitrary prediction models (classification or regression) to a general methodology for constructing a quasi-nomogram for a black-box prediction model. We show that for an additive model, such a quasi-nomogram is equivalent to the one we would construct if the structure of the model were known.
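The equivalence claimed for additive models can be illustrated with a minimal sketch. It is not the chapter's algorithm, only one simple way to estimate per-feature effects using nothing but calls to the model. The toy model, the uniform sample, and the names `additive_model` and `marginal_effect` are all hypothetical and chosen for illustration.

```python
import random


def additive_model(x):
    """Toy 'black box' that happens to be additive in its three features (hypothetical)."""
    return 2.0 * x[0] + x[1] ** 2 - 3.0 * abs(x[2])


def marginal_effect(model, data, feature, value):
    """Mean prediction with `feature` fixed to `value`, minus the overall mean prediction.

    The remaining features are taken from the rows of `data`, so the model is
    only ever queried, never inspected.
    """
    baseline = sum(model(row) for row in data) / len(data)
    fixed_total = 0.0
    for row in data:
        probe = list(row)
        probe[feature] = value
        fixed_total += model(probe)
    return fixed_total / len(data) - baseline


random.seed(0)
# Assumption: features are sampled independently and uniformly on [-1, 1].
data = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]
baseline = sum(additive_model(row) for row in data) / len(data)

# One axis per feature: the estimated effect on a grid of feature values.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
axes = {j: [marginal_effect(additive_model, data, j, v) for v in grid] for j in range(3)}

# "Reading" the quasi-nomogram: look up each feature's effect and add them up.
x = [0.5, -1.0, 0.0]
reading = baseline + sum(marginal_effect(additive_model, data, j, x[j]) for j in range(3))
```

For an additive model, each estimated effect equals the true effect function shifted by a constant, so summing the per-feature readings and the baseline reproduces the model's prediction for `x`. For a non-additive model the same procedure still yields one axis per feature, but the sum is only an approximation.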
Notes
- 1. www.sciencedirect.com currently lists 1393 research papers that feature the word “nomogram” in the title, keywords, or abstract and were published between 2006 and 2015. Most of them are from the medical field.
- 2. Linear regression is, of course, just a special case of a generalised additive model with an identity link function and linear effect functions.
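The special case in note 2 can be written out explicitly. The symbols below (link function g, intercept β₀, effect functions fᵢ, coefficients βᵢ) are standard notation assumed here for illustration and do not appear in the text above.

```latex
% Generalised additive model: a link function g applied to the expected response
g\bigl(\mathbb{E}[y \mid x]\bigr) = \beta_0 + \sum_{i=1}^{n} f_i(x_i)

% Linear regression: identity link g(\mu) = \mu and linear effects f_i(x_i) = \beta_i x_i
\mathbb{E}[y \mid x] = \beta_0 + \sum_{i=1}^{n} \beta_i x_i
```

In nomogram terms, each effect function corresponds to one axis, and the sum over the axes is the step left to the reader.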
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Štrumbelj, E., Kononenko, I. (2018). Explaining the Predictions of an Arbitrary Prediction Model: Feature Contributions and Quasi-nomograms. In: Zhou, J., Chen, F. (eds) Human and Machine Learning. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-90403-0_8
DOI: https://doi.org/10.1007/978-3-319-90403-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-90402-3
Online ISBN: 978-3-319-90403-0
eBook Packages: Computer Science, Computer Science (R0)