Abstract
Knowledge Discovery in Databases (KDD), and pattern mining in particular, can be interpreted along several dimensions, namely data, knowledge, problem-solving, and interactivity. These dimensions are interconnected and directly affect the quality, applicability, and efficiency of KDD. Accordingly, we discuss some objectives of KDD based on these dimensions, namely exploration, knowledge orientation, hybridization, and explanation. The data space and the pattern space can be explored in several ways, depending on specific evaluation functions and heuristics, possibly related to domain knowledge. Furthermore, numerical data are complex, and supervised numerical machine learning methods are usually the best candidates for mining such data efficiently. However, the operation and output of numerical methods are often hard to understand, while symbolic methods are usually more intelligible. This calls for hybridization, that is, combining numerical and symbolic mining methods to improve the applicability and interpretability of KDD. Moreover, KDD should be completed by suitable explanations of the operating models and of possible subsequent decisions, which is far from being the case at present. To illustrate these dimensions and objectives, we analyze a concrete case on the mining of biological data, in which we characterize these dimensions and their connections. We also discuss the dimensions and objectives in the framework of Formal Concept Analysis and outline some perspectives for future research.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Couceiro, M., Napoli, A. (2019). Elements About Exploratory, Knowledge-Based, Hybrid, and Explainable Knowledge Discovery. In: Cristea, D., Le Ber, F., Sertkaya, B. (eds) Formal Concept Analysis. ICFCA 2019. Lecture Notes in Computer Science, vol 11511. Springer, Cham. https://doi.org/10.1007/978-3-030-21462-3_1
DOI: https://doi.org/10.1007/978-3-030-21462-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21461-6
Online ISBN: 978-3-030-21462-3
eBook Packages: Computer Science (R0)