Abstract
This chapter discusses exceptional pattern discovery: the scenarios, contexts, and techniques for mining patterns that are so rare or so frequent as to be considered exceptional and, therefore, of interest to an expert seeking to shed light on the domain. Frequent patterns have found broad application in areas such as association rule mining, indexing, and clustering [1, 20, 23]. They have also been applied with some success to the classification of relational data [6, 13, 14, 19, 25], text [15], and graphs [7]. The chapter is organized as follows. First, frequent pattern mining on classical datasets is presented. This topic is not directly related to the focus of the present work, which is mainly oriented toward finding discriminating patterns, but it represents the starting point. Subsequently, Sect. 3.2 describes scenarios where patterns are exploited to discriminate between populations. Sections 3.3 and 3.4 illustrate how to mine patterns on networks and on biological data, respectively.
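To make the two notions mentioned above concrete, the following minimal sketch computes the support of an itemset (how frequent it is in a dataset) and the ratio of its supports in two populations, often called the growth rate in the emerging-pattern literature. It is an illustration under stated assumptions, not the chapter's method: the function names, the toy `healthy` and `disease` datasets, and the brute-force candidate enumeration (rather than an Apriori-style pruned search) are all hypothetical choices made for brevity.

```python
from itertools import combinations


def support(pattern, transactions):
    """Fraction of transactions that contain every item of `pattern`."""
    pattern = frozenset(pattern)
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)


def frequent_patterns(transactions, min_support, max_size=2):
    """Brute-force enumeration of itemsets up to `max_size` items
    whose support is at least `min_support` (no candidate pruning)."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result


def growth_rate(pattern, background, target):
    """Ratio of the pattern's support in `target` to its support in
    `background`; infinite when the pattern occurs only in `target`."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        return float("inf") if s_tg > 0 else 0.0
    return s_tg / s_bg


# Hypothetical toy populations: each transaction is a set of items.
healthy = [frozenset(t) for t in (["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"])]
disease = [frozenset(t) for t in (["a", "d"], ["c", "d"], ["a", "c", "d"], ["b", "d"])]

print(frequent_patterns(healthy, min_support=0.5))
print(growth_rate({"d"}, healthy, disease))
```

In this toy setting the item `d` never occurs in `healthy` but occurs in every `disease` transaction, so its growth rate is infinite; a pattern of this kind discriminates between the two populations even though it would not stand out from frequency in a single dataset alone.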
References
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB’94, pp. 487–499. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1994). http://dl.acm.org/citation.cfm?id=645920.672836
Ahn, Y., Bagrow, J., Lehmann, S.: Link communities reveal multiscale complexity in networks. Nature 466, 761–764 (2010)
Angiulli, F., Fassetti, F., Manco, G., Palopoli, L.: Outlying property detection with numerical attributes. Data Min. Knowl. Discov. 31(1), 134–163 (2017)
Angiulli, F., Fassetti, F., Palopoli, L.: Un metodo per la scoperta di proprietà inattese. In: SEBD, pp. 321–328 (2006)
Angiulli, F., Fassetti, F., Palopoli, L.: Detecting outlying properties of exceptional objects. ACM Trans. Database Syst. 34(1) (2009)
Angiulli, F., Fassetti, F., Palopoli, L.: Discovering characterizations of the behavior of anomalous subpopulations. IEEE Trans. Knowl. Data Eng. 25(6), 1280–1292 (2013)
Atias, N., Sharan, R.: Comparative analysis of protein networks: hard problems, practical solutions. Commun. ACM 55(5), 88–97 (2012)
Bailey, J., Manoukian, T., Ramamohanarao, K.: Fast algorithms for mining emerging patterns. In: Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pp. 39–50. Springer-Verlag, London, UK (2002)
Bailey, J., Manoukian, T., Ramamohanarao, K.: Classification using constrained emerging patterns. In: Advances in Web-Age Information Management, pp. 226–237. Springer-Verlag (2003)
Barabasi, A.L., Gulbahce, N., Loscalzo, J.: Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12(1), 56–68 (2011)
Bay, S.D., Pazzani, M.J.: Detecting group differences: mining contrast sets. Data Min. Knowl. Discov. 5(3), 213–246 (2001)
Chen, J.C., Alvarez, M.J., Talos, F., et al.: Identification of causal genetic drivers of human disease through systems-level analysis of regulatory networks. Cell 159(2), 402–414 (2014)
Ferraro, N., et al.: Asymmetric comparison and querying of biological networks. IEEE/ACM Trans. Comput. Biol. Bioinform. 8, 876–889 (2011)
Georgii, E., et al.: Enumeration of condition-dependent dense modules in protein interaction networks. Bioinformatics 25(7), 933–940 (2009)
Goh, K.I., Cusick, M.E., Valle, D., Childs, B., Vidal, M., Barabasi, A.L.: The human disease network. Proc. Natl. Acad. Sci. USA 104(21), 8685–8690 (2007)
Gouda, K., Zaki, M.J.: GenMax: An efficient algorithm for mining maximal frequent itemsets. Data Min. Knowl. Discov. 11(3), 223–242 (2005). doi:10.1007/s10618-005-0002-x
Huan, J., Wang, W., Prins, J., Yang, J.: SPIN: Mining maximal frequent subgraphs from graph databases. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 581–586 (2004)
Jiang, C., Coenen, F., Zito, M.: Frequent sub-graph mining on edge weighted graphs. In: Proceedings of the International Conference on Data Warehousing and Knowledge Discovery (DAWAK), pp. 77–88 (2010)
Jiang, P., Singh, M.: SPICi: a fast clustering algorithm for large biological networks. Bioinformatics 26(8), 1105–1111 (2010)
Klösgen, W.: Explora: A multipattern and multistrategy discovery assistant. In: Advances in Knowledge Discovery and Data Mining (KDD), pp. 249–271 (1996)
Koyutürk, M., Grama, A., Szpankowski, W.: An efficient algorithm for detecting frequent subgraphs in biological networks. Bioinformatics 20(1), 200–207 (2004)
Koyutürk, M., Kim, Y., Subramaniam, S., Szpankowski, W., Grama, A.: Detecting conserved interaction patterns in biological networks. J. Comput. Biol. 13(7), 1299–1322 (2006)
Li, J., Dong, G., Ramamohanarao, K.: Making use of the most expressive jumping emerging patterns for classification. Knowl. Inf. Syst. 3(2), 1–29 (2001)
Li, J., Wong, L.: Emerging patterns and gene expression data. Genome Inf. 12, 3–13 (2001)
Liu, B., Hsu, W., Ma, Y.: Discovering the set of fundamental rule changes. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 335–340 (2001)
Liu, X., Wu, J., Gu, F., Wang, J., He, Z.: Discriminative pattern mining and its applications in bioinformatics. Brief. Bioinform. 16(5), 884–900 (2015)
Milo, R., et al.: Network motifs: Simple building blocks of complex networks. Science 298(5594), 824–827 (2002)
Novak, P.K., Lavrac, N., Webb, G.I.: Supervised descriptive rule discovery: a unifying survey of contrast set, emerging pattern and subgroup mining. J. Mach. Learn. Res. 10, 377–403 (2009)
Panni, S., Rombo, S.E.: Searching for repetitions in biological networks: methods, resources and tools. Brief. Bioinform. 16(1), 118–136 (2015)
Pizzuti, C., Rombo, S.E.: Algorithms and tools for protein-protein interaction networks clustering, with a special focus on population-based stochastic methods. Bioinformatics 30(10), 1343–1352 (2014)
Pizzuti, C., Rombo, S.E., Marchiori, E.: Complex detection in protein-protein interaction networks: a compact overview for researchers and practitioners. In: 10th European Conference of Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBio), pp. 211–223 (2012)
Ramamohanarao, K., Bailey, J., Fan, H.: Efficient mining of contrast patterns and their applications to classification. In: Proceedings of the International Conference on Intelligent Sensing and Information Processing (ICISIP), pp. 39–47 (2005)
Ranu, S., Singh, A.K.: GraphSig: a scalable approach to mining significant subgraphs in large graph databases. In: Proceedings of the IEEE International Conference on Data Engineering, pp. 844–855 (2009)
Shao, Z., Hirayama, Y., Yamanishi, Y., Saigo, H.: Mining discriminative patterns from graph data with multiple labels and its application to quantitative structure-activity relationship (QSAR) models. J. Chem. Inf. Model. 55(12), 2519–2527 (2015)
Singh, R., Xu, J., Berger, B.: IsoRank: global alignment of multiple protein interaction networks with applications to functional orthology detection. Proc. Natl. Acad. Sci. 105(35), 12763–12768 (2008)
Ting, R.M.H., Bailey, J.: Mining minimal contrast subgraph patterns. In: SIAM International Conference on Data Mining (SDM) (2006)
Vidal, M., Cusick, M.E., Barabasi, A.L.: Interactome networks and human disease. Cell 144(6), 986–998 (2011)
Wang, Z., Zhao, Y., Wang, G., Li, Y., Wang, X.: On extending extreme learning machine to non-redundant synergy pattern based graph classification. Neurocomputing 149, Part A(0), 330–339 (2015)
Yan, X., Cheng, H., Han, J., Yu, P.S.: Mining significant graph patterns by leap search. In: ACM SIGMOD International Conference on Management of data, pp. 433–444. ACM (2008)
Yan, X., Han, J.: gSpan: Graph-based substructure pattern mining. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 721–724 (2002)
Zaki, M.J., Hsiao, C.J.: CHARM: An efficient algorithm for closed itemset mining. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp. 457–473 (2002)
Zeng, Z., Wang, J., Zhou, L.: Efficient mining of minimal distinguishing subgraph patterns from graph databases. In: Advances in Knowledge Discovery and Data Mining, pp. 1062–1068 (2008)
Zhang, X., Dong, G., Kotagiri, R.: Exploring constraints to efficiently mine emerging patterns from large high-dimensional datasets. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’00, pp. 310–314. ACM, New York, NY, USA (2000). doi:10.1145/347090.347158
Copyright information
© 2017 The Author(s)
About this chapter
Cite this chapter
Fassetti, F., Rombo, S.E., Serrao, C. (2017). Exceptional Pattern Discovery. In: Discriminative Pattern Discovery on Biological Networks. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-63477-7_3
DOI: https://doi.org/10.1007/978-3-319-63477-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63476-0
Online ISBN: 978-3-319-63477-7
eBook Packages: Computer Science (R0)