Exceptional Pattern Discovery

Discriminative Pattern Discovery on Biological Networks

Abstract

This chapter discusses exceptional pattern discovery: the scenarios, contexts, and techniques for mining patterns that are so rare or so frequent as to be considered exceptional and therefore of interest to an expert seeking to shed light on the domain. Frequent patterns have found broad application in areas such as association rule mining, indexing, and clustering [1, 20, 23]. They have also been applied with some success to the classification of relational data [6, 13, 14, 19, 25], text [15], and graphs [7]. The chapter is organized as follows. First, frequent pattern mining on classical datasets is presented. This topic is not directly related to the main focus of the present work, which is finding discriminating patterns, but it represents the starting point. Subsequently, Sect. 3.2 describes scenarios where patterns are exploited to discriminate between populations. Sections 3.3 and 3.4 illustrate how to mine patterns on networks and on biological data, respectively.

References

  1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB’94, pp. 487–499. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1994). http://dl.acm.org/citation.cfm?id=645920.672836

  2. Ahn, Y., Bagrow, J., Lehmann, S.: Link communities reveal multiscale complexity in networks. Nature 466, 761–764 (2010)

  3. Angiulli, F., Fassetti, F., Manco, G., Palopoli, L.: Outlying property detection with numerical attributes. Data Min. Knowl. Discov. 31(1), 134–163 (2017)

  4. Angiulli, F., Fassetti, F., Palopoli, L.: Un metodo per la scoperta di proprietà inattese. In: SEBD, pp. 321–328 (2006)

  5. Angiulli, F., Fassetti, F., Palopoli, L.: Detecting outlying properties of exceptional objects. ACM Trans. Database Syst. 34(1) (2009)

  6. Angiulli, F., Fassetti, F., Palopoli, L.: Discovering characterizations of the behavior of anomalous subpopulations. IEEE Trans. Knowl. Data Eng. 25(6), 1280–1292 (2013)

  7. Atias, N., Sharan, R.: Comparative analysis of protein networks: hard problems, practical solutions. Commun. ACM 55(5), 88–97 (2012)

  8. Bailey, J., Manoukian, T., Ramamohanarao, K.: Fast algorithms for mining emerging patterns. In: Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), pp. 39–50. Springer-Verlag, London, UK (2002)

  9. Bailey, J., Manoukian, T., Ramamohanarao, K.: Classification using constrained emerging patterns. In: Advances in Web-Age Information Management, pp. 226–237. Springer-Verlag (2003)

  10. Barabasi, A.L., Gulbahce, N., Loscalzo, J.: Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12(1), 56–68 (2011)

  11. Bay, S.D., Pazzani, M.J.: Detecting group differences: mining contrast sets. Data Min. Knowl. Discov. 5(3), 213–246 (2001)

  12. Chen, J.C., Alvarez, M.J., Talos, F., et al.: Identification of causal genetic drivers of human disease through systems-level analysis of regulatory networks. Cell 159(2), 402–414 (2014)

  13. Ferraro, N., et al.: Asymmetric comparison and querying of biological networks. IEEE/ACM Trans. Comput. Biol. Bioinform. 8, 876–889 (2011)

  14. Georgii, E., et al.: Enumeration of condition-dependent dense modules in protein interaction networks. Bioinformatics 25(7), 933–940 (2009)

  15. Goh, K.I., Cusick, M.E., Valle, D., Childs, B., Vidal, M., Barabasi, A.L.: The human disease network. Proc. Natl. Acad. Sci. USA 104(21), 8685–8690 (2007)

  16. Gouda, K., Zaki, M.J.: Genmax: An efficient algorithm for mining maximal frequent itemsets. Data Min. Knowl. Discov. 11(3), 223–242 (2005). doi:10.1007/s10618-005-0002-x

  17. Huan, J., Wang, W., Prins, J., Yang, J.: Spin: Mining maximal frequent subgraphs from graph databases. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 581–586 (2004)

  18. Jiang, C., Coenen, F., Zito, M.: Frequent sub-graph mining on edge weighted graphs. In: Proceedings of the International Conference on Data Warehousing and Knowledge Discovery (DAWAK), pp. 77–88 (2010)

  19. Jiang, P., Singh, M.: SPICi: a fast clustering algorithm for large biological networks. Bioinformatics 26(8), 1105–1111 (2010)

  20. Klösgen, W.: Explora: A multipattern and multistrategy discovery assistant. In: Advances in Knowledge Discovery and Data Mining (KDD), pp. 249–271 (1996)

  21. Koyuturk, M., Grama, A., Szpankowski, W.: An efficient algorithm for detecting frequent subgraphs in biological networks. Bioinformatics 20(1), 200–207 (2004)

  22. Koyutürk, M., Kim, Y., Subramaniam, S., Szpankowski, W., Grama, A.: Detecting conserved interaction patterns in biological networks. J. Comput. Biol. 13(7), 1299–1322 (2006)

  23. Li, J., Dong, G., Ramamohanarao, K.: Making use of the most expressive jumping emerging patterns for classification. Knowl. Inf. Syst. 3(2), 1–29 (2001)

  24. Li, J., Wong, L.: Emerging patterns and gene expression data. Genome Inf. 12, 3–13 (2001)

  25. Liu, B., Hsu, W., Ma, Y.: Discovering the set of fundamental rule changes. In: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 335–340 (2001)

  26. Liu, X., Wu, J., Gu, F., Wang, J., He, Z.: Discriminative pattern mining and its applications in bioinformatics. Brief. Bioinform. 16(5), 884–900 (2015)

  27. Milo, R., et al.: Network motifs: Simple building blocks of complex networks. Science 298(5594), 824–827 (2002)

  28. Novak, P.K., Lavrac, N., Webb, G.I.: Supervised descriptive rule discovery: a unifying survey of contrast set, emerging pattern and subgroup mining. J. Mach. Learn. Res. 10, 377–403 (2009)

  29. Panni, S., Rombo, S.E.: Searching for repetitions in biological networks: methods, resources and tools. Brief. Bioinform. 16(1), 118–136 (2015)

  30. Pizzuti, C., Rombo, S.E.: Algorithms and tools for protein-protein interaction networks clustering, with a special focus on population-based stochastic methods. Bioinformatics 30(10), 1343–1352 (2014)

  31. Pizzuti, C., Rombo, S.E., Marchiori, E.: Complex detection in protein-protein interaction networks: a compact overview for researchers and practitioners. In: 10th European Conference of Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBio), pp. 211–223 (2012)

  32. Ramamohanarao, K., Bailey, J., Fan, H.: Efficient mining of contrast patterns and their applications to classification. In: Proceedings of the International Conference on Intelligent Sensing and Information Processing (ICISIP), pp. 39–47 (2005)

  33. Ranu, S., Singh, A.K.: Graphsig: a scalable approach to mining significant subgraphs in large graph databases. In: Proceedings of the IEEE International Conference on Data Engineering, pp. 844–855 (2009)

  34. Shao, Z., Hirayama, Y., Yamanishi, Y., Saigo, H.: Mining discriminative patterns from graph data with multiple labels and its application to quantitative structure-activity relationship (QSAR) models. J. Chem. Inf. Model. 55(12), 2519–2527 (2015)

  35. Singh, R., Xu, J., Berger, B.: Isorank: global alignment of multiple protein interaction networks with applications to functional orthology detection. Proc. Natl. Acad. Sci. 105(35), 12763–12768 (2008)

  36. Ting, R.M.H., Bailey, J.: Mining minimal contrast subgraph patterns. In: SIAM International Conference on Data Mining (SDM) (2006)

  37. Vidal, M., Cusick, M.E., Barabasi, A.L.: Interactome networks and human disease. Cell 144(6), 986–998 (2011)

  38. Wang, Z., Zhao, Y., Wang, G., Li, Y., Wang, X.: On extending extreme learning machine to non-redundant synergy pattern based graph classification. Neurocomputing 149, Part A(0), 330–339 (2015)

  39. Yan, X., Cheng, H., Han, J., Yu, P.S.: Mining significant graph patterns by leap search. In: ACM SIGMOD International Conference on Management of data, pp. 433–444. ACM (2008)

  40. Yan, X., Han, J.: gspan: Graph-based substructure pattern mining. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 721–724 (2002)

  41. Zaki, M.J., Hsiao, C.J.: Charm: An efficient algorithm for closed itemset mining. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp. 457–473 (2002)

  42. Zeng, Z., Wang, J., Zhou, L.: Efficient mining of minimal distinguishing subgraph patterns from graph databases. In: Advances in Knowledge Discovery and Data Mining, pp. 1062–1068 (2008)

  43. Zhang, X., Dong, G., Kotagiri, R.: Exploring constraints to efficiently mine emerging patterns from large high-dimensional datasets. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’00, pp. 310–314. ACM, New York, NY, USA (2000). doi:10.1145/347090.347158

Author information

Corresponding author

Correspondence to Fabio Fassetti.

Copyright information

© 2017 The Author(s)

About this chapter

Cite this chapter

Fassetti, F., Rombo, S.E., Serrao, C. (2017). Exceptional Pattern Discovery. In: Discriminative Pattern Discovery on Biological Networks. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-63477-7_3

  • DOI: https://doi.org/10.1007/978-3-319-63477-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63476-0

  • Online ISBN: 978-3-319-63477-7

  • eBook Packages: Computer Science (R0)
