Discovering Collective Group Relationships

  • S. M. Masud Karim
  • Lin Liu
  • Jiuyong Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8506)


In many real-world situations, individual components of complex systems tend to form groups to interact collectively. The grouping gives rise to collective relationships; conversely, collective relationships stimulate individual components to form groups. To gain a clear understanding of the structure and functioning of these systems, it is necessary to identify both group formation and collective relationships at the same time. In this paper, we define the notion of collective group relationships (CGRs) between two sets of individual components and propose a method to discover CGRs from heterogeneous datasets. The method integrates canonical correlation analysis (CCA) with graph mining to find the top-k CGRs. Several experimental studies are conducted on both synthetic and real-world datasets to demonstrate the effectiveness and efficiency of the proposed method.
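For readers unfamiliar with CCA, the following is a minimal sketch of classical (Hotelling-style) canonical correlation analysis in NumPy. It is not the method proposed in the paper: the graph-mining step and the top-k selection are not shown, and the function name `first_canonical_pair`, the ridge term `reg`, and the toy data are all assumptions made purely for illustration.

```python
import numpy as np

def first_canonical_pair(X, Y, reg=1e-8):
    """Return (rho, a, b): the largest canonical correlation between the
    column sets of X and Y, and the weight vectors achieving it.
    `reg` is a small ridge term (an assumption) for numerical stability."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten both blocks with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the leading singular vectors back to the original coordinates.
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return s[0], a, b

# Toy data (hypothetical): the first column of each block shares a latent signal z.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
X = np.column_stack([z + 0.1 * rng.normal(size=500), rng.normal(size=(500, 2))])
Y = np.column_stack([z + 0.1 * rng.normal(size=500), rng.normal(size=(500, 3))])
rho, a, b = first_canonical_pair(X, Y)
print(f"first canonical correlation: {rho:.3f}")
```

In this sketch the weight vectors `a` and `b` indicate which components of each set contribute most to the correlated pair, which is one plausible reading of how a CCA step could point to candidate groups on each side.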


Keywords: collective group relationships, group pair, canonical correlations, quasi-cliques





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • S. M. Masud Karim (1)
  • Lin Liu (1)
  • Jiuyong Li (1)
  1. School of Information Technology and Mathematical Sciences, University of South Australia, Mawson Lakes, Australia
