Application of Graph Clustering and Visualisation Methods to Analysis of Biomolecular Data

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 838))

Abstract

In this paper we present an approach that integrates graph clustering and visualisation methods for semi-supervised discovery of biologically significant features in biomolecular data sets. We describe several clustering algorithms custom-designed for the analysis of biomolecular data. They share an iterated two-step approach: thresholds and other clustering parameters are computed first, and connected graph components are then identified; if needed, the clustering parameters are adjusted for processing individual subgraphs.
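The iterated threshold-then-components idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the edge-weight threshold rule, the `max_size` stopping criterion, the fixed `step` increment and all function names are assumptions made only for illustration.

```python
from collections import defaultdict


def connected_components(nodes, edges, threshold):
    """Return components of the graph that keeps only edges with weight >= threshold."""
    adj = defaultdict(set)
    for (u, v), weight in edges.items():
        if weight >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            comp.add(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(comp)
    return components


def iterative_clusters(nodes, edges, threshold, max_size=3, step=0.1, max_threshold=1.0):
    """Split the graph into components, re-clustering oversized ones with a stricter threshold.

    Assumption: a component is "good enough" once it has at most max_size nodes.
    """
    clusters = []
    for comp in connected_components(nodes, edges, threshold):
        if len(comp) <= max_size or threshold + step > max_threshold:
            clusters.append(comp)
            continue
        # Adjust the parameter for this subgraph only and repeat both steps.
        sub_edges = {(u, v): w for (u, v), w in edges.items() if u in comp and v in comp}
        clusters.extend(
            iterative_clusters(sorted(comp), sub_edges, threshold + step, max_size, step, max_threshold)
        )
    return clusters


# Hypothetical similarity graph: two tight triangles joined by one weak edge.
example_edges = {
    ("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.85,
    ("c", "d"): 0.55,
    ("d", "e"): 0.9, ("e", "f"): 0.95, ("d", "f"): 0.9,
}
example_clusters = iterative_clusters(list("abcdef"), example_edges, threshold=0.5)
```

In this toy graph the initial threshold leaves a single six-node component, so the sketch raises the threshold for that subgraph, which drops the weak edge and separates the two triangles.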

We demonstrate the application of these algorithms in two concrete use cases: (1) analysis of protein coexpression in colorectal cancer cell lines; and (2) protein homology identification from both sequence and structural similarity data.

Acknowledgements

The research was supported by ERDF project 1.1.1.1/16/A/135.

Author information

Corresponding author

Correspondence to Juris Vīksna.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Celms, E. et al. (2018). Application of Graph Clustering and Visualisation Methods to Analysis of Biomolecular Data. In: Lupeikiene, A., Vasilecas, O., Dzemyda, G. (eds) Databases and Information Systems. DB&IS 2018. Communications in Computer and Information Science, vol 838. Springer, Cham. https://doi.org/10.1007/978-3-319-97571-9_20

  • DOI: https://doi.org/10.1007/978-3-319-97571-9_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-97570-2

  • Online ISBN: 978-3-319-97571-9

  • eBook Packages: Computer Science (R0)
