Functional Interpretation of Gene Sets: Semantic-Based Clustering of Gene Ontology Terms on the BioTest Platform

  • Conference paper
Man-Machine Interactions 5 (ICMMI 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 659)

Abstract

Modern high-throughput technologies based on genome, transcriptome, or proteome profiling provide an abundance of data that must be processed, analyzed, and finally interpreted. Effective and efficient analysis of molecular profiling data is crucial for detailed diagnosis, prognosis, and prediction of therapy outcome. Meaningful conclusions can be drawn only with sophisticated methods for the analysis and interpretation of biomedical and molecular data. In this study we present an approach to the functional interpretation of gene or protein sets using clusters of Gene Ontology (GO) terms. We analyze transcription profiles of the human cell line K562 and show that clustering groups functionally related GO terms, yielding a more concise and comprehensive description. By applying a cluster-specific data aggregation tool, we can calculate statistics for individual clusters of GO terms and compare the number of differentially expressed genes between two sample pairs. The presented tool is implemented as part of the annotation module available on the BioTest remote platform for hypothesis testing and analysis of biomedical data.
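The workflow outlined in the abstract (group GO terms by semantic similarity, then aggregate statistics per cluster) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the term labels, gene names, similarity scores, the average-linkage merging rule, and the 0.5 threshold are hypothetical, and the abstract does not state which similarity measure or clustering algorithm the BioTest annotation module uses.

```python
from itertools import combinations

# Hypothetical pairwise semantic similarities between GO terms, in [0, 1].
# In practice such scores would come from a semantic similarity measure;
# the values here are invented.
SIM = {
    frozenset(["inflammatory response", "immune response"]): 0.80,
    frozenset(["inflammatory response", "DNA repair"]): 0.10,
    frozenset(["inflammatory response", "DNA damage response"]): 0.20,
    frozenset(["immune response", "DNA repair"]): 0.10,
    frozenset(["immune response", "DNA damage response"]): 0.15,
    frozenset(["DNA repair", "DNA damage response"]): 0.90,
}

# Hypothetical term-to-gene annotations and differentially expressed (DE)
# gene sets for two sample pairs.
ANNOTATIONS = {
    "inflammatory response": {"g1", "g2"},
    "immune response": {"g2", "g3"},
    "DNA repair": {"g4", "g5"},
    "DNA damage response": {"g5", "g6"},
}
DE_GENES = {"pair_1": {"g1", "g2", "g4"}, "pair_2": {"g3", "g5", "g6"}}


def avg_sim(c1, c2, sim):
    """Average-linkage similarity between two clusters of terms."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(sim[frozenset((a, b))] for a, b in pairs) / len(pairs)


def cluster_terms(terms, sim, threshold):
    """Greedily merge the most similar clusters until none reach the threshold."""
    clusters = [frozenset([t]) for t in terms]
    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), avg_sim(clusters[i], clusters[j], sim))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def de_counts(clusters, annotations, de_genes_by_pair):
    """Count DE genes annotated to each cluster, separately per sample pair."""
    result = {}
    for cluster in clusters:
        genes = set().union(*(annotations[t] for t in cluster))
        result[cluster] = {
            pair: len(genes & de) for pair, de in de_genes_by_pair.items()
        }
    return result


clusters = cluster_terms(list(ANNOTATIONS), SIM, threshold=0.5)
counts = de_counts(clusters, ANNOTATIONS, DE_GENES)
for cluster, per_pair in counts.items():
    print(sorted(cluster), per_pair)
```

With these toy inputs the four terms collapse into two clusters, and the per-cluster counts show how the number of differentially expressed genes can be compared between the two sample pairs at the cluster level rather than term by term.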


Acknowledgements

This work was partially supported by The National Centre for Research and Development grant No. PBS3/B3/32/2015 and was carried out in part within the statutory research project of the Institute of Informatics (RAU2). The presented system was developed and installed on the infrastructure of the Ziemowit computer cluster (www.ziemowit.hpc.polsl.pl) in the Laboratory of Bioinformatics and Computational Biology, The Biotechnology, Bioengineering and Bioinformatics Centre Silesian BIO-FARMA, created in the POIG.02.01.00-00-166/08 project and expanded in the POIG.02.03.01-00-040/13 project.

Author information

Corresponding author

Correspondence to Aleksandra Gruca.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Gruca, A., Jaksik, R., Psiuk-Maksymowicz, K. (2018). Functional Interpretation of Gene Sets: Semantic-Based Clustering of Gene Ontology Terms on the BioTest Platform. In: Gruca, A., Czachórski, T., Harezlak, K., Kozielski, S., Piotrowska, A. (eds) Man-Machine Interactions 5. ICMMI 2017. Advances in Intelligent Systems and Computing, vol 659. Springer, Cham. https://doi.org/10.1007/978-3-319-67792-7_13

  • DOI: https://doi.org/10.1007/978-3-319-67792-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67791-0

  • Online ISBN: 978-3-319-67792-7

  • eBook Packages: Engineering, Engineering (R0)
