Introduction to Functional Bioinformatics

  • Peter Natesan Pushparaj
Chapter

Abstract

Functional bioinformatics is a branch of computational biology that uses the wealth of raw data from genomics, transcriptomics, proteomics, glycomics, lipidomics, metabolomics and other large-scale “Omics” experiments to decode complex gene and protein functions and interactions in health and disease. The number of publications in functional bioinformatics has increased over the last 20 years. In functional genomics, the roles of genes are identified using high-throughput technologies such as microarrays and next-generation sequencing (NGS). Functional bioinformatics examines how genomes, proteomes and metabolomes give rise to different cellular phenotypes, how the same genome functions differently in diverse cell types, and how changes in genomes alter cellular and molecular functions through the differential expression of transcripts or genes (differentially expressed genes, DEGs), which in turn regulates the expression of proteins and metabolites in cells. Various computational tools are used to decipher complex biological information in diverse datasets and to generate precise biological understanding and hypotheses about gene function, protein expression, interactions and regulation in both health and disease. In this chapter, I introduce the high-throughput technologies and data analysis strategies currently used in functional bioinformatics.
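
As a rough illustration of what identifying DEGs can involve, the sketch below combines a log2 fold change with a Benjamini-Hochberg adjustment of per-gene p-values. It is a minimal example under stated assumptions, not the chapter's own method: the function names, the pseudocount, the cutoffs (absolute log2 fold change of at least 1, adjusted p-value of at most 0.05) and the two example genes are all hypothetical, and the raw p-values are assumed to come from a separate statistical test.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """Log2 ratio of mean expression (treated vs control).

    The pseudocount is an illustrative choice to avoid division by zero.
    """
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    return math.log2((mean_t + pseudocount) / (mean_c + pseudocount))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(expression, pvalues, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Select genes passing both cutoffs (cutoffs are assumptions).

    expression: {gene: (treated_values, control_values)}
    pvalues:    {gene: raw p-value from a test run elsewhere}
    """
    genes = list(expression)
    adjusted = benjamini_hochberg([pvalues[g] for g in genes])
    degs = {}
    for gene, q in zip(genes, adjusted):
        treated, control = expression[gene]
        lfc = log2_fold_change(treated, control)
        if abs(lfc) >= lfc_cutoff and q <= fdr_cutoff:
            degs[gene] = (lfc, q)
    return degs


# Hypothetical toy data: two genes, two replicates per condition.
expression = {
    "GENE_A": ([7.0, 7.0], [1.0, 1.0]),
    "GENE_B": ([3.0, 3.0], [3.0, 3.0]),
}
raw_pvalues = {"GENE_A": 0.001, "GENE_B": 0.8}
degs = call_degs(expression, raw_pvalues)
print(sorted(degs))
```

In practice, dedicated packages handle normalization, variance modelling and testing; the sketch only shows how a fold-change filter and a false discovery rate filter combine into a DEG list.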

Keywords

Functional bioinformatics; Functional genomics; Microarrays; Next-generation sequencing; Differentially expressed genes; Computational tools; Health and disease

Abbreviations

AE: ArrayExpress
DAVID: Database for Annotation, Visualization and Integrated Discovery
DEGs: Differentially Expressed Genes
GEO: Gene Expression Omnibus
GO: Gene Ontology
GSEA: Gene Set Enrichment Analysis
HPC: High-Performance Computing
IPA: Ingenuity Pathway Analysis
KEGG: Kyoto Encyclopedia of Genes and Genomes
NGS: Next Generation Sequencing
RMA: Robust Multi-array Average
RNAseq: RNA Sequencing
TAC: Transcriptome Analysis Console

Notes

Acknowledgements

This work is funded by the National Plan for Science, Technology and Innovation (MAARIFAH)-King Abdulaziz City for Science and Technology, The Kingdom of Saudi Arabia, award number 12-BIO2719-03. The authors also acknowledge with thanks the Science and Technology Unit (STU), King Abdulaziz University, for their excellent technical support.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Peter Natesan Pushparaj (1)
  1. Center of Excellence in Genomic Medicine Research, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
