Systems Biology: Generating and Understanding Big Data

  • Stephanie S. Kim
  • Timothy R. Donahue
Part of the Success in Academic Surgery book series (SIAS)


Systems biology is the study of complex biological systems from a holistic, big-picture view. Advances in research techniques that generate data more efficiently have driven a surge in systems biology, which relies on the analysis of large datasets to characterize a cell’s genome, transcriptome, proteome, and metabolome. Large biological datasets are generated by high-throughput experiments, such as microarrays, mass spectrometry, and high-throughput drug screening. Many datasets from earlier experiments by various laboratories and organizations are available through online portals and can provide valuable information. Analysis of data from genomic, transcriptomic, proteomic, and metabolomic experiments can reveal changes caused by perturbations such as disease processes and therapeutic interventions. Although each type of “omics” dataset can provide important insights on its own, integrating data from multiple omics experiments and dimensions (e.g., genome and proteome) can provide a better understanding of how different dimensions of biology are coordinated with each other. This integration can yield more comprehensive information on the causes and effects of a disease process and on the effectiveness of, and resistance to, therapies.


Keywords: Systems biology; Bioinformatics; Databases; High-throughput experiments; Omics experiments; Data analysis



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Surgery, University of California Los Angeles, Los Angeles, USA
  2. Department of Molecular and Medical Pharmacology, University of California Los Angeles, Los Angeles, USA
  3. Jonsson Comprehensive Cancer Center, University of California Los Angeles, Los Angeles, USA
