Computational Biomarker Discovery

Approaches in Integrative Bioinformatics

Abstract

The advent of omics technologies such as genomics and proteomics has raised the hope of discovering novel biomarkers that can be used to diagnose, predict, and monitor the progression of disease. The importance of computational biomarker discovery for diagnostic classification and prognostic assessment from microarray and proteomic data is increasingly recognized. We present an overview of computational methods and their applications to biomarker discovery, with a particular focus on genomics and proteomics data. One case study is presented as an example, and relevant computational biomarker discovery terminology and techniques are explained.

Acknowledgements

This work was supported in part by a grant from the National Cancer Institute (U24CA126480-01), part of NCI’s Clinical Proteomic Technologies Initiative (http://proteomics.cancer.gov), awarded to Dr. Fred Regnier (PI) and Dr. Jake Chen (co-PI).

Author information

Corresponding author

Correspondence to Jake Y. Chen.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Zhang, F., Wu, X., Chen, J.Y. (2014). Computational Biomarker Discovery. In: Chen, M., Hofestädt, R. (eds) Approaches in Integrative Bioinformatics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41281-3_13

  • DOI: https://doi.org/10.1007/978-3-642-41281-3_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41280-6

  • Online ISBN: 978-3-642-41281-3

  • eBook Packages: Computer Science, Computer Science (R0)
