Data Mining Methods in Omics-Based Biomarker Discovery

  • Protocol
Bioinformatics for Omics Data

Part of the book series: Methods in Molecular Biology ((MIMB,volume 719))

Abstract

The advent of Omics technologies such as genomics and proteomics has raised the hope of discovering novel biomarkers that can be used to diagnose, predict, and monitor the progress of disease. The importance of data mining for identifying biological markers for diagnostic classification and prognostic assessment in microarray and proteomic data has been increasingly recognized. We present an overview of general data mining methods and their applications to biomarker discovery, with particular focus on genomics and proteomics data. Two case studies are presented as examples, and relevant data mining terminology and techniques are explained.

Acknowledgments

This work was supported in part by a grant from the National Cancer Institute (U24CA126480-01), part of NCI’s Clinical Proteomic Technologies Initiative (http://proteomics.cancer.gov), awarded to Dr. Fred Regnier (PI) and Dr. Jake Chen (co-PI). We thank Hui Huang and Jiao Li for providing a case study.

Author information

Corresponding author

Correspondence to Fan Zhang.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Zhang, F., Chen, J.Y. (2011). Data Mining Methods in Omics-Based Biomarker Discovery. In: Mayer, B. (eds) Bioinformatics for Omics Data. Methods in Molecular Biology, vol 719. Humana Press. https://doi.org/10.1007/978-1-61779-027-0_24

  • DOI: https://doi.org/10.1007/978-1-61779-027-0_24

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-026-3

  • Online ISBN: 978-1-61779-027-0

  • eBook Packages: Springer Protocols
