Systematic and Integrative Analysis of Gene Expression to Identify Feature Genes Underlying Human Diseases

Transcriptomics and Gene Regulation

Part of the book series: Translational Bioinformatics ((TRBIO,volume 9))

Abstract

Over the past two decades, advances in genomics technology have enabled rapid acquisition of biological data and have transformed many aspects of biomedical research. Because large-scale biological data are complex and noisy, translational bioinformatics needs variable selection approaches that can identify disease biomarkers. Such biomarkers support early detection of pathogenesis, inform prognosis, guide treatment, and help monitor disease progression. In this chapter, we develop several methods that systematically analyze whole-genome gene expression data to identify feature genes associated with patient clinical parameters. In the first method, we construct a gene co-expression network and select genes that are informative for classifying cancer subtypes based on their connectivity within that network. In the second method, we incorporate prior biological pathway information to reconstruct a gene network and then identify hub genes associated with cancer prognosis. Finally, we identify protein subnetworks, rather than individual genes, as biomarkers for classifying different types of brain injury. Together, these methods provide a framework that can be generalized to integrate different types of genomics and proteomics information, with the aim of identifying feature genes that improve the accuracy of disease diagnosis and treatment.
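
As a rough illustration of the first method's idea (ranking genes by their connectivity in a co-expression network), the sketch below may help. It is not the chapter's implementation. The adjacency definition (absolute Pearson correlation raised to a power `beta`), the default `beta=6`, and the function name `rank_genes_by_connectivity` are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np


def rank_genes_by_connectivity(expr, beta=6, top_k=10):
    """Rank genes by connectivity in a co-expression network.

    expr  : array of shape (n_samples, n_genes), one column per gene.
    beta  : soft-threshold power applied to |correlation| (assumed value).
    top_k : number of top-ranked gene indices to return.

    Returns (indices of the top_k genes, connectivity of every gene).
    """
    expr = np.asarray(expr, dtype=float)
    # Gene-by-gene Pearson correlation (columns are the variables).
    corr = np.corrcoef(expr, rowvar=False)
    # Assumed adjacency: soft-thresholded absolute correlation.
    adjacency = np.abs(corr) ** beta
    # A gene's link to itself should not count toward its connectivity.
    np.fill_diagonal(adjacency, 0.0)
    # Connectivity = sum of a gene's connection strengths to all other genes.
    connectivity = adjacency.sum(axis=1)
    # Indices of the top_k genes, highest connectivity first.
    top = np.argsort(connectivity)[::-1][:top_k]
    return top, connectivity


if __name__ == "__main__":
    # Synthetic data: 40 samples, 50 genes, with genes 0-4 sharing a
    # common factor so that they form one co-expressed module.
    rng = np.random.default_rng(0)
    expr = rng.normal(size=(40, 50))
    shared = rng.normal(size=(40, 1))
    expr[:, :5] = shared + 0.3 * rng.normal(size=(40, 5))
    top, conn = rank_genes_by_connectivity(expr, top_k=5)
    print("Top genes by connectivity:", sorted(top.tolist()))
```

In this toy setting the highly connected genes are the ones that belong to the simulated module. Real expression data would also need normalization and removal of constant-expression genes, for which the correlation is undefined.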

Corresponding author

Correspondence to Yin Liu.

Copyright information

© 2016 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Wang, Z., Xu, W., Liu, Y. (2016). Systematic and Integrative Analysis of Gene Expression to Identify Feature Genes Underlying Human Diseases. In: Wu, J. (eds) Transcriptomics and Gene Regulation. Translational Bioinformatics, vol 9. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7450-5_7
